How to stop Locust from treating a failed 502 response as the minimum response time? - locust

Once my website exceeds 1000 concurrent users, nginx simply returns 502, but Locust treats this response as the minimum response time. How can we avoid this?

First, is it not accurate? Are the 502s not being returned immediately with that 2ms response time? If that is accurate, I'm not sure you want to mess with the data.
But that being said, you certainly can mess with the data all you'd like. You can catch the response before it's automatically reported and change the data to what you want to report to Locust. You can read about how to do that in the docs here:
https://docs.locust.io/en/stable/writing-a-locustfile.html#validating-responses
If you need even more control, you can make it not report the response automatically at all (or just make your own http requests, using the Locust client) and then you can manually fire an event with complete custom data:
https://docs.locust.io/en/stable/api.html#locust.event.Events.request_failure
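To see why a handful of fast 502s dominates the minimum, and what the numbers look like if failures are kept out of the timing statistics, here is a small sketch that does not depend on Locust. The sample values and the `summarize` helper are made up for illustration; they are not Locust's own statistics code.

```python
from statistics import median

# Hypothetical samples: (HTTP status, response time in ms).
samples = [(200, 180.0), (200, 240.0), (502, 2.0), (200, 310.0), (502, 3.0)]


def summarize(samples, include_failures=True):
    """Return (min, median, max) response time over the selected samples."""
    times = [ms for status, ms in samples if include_failures or status < 400]
    return min(times), median(times), max(times)


all_stats = summarize(samples)                         # 502s included
ok_stats = summarize(samples, include_failures=False)  # successes only
print("all requests:  ", all_stats)
print("successes only:", ok_stats)
```

With the 502s included, the minimum is the 2 ms error response; with successes only, it is the fastest real page load.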

Related

How to handle PUT endpoint with immutable resource

A microservice I develop exposes a single endpoint, PUT /deployments/{uuid} . The endpoint is used to initiate a potentially expensive deployment operation, so we only ever want it to happen once, which is why we chose PUT + UUID over POST (for uniqueness). The deployment is immutable, so it can never be updated, so we currently raise an exception if the PUT is called more than once with the same uuid.
As a person who loves bikeshedding and therefore cares deeply about restfulness, this grinds my gears. PUT is supposed to be idempotent, so raising an exception after making the same request multiple times is an antipattern. However, we have a requirement to not allow sequential identical requests to generate new deployments, so the usual POST is out.
While the best solution is one that works, I'd like ours to be a little more elegant, if possible. I've posited a POST with the UUID in the payload, but my team seems to think that's worse than the current solution. I'm considering just returning a 200 OK from a PUT to the same UUID rather than a 201 CREATED, but I'm not sure if that has the same problem as non-idempotent-put in not semantically conveying the information I want.
Is there a "best solution" here? Or am I doomed to be "that guy" on my team if I pursue this further (joke's on you i'm already that guy).
tl;dr What is the correct RESTful API signature for a /deployments endpoint that is immutable, and required to not allow the same request to be processed twice?
Idempotent does not mean "2 identical requests should yield the same response". It means: "The server state after 2 identical requests should be the same as when only 1 is made".
A similar example, if you call DELETE on a resource and get a 204 No Content back, and call DELETE again and get a 404, this doesn't violate the idempotency requirement. After the second delete the resource is still removed, just like it was after the first.
So multiple identical idempotent requests are allowed to give different responses.
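The DELETE example can be sketched with an in-memory store (the `store` dict and `delete` function are illustrative stand-ins, not a real framework): the two responses differ, but the state after one call and after two calls is the same.

```python
def delete(store, key):
    """Remove key from store and return an HTTP-style status code."""
    if key in store:
        del store[key]
        return 204  # No Content: the resource was removed
    return 404      # Not Found: there was nothing to remove


store = {"a": "first", "b": "second"}

first = delete(store, "a")
state_after_one = dict(store)

second = delete(store, "a")
state_after_two = dict(store)

# Different responses, identical server state.
print(first, second, state_after_one == state_after_two)
```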
That said though, I think it might be nicer for the second identical request to also get a 2xx status back. It doesn't have to be the same as the first.
The use-case is a client that sent an HTTP request but got disconnected before it got a response. The client should retry, and if the server detects that the request is the same as the first, the server can just give the client a success response (without doing anything).
This is generally a good idea, because if the client got an error back for the second request, it might be harder to know if the request failed because it succeeded earlier, or for other reasons.
That all being said though, there is also a way here to have your cake and eat it too.
A client could send the following header along with the PUT request:
If-None-Match: *
If the client omits the header, you can always return 428 Precondition Required.
If the resource does not yet exist, it's a success response. If the resource was created earlier, you can return 412 Precondition Failed.
Using this mechanism a client has a standard way to figure out that the request failed because a successful one was made earlier.
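A minimal sketch of that conditional PUT, assuming an in-memory `deployments` dict and a hypothetical `put_deployment` handler that receives the request headers as a plain dict. The status for a missing header is 428 Precondition Required, as defined in RFC 6585.

```python
deployments = {}  # hypothetical in-memory store: uuid -> deployment payload


def put_deployment(uuid, payload, headers):
    """Handle PUT /deployments/{uuid} and return an HTTP status code."""
    if headers.get("If-None-Match") != "*":
        return 428  # Precondition Required: the client must send the header
    if uuid in deployments:
        return 412  # Precondition Failed: this deployment already exists
    deployments[uuid] = payload  # the expensive deployment would start here
    return 201  # Created


conditional = {"If-None-Match": "*"}
print(put_deployment("1234", {"image": "app:1"}, conditional))  # first attempt
print(put_deployment("1234", {"image": "app:1"}, conditional))  # retry
print(put_deployment("5678", {"image": "app:1"}, {}))           # header omitted
```

A client that sees 412 on a retry knows the earlier attempt succeeded, and the deployment is never started twice.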
Based on the docs here, PUT is the best method to use. The response should be 201 when it triggers the deployment, and either 200 or 204 when nothing is changed. It shouldn't be POST because calling a POST endpoint twice should trigger the effect both times.

Idempotency of the GET verb in a RESTful API

As it was mentioned here https://restfulapi.net/http-methods/ (and in other places as well):
GET APIs should be idempotent, which means that making multiple
identical requests must produce same result everytime until another
API (POST or PUT) has changed the state of resource on server.
How can this hold for an API that returns the time, for example, or that returns data that is affected by time?
In other words, each time I use GET http://ip:port/get-time-now/, it is going to return a different response, even though I did not send any POST or PUT between the two consecutive GETs.
Does this make the previous statement wrong? Did I misunderstand something?
Idempotency is a promise to clients/intermediaries that the request can be reissued in case of network failures or the like without any further considerations and not so much that the data will never change.
If you take a POST request for example, in case of a network failure you do not know whether the previous request reached the server and the response got lost midway, or whether the initial request never reached the server at all. If you re-issue the request you might actually create a further resource, hence POST is not idempotent. PUT, on the other hand, has the contract that it replaces the current representation with the one contained in the request. If you send the same request twice, the content of the resource should be the same after either of the two PUT requests was processed. Note that the actual result can still differ, as the service is free to modify the received entity to a corresponding representation. Also, between sending the data via PUT and retrieving it via GET, another client could have updated the state, so there is no guarantee that you will actually receive the exact representation you sent to the service.
Safety is another promise, one that among the common methods only GET, HEAD and OPTIONS make. It promises the invoker that the request won't modify any state at all, hence clients/intermediaries can issue such requests without having to fear that they will modify any state. In practice this is an important promise to crawlers, which blindly invoke any URLs in order to learn their content. If such a promise is violated, e.g. by deleting data while processing a GET request, the only one to blame is the service implementor, not the invoker. If a crawler invokes such URLs and thereby removes some data, it is not the crawler's fault but only the service implementor's.
As you have a dynamic value in your response, you might want to prevent caching of responses though, as otherwise intermediaries might return an old state for your resource.
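One way to prevent caching is the standard `Cache-Control: no-store` response header. The sketch below builds such a response as a plain `(status, headers, body)` triple; the `get_time_now` function name and the triple shape are assumptions for illustration, not tied to any particular framework.

```python
from datetime import datetime, timezone


def get_time_now():
    """Build a (status, headers, body) triple for GET /get-time-now/."""
    body = datetime.now(timezone.utc).isoformat()
    headers = {
        "Content-Type": "text/plain; charset=utf-8",
        # Ask caches not to store this response, so that every GET
        # reaches the server and sees the current time.
        "Cache-Control": "no-store",
    }
    return 200, headers, body


status, headers, body = get_time_now()
print(status, headers["Cache-Control"], body)
```

The handler still modifies nothing on the server, so the GET stays safe and idempotent even though each body differs.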
The basic concepts of idempotent and safe HTTP methods:
Idempotent method: the method can be called multiple times with the same input and leaves the server in the same state as a single call.
Safe method: the method can be called multiple times with the same input and doesn't modify the resource on the server side.
HTTP methods fall into the following 3 groups:
GET, HEAD, OPTIONS are safe and idempotent
PUT, DELETE are not safe but idempotent
POST, PATCH are neither safe nor idempotent
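This classification is what lets a client decide whether a request may be re-sent blindly after a network failure. A rough sketch, where `send` is a hypothetical zero-argument callable that performs the request and raises `ConnectionError` when the connection is lost:

```python
IDEMPOTENT_METHODS = {"GET", "HEAD", "OPTIONS", "PUT", "DELETE"}


def send_with_retry(method, send, max_attempts=3):
    """Call send(); retry on ConnectionError only for idempotent methods."""
    attempts = max_attempts if method.upper() in IDEMPOTENT_METHODS else 1
    for attempt in range(1, attempts + 1):
        try:
            return send()
        except ConnectionError:
            if attempt == attempts:
                raise  # out of attempts, or the method must not be retried


print(send_with_retry("GET", lambda: "ok"))
```

A POST gets exactly one attempt here, because re-sending it might create a second resource.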

Recommended workflow for long-running create process

I'm designing a RESTful API that involves clients submitting requests to create a resource, let's say a report. For valid reasons not worth getting into, this process can take a minimum of 30 seconds and a maximum of several minutes to complete successfully. I'm trying to ensure this API is easy and intuitive to work with so I just want to get some feedback on what I was thinking of doing.
Client POSTs request body to /reports
Server responds with 202 Accepted and Location: /jobs/{jobId} header
Client GETs /jobs/{jobId} and receives 200 OK and response body like {"status": "pending"} (abbreviated)
Client retries until they get 200 OK (unchanged) and a body like
{
"status": "complete",
"location": "/reports/{reportId}",
"details": { ... }
}
Client GETs /reports/{reportId} to retrieve their report
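From the client's side, the polling part of this workflow could look roughly like the sketch below. `http_get` is a hypothetical callable returning `(status_code, parsed_json_body)`, and the `"failed"` status value is an assumption that does not appear in the workflow above.

```python
import time


def wait_for_report(http_get, job_url, interval=1.0, max_polls=300,
                    sleep=time.sleep):
    """Poll job_url until the job completes; return the report location."""
    for _ in range(max_polls):
        status_code, body = http_get(job_url)
        if status_code != 200:
            raise RuntimeError(f"unexpected status {status_code}")
        if body["status"] == "complete":
            return body["location"]
        if body["status"] == "failed":  # assumed status value
            raise RuntimeError("job failed")
        sleep(interval)
    raise TimeoutError("job did not complete in time")


# Usage with a fake transport that answers pending, pending, complete.
fake_responses = iter([
    (200, {"status": "pending"}),
    (200, {"status": "pending"}),
    (200, {"status": "complete", "location": "/reports/42"}),
])
location = wait_for_report(lambda url: next(fake_responses), "/jobs/7",
                           sleep=lambda seconds: None)
print(location)
```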
Some things I've thought of doing differently from the above:
Having the /jobs/{jobId} resource return 303 See Other with a Location: /reports/{reportId} header when ready. A number of blog posts & SO answers I saw took that approach. I decided against it because we want to retain these jobs as first-class resources, e.g. we want to be able to view all jobs submitted in the past 24 hours, all failed jobs in the past 15 minutes, etc. Also it seems 303 See Other really should not return a body as clients should be expected to redirect to the Location url so they wouldn't see it anyway.
Having client POST to /jobs instead of /reports. I feel like either choice is defensible but it seems less surprising to clients that if they want to create a report they should POST to /reports. That being said, it may be surprising to get a response pointing them to a job instead of a report.
Whether clients POST to /reports or /jobs, since I am immediately creating a job, maybe I should return 201 Created instead of 202 Accepted.
Anyway, that's where my thinking currently is on this. Any confirmation, suggestions, respectfully explained disagreements, etc. are all greatly appreciated.
Thanks.
we want to retain these jobs as first-class resources, e.g. we want to be able to view all jobs submitted in the past 24 hours, all failed jobs in the past 15 minutes, etc.
If both jobs and reports are first-class, then I suggest giving each the usual obvious, boring semantics:
POST /jobs returns 201 Created and a Location: /jobs/{id} header. After all, the job was immediately created (the report is N/A).
Optionally you could have this return an ETag header (see next).
GET /jobs/{id} returns 200 OK; the response body indicates the readiness of the report and (if ready) the URI to /reports/{id}.
Optionally you could have this handle If-None-Match by returning 304 Not Modified until the readiness changes. And of course return an ETag header.
Of course GET /reports/{id} works.
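The optional ETag handling for GET /jobs/{id} might be sketched as follows. The `get_job` function, the dict-shaped job and the hash-based ETag are illustrative assumptions; any value that changes when the readiness changes would serve as an ETag.

```python
import hashlib
import json


def get_job(job, if_none_match=None):
    """Return (status, headers, body) for GET /jobs/{id}."""
    body = json.dumps(job, sort_keys=True)
    # Derive the ETag from the representation, so it changes exactly
    # when the job's readiness (or anything else in the body) changes.
    etag = '"' + hashlib.sha256(body.encode()).hexdigest()[:16] + '"'
    if if_none_match == etag:
        return 304, {"ETag": etag}, ""  # Not Modified: readiness unchanged
    return 200, {"ETag": etag, "Content-Type": "application/json"}, body


job = {"status": "pending"}
status, headers, body = get_job(job)
print(status, headers["ETag"])
print(get_job(job, if_none_match=headers["ETag"])[0])
```

A polling client sends the last ETag it saw in If-None-Match and only receives a full body once the job has changed.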
p.s. Older HTTP specifications (RFC 2616) required Location headers to be full URIs, e.g. https://example.com/path/to/thing; RFC 7231 relaxed this to also allow relative references such as /path/to/thing. Full URIs are still convenient. If you use them in Location, probably do likewise with any JSON response location values? That way, clients can just use the values as-is when making requests, which is both more convenient for them and better for you if they don't hardcode the protocol/host.

REST API: how to notify a client that the request has failed when the service already returned 200 and some data?

What I am doing?
I am developing a REST Web service that returns data from two sources:
A CSV file from an HTTP server, which changes often and is sometimes huge.
A local file.
When a client invokes the service, it does this:
It sends a request to the HTTP server to obtain the CSV file.
After obtaining the CSV file, it combines the data from both sources.
Sends the result to the client. The result is an XML document.
Problem
Sometimes, after I have already returned some data to the client, the HTTP server fails so I cannot continue sending data to the client.
When this happens, I would like to notify the client that there was an error. How should I do this? The service already returned the HTTP code 200 and some data. So I cannot send the client an error 500.
Should I simply write an error message to the output? The client will fail because the XML document will not be valid.
The service cannot wait to send the response until the entire file from the HTTP server is read. The reason is that the file obtained from the HTTP server is sometimes very big and does not fit in memory.
Environment: although I do not think this is important, this service is developed in Jersey 1.x.
As you say, there are a couple options:
Start sending the response 200 OK before your upload request is complete, but rely on the client to detect an invalid content response; or
Wait until your file upload request is complete before sending the HTTP response. Then you can send the correct status code (2xx or 500).
I would recommend waiting until the upload is complete.
If the file cannot fit in server-side memory, then find a technique to write the stream to persistence not in memory, such as a cache, nosql db, or the filesystem. This will allow for faster processing of the file upload.
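As a rough sketch of buffering outside of memory before committing to a status code, Python's standard `tempfile.SpooledTemporaryFile` keeps small payloads in memory and rolls over to disk past a threshold. The `spool_upstream` helper and the 1 MiB threshold are assumptions for illustration.

```python
import tempfile


def spool_upstream(chunks, max_memory=1024 * 1024):
    """Copy an iterable of byte chunks into a spooled temporary file.

    Data beyond max_memory bytes rolls over to disk. If the upstream
    iterator raises, the file is closed and the exception propagates, so
    the caller can still answer with an error status.
    """
    spool = tempfile.SpooledTemporaryFile(max_size=max_memory)
    try:
        for chunk in chunks:
            spool.write(chunk)
    except Exception:
        spool.close()
        raise
    spool.seek(0)  # rewind so the caller can read from the start
    return spool


with spool_upstream([b"id,value\n", b"1,1.1\n"]) as buffered:
    print(buffered.read())
```

Only once this returns without raising has the whole file been received, and the service can begin the 200 response.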
If you require additional time to process the file on the server side after upload, you can return a 202 Accepted status, with the Location: header having the resource to the long-running job. The client can keep checking if the job is complete. This will avoid having to process the whole thing in one HTTP round-trip.
Some good examples of RESTful long-running operations:
Best practice for implementing long-running searches with REST
http://billhiggins.us/blog/2011/04/27/resty-long-ops/
REST with JAX-RS - Handling long running operations
Replying to myself. This may be useful for someone else.
Initially, I developed this option: if there was an error generating the output of the service when the HTTP code 200 was already sent, the service would write the error message to the output and close the connection. In these cases, the XML of the response was invalid.
Later, I had to change this behavior because users complained that in this scenario, the response was an invalid XML. As a consequence, all they were seeing was the error returned by the XML parser of their applications saying that the XML was invalid, not the actual error message.
To avoid this issue, I changed the behavior of the service:
When there are no errors, the response looks like this:
<view name="demo_stats">
<demo_stats>
<int_type>1</int_type>
<numeric_type>1.1</numeric_type>
</demo_stats>
<demo_stats>
<int_type>2</int_type>
<numeric_type>2.2</numeric_type>
</demo_stats>
</view>
If there is an error generating the output of the service and the service already sent the HTTP code 200, the response looks like this:
<view name="demo_stats">
<demo_stats>
<int_type>1</int_type>
<numeric_type>1.1</numeric_type>
</demo_stats>
<demo_stats>
<int_type>2</int_type>
<numeric_type>2.2</numeric_type>
</demo_stats>
<errors>
<error>There was an error transforming the value of row #3</error>
</errors>
</view>
The errors element is optional and only appears when there is an error during the generation of the output. The response is still a valid XML document, which allows client applications to handle this situation properly.
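The original service was written with Jersey; the following Python generator is only a sketch of the same idea. The `stream_view` function and the tuple-shaped rows are assumptions: fragments are streamed as they are produced, and a failure while fetching the next row is reported in an errors element before the root element is closed.

```python
from xml.sax.saxutils import escape


def stream_view(rows, name="demo_stats"):
    """Yield XML fragments; on failure emit <errors> and still close <view>."""
    yield f'<view name="{name}">'
    try:
        for int_value, numeric_value in rows:
            # Each row is emitted as one complete element, so a failure
            # between rows never leaves a half-written element behind.
            yield (f"<{name}><int_type>{int_value}</int_type>"
                   f"<numeric_type>{numeric_value}</numeric_type></{name}>")
    except Exception as exc:
        yield f"<errors><error>{escape(str(exc))}</error></errors>"
    yield "</view>"


def rows_failing_on_third():
    yield (1, 1.1)
    yield (2, 2.2)
    raise ValueError("There was an error transforming the value of row #3")


print("".join(stream_view(rows_failing_on_third())))
```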

High Scale REST API

One of our REST APIs will cause a long-running process to execute. Rather than have the client wait for a long time, we would prefer to return an immediate response.
So, let's consider this use case: An applicant submits an application, for which there will be an eventual result. Since this is a very high-scale platform, we cannot persist the application to storage, but must place it onto a queue for processing.
In this situation, is it acceptable practice to return the URI where the application will eventually live, such as http://example.com/application/abc123?
Similarly, would it be acceptable practice to return the URI of the result document, which represents the decision regarding the application, as part of the representation of the application resource? The result document will not be created for some minutes, and an HTTP GET to its URI (or the URI of the application for that matter) will result in a 404 until they are persisted.
What is the best practice in this kind of situation? Is it acceptable to hand out "future" URIs for resources?
I don't see anything wrong with such design, but have a closer look at the list of HTTP status codes for better responses. IMHO the first request should return 202 Accepted:
The request has been accepted for processing, but the processing has not been completed.
while requests to the URL where the result will eventually be should in the meantime return 204 No Content (?):
The server successfully processed the request, but is not returning any content
And of course it should eventually return 200 OK when processing finishes.
From "RESTful Web Services Cookbook"
Problem
You want to know how to provide resource abstractions for
tasks such as performing computations or validating data.
Solution
Treat the processing function as a resource, and use HTTP GET to fetch
a representation containing the output of the processing function. Use
query parameters to supply inputs to the processing function.
This entails just GET requests on a URI that represents the processing function, which in your example is the 'http://example.com/application/abc123' URI. When returning a response you would include what information you have by now and use HTTP codes to indicate the status of the processing as already suggested by Tomasz.
However..., you should not use this approach, if the subsequent application processing stores or modifies data in any way.
GET requests should never have side effects. If the submittal of the application leads in any way (even if only after being processed from the queue) to new information / data being stored, you should use a PUT or a POST request with the application's data in the request's body. See "Why shouldn't data be modified on an HTTP GET request?" for more information.
If the application's submittal stores or modifies data, use the pattern for asynchronous processing: a POST or PUT request with the application's details.
For example
POST http://example.com/applications
which returns "201 Created" with the URI of the new application resource.
or
PUT http://example.com/applications/abc123
which returns "201 Created" and
Both would also return any resource information that is already known at that time.
You can then safely perform GET requests on the URI of the new resource as they now only retrieve data - the results of the application processing so far - and no data is stored or modified as a result of the GET.
To indicate the application's processing progress, the GET request can either return some specific status code in the response (queued, processing, accepted, rejected), and/or use the HTTP response codes. In either case a "200 OK" should only be returned when the application's processing is complete.
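One possible reading of this pattern, as an in-memory sketch: `queue.Queue` stands in for the real message queue, the `applications` dict stands in for wherever a lightweight status record can be kept, and returning 202 from the GET while processing is still pending is an assumption, chosen so that 200 is reserved for the finished state.

```python
import queue
import uuid

pending = queue.Queue()  # stand-in for the real message queue
applications = {}        # stand-in for the status/result store


def post_application(data):
    """POST /applications: enqueue and return 201 with the new URI."""
    app_id = uuid.uuid4().hex
    applications[app_id] = {"status": "queued", "result": None}
    pending.put((app_id, data))
    return 201, {"Location": f"/applications/{app_id}"}


def process_next(decide):
    """Worker step: take one application off the queue, store its result."""
    app_id, data = pending.get()
    applications[app_id] = {"status": "processed", "result": decide(data)}


def get_application(app_id):
    """GET /applications/{id}: 200 only once processing has finished."""
    app = applications.get(app_id)
    if app is None:
        return 404, None
    return (200 if app["status"] == "processed" else 202), app


status, headers = post_application({"applicant": "example"})
app_id = headers["Location"].rsplit("/", 1)[-1]
print(get_application(app_id))   # still queued
process_next(lambda data: "accepted")
print(get_application(app_id))   # processed
```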