Difference between HTTP status codes 413 and 414 - sockets

I'm working on a simple web server that I want to handle long GET requests.
If a GET request is larger than 4096 bytes, I want to send the client a status code that tells it the request is too large.
The client sends a huge cookie that makes the request larger than my web server can accept.
Which status code should I send: 414 Request-URI Too Long or 413 Payload Too Large?

The client sends a huge cookie that makes the request larger than my web server can accept.
The client should be sending back only the cookies that your web server has previously given the client. If your own cookies are too large for your web server to handle, you need to shorten them.
Which status code should I send: 414 Request-URI Too Long or 413 Payload Too Large?
Neither. The request URI is not what is too long, so 414 is not appropriate. And a GET request normally has no body, only headers, so 413 is not appropriate, either.
The response code you should use is 431 Request Header Fields Too Large, which is defined in RFC 6585 Additional HTTP Status Codes:
5. 431 Request Header Fields Too Large
The 431 status code indicates that the server is unwilling to process
the request because its header fields are too large. The request MAY
be resubmitted after reducing the size of the request header fields.
It can be used both when the set of request header fields in total is
too large, and when a single header field is at fault. In the latter
case, the response representation SHOULD specify which header field
was too large.
For example:
HTTP/1.1 431 Request Header Fields Too Large
Content-Type: text/html

<html>
<head>
<title>Request Header Fields Too Large</title>
</head>
<body>
<h1>Request Header Fields Too Large</h1>
<p>The "Example" header was too large.</p>
</body>
</html>
Responses with the 431 status code MUST NOT be stored by a cache.
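To make the distinction between the three codes concrete, here is a minimal sketch of how a server might choose between them. The function name pick_status and the three limits are my own assumptions for illustration; they are not taken from the question or from any RFC.

```python
MAX_REQUEST_LINE = 4096   # assumed limit for the request line (method, URI, version)
MAX_HEADER_BYTES = 4096   # assumed limit for all header fields together
MAX_BODY_BYTES = 1 << 20  # assumed limit for the request body


def pick_status(raw_head: bytes, content_length: int = 0) -> int:
    """Return 414, 431, 413, or 200 for a request head (everything before the blank line)."""
    request_line, _, header_block = raw_head.partition(b"\r\n")
    if len(request_line) > MAX_REQUEST_LINE:
        return 414  # the URI itself is too long
    if len(header_block) > MAX_HEADER_BYTES:
        return 431  # header fields (for example Cookie) are too large
    if content_length > MAX_BODY_BYTES:
        return 413  # the declared body is too large
    return 200


huge_cookie = b"GET / HTTP/1.1\r\nHost: example.com\r\nCookie: " + b"x" * 5000
print(pick_status(huge_cookie))  # 431
```

With these assumed limits, an oversized Cookie header falls into the 431 branch, which matches the case described in the question.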

Related

Rest API text field limit

Let's say you have a function that needs to make an API call to another service to send it a message (say, a log message). If the REST API definition limits a field (let's call it MyMessage) to a length of, say, 400, and the function sets MyMessage to a very long message that exceeds 400, what will happen? Will the endpoint service receive the message automatically truncated to 400, or will it just not go through?
There is not much about content-length errors in the HTTP RFC. https://www.rfc-editor.org/rfc/rfc2616.html#page-119
So with HTTP 1.1 it depends on the HTTP server implementation: sometimes you get a 5xx, sometimes a 4xx, but I think you usually get an error message.
With HTTP/2 you will certainly get an error, most probably 400 Bad Request, because it is a malformed request. https://www.rfc-editor.org/rfc/rfc7540#section-8.1.2.6
8.1.2.6. Malformed Requests and Responses
A malformed request or response is one that is an otherwise valid
sequence of HTTP/2 frames but is invalid due to the presence of
extraneous frames, prohibited header fields, the absence of mandatory
header fields, or the inclusion of uppercase header field names.
A request or response that includes a payload body can include a
content-length header field. A request or response is also malformed
if the value of a content-length header field does not equal the sum
of the DATA frame payload lengths that form the body. A response
that is defined to have no payload, as described in [RFC7230],
Section 3.3.2, can have a non-zero content-length header field, even
though no content is included in DATA frames.
Intermediaries that process HTTP requests or responses (i.e., any
intermediary not acting as a tunnel) MUST NOT forward a malformed
request or response. Malformed requests or responses that are
detected MUST be treated as a stream error (Section 5.4.2) of type
PROTOCOL_ERROR.
For malformed requests, a server MAY send an HTTP response prior to
closing or resetting the stream. Clients MUST NOT accept a malformed
response. Note that these requirements are intended to protect
against several types of common attacks against HTTP; they are
deliberately strict because being permissive can expose
implementations to these vulnerabilities.
Though the easiest way would be to try it out instead of reading RFCs.
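The content-length rule in the quoted section can be expressed in a few lines. This is only a sketch of the comparison itself, for a message that is expected to carry a body; is_malformed is a made-up helper name and not part of any HTTP/2 library.

```python
from typing import List, Optional


def is_malformed(content_length: Optional[int], data_frames: List[bytes]) -> bool:
    """True when a declared content-length disagrees with the total DATA payload size."""
    if content_length is None:
        return False  # no content-length header, so there is nothing to compare
    return content_length != sum(len(frame) for frame in data_frames)


print(is_malformed(400, [b"a" * 300, b"b" * 100]))  # False
print(is_malformed(400, [b"a" * 500]))              # True
```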

304 response from a non modifying POST request

I am working on a POST HTTP API which does not modify or create any state on the server. The API is implemented with method POST as it needs to accept multiple complex inputs which would not be possible using query parameters.
What is the correct response status to return in case of conditional check failures (If-Match/If-None-Match) for such read-only POST APIs, should it be 304 Not Modified or 412 Precondition Failed?
Note: This is an internal service API where the client is aware that it is a non modifying request.
For a GET request, we would expect an If-None-Match header, which would normally produce a 200 response with an updated copy of the representation if the condition holds, and a 304 response when the precondition fails (meaning that the client's copy of the resource is already up to date).
The semantics should be the same when we use POST in a read-only way. (Note that we are in a bit of an edge case here; a general purpose client won't normally know that this particular POST request is safe, and probably won't know which precondition headers to attach to the request. For instance, if you try to use an HTML form in a web browser to access this resource, you probably aren't going to get the conditional headers that you want.)
412 Precondition Failed is normally used when requesting a modification to the resource, in combination with an If-Match header.
Reference: HTTP Semantics, section 13.
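As an illustration of the GET-style semantics described above, here is a minimal sketch that maps an If-None-Match header to 200 or 304. The function names are hypothetical, and splitting the header on commas is a simplification that assumes the entity tags themselves contain no commas.

```python
from typing import Optional


def _opaque(tag: str) -> str:
    """Strip the weak prefix so two tags can be compared weakly."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def status_for_read(if_none_match: Optional[str], current_etag: str) -> int:
    """Return 304 when the client's copy is current, otherwise 200."""
    if if_none_match is None:
        return 200  # unconditional request: send the representation
    if if_none_match.strip() == "*":
        return 304  # "*" matches any current representation
    candidates = {_opaque(t) for t in if_none_match.split(",")}
    return 304 if _opaque(current_etag) in candidates else 200


print(status_for_read('"v1"', '"v1"'))  # 304
print(status_for_read('"v1"', '"v2"'))  # 200
```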

Correct REST API semantics to check email exists or not

I have an API that looks like GET user/exists/by?email=<email_here>.
The objective of the API is to check if the email exists or not.
I am confused over what should be the proper semantics of the API? Currently, I have 2 options.
Option 1:
Use HTTP status codes to drive the API.
Send 204 if the email exists
Send 404 if the email doesn't exist
Send 400 if email fails validation
Option 2:
Send 200 with body {"exists": true} when email exists
Send 200 with body {"exists": false} when email doesn't exist
Send 400 if email fails validation
Send 204 if the email exists
Send 404 if the email doesn't exist
I don't think you are going to find an authoritative answer to your question.
That said, one thing I would encourage you to do is to look at the HTTP responses being sent by your server, and in particular pay attention to the number of bytes of meta data being sent; the status-line, the headers, and so on.
Then consider carefully whether the difference between a small JSON payload and a zero length payload is all that significant.
Furthermore, recall that if a client doesn't want a copy of the representation to be returned, the client can use the method token HEAD rather than GET to request a refreshed copy of the resource meta data.
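The size comparison suggested above can be tried quickly. This is a rough sketch only: the Date, Server, and Cache-Control values are invented placeholders, and a real server will send its own set of headers.

```python
import json
from typing import Dict


def response_bytes(status_line: str, headers: Dict[str, str], body: bytes = b"") -> bytes:
    """Serialize a response the way it would appear on the wire in HTTP/1.1."""
    head = status_line + "\r\n"
    head += "".join(f"{k}: {v}\r\n" for k, v in headers.items())
    head += "\r\n"
    return head.encode("ascii") + body


# Placeholder metadata; substitute whatever your server really sends.
common = {
    "Date": "Tue, 01 Jan 2030 00:00:00 GMT",
    "Server": "example",
    "Cache-Control": "no-store",
}

body = json.dumps({"exists": True}).encode("utf-8")
no_body = response_bytes("HTTP/1.1 204 No Content", common)
with_body = response_bytes(
    "HTTP/1.1 200 OK",
    {**common, "Content-Type": "application/json", "Content-Length": str(len(body))},
    body,
)

print(len(no_body), len(with_body))
```

Comparing the two printed sizes against the metadata alone gives a feel for how much the JSON body really costs.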
200 vs 404 is harder. 200 means that the payload is a representation of the requested resource (which is cacheable by default). 404 means that the payload is a representation of an error message (which is cacheable by default).
The HTTP status codes are metadata about the transfer of documents over a network domain. So it is really the wrong mechanism to use to finesse fine grained distinctions in your documents.
For instance, take a look at the cache invalidation specification, and notice please the distinction between the handling of 2xx and 4xx responses to unsafe requests.
As a matter of principle, HTTP data belongs in the headers, your data belongs in the body, and it's only when your data is going to be of interest to general purpose HTTP components that we should be lifting copies of your data into the HTTP meta data.
But, as far as I know, not authoritative. This is all very hand wavy, and not easily matched to a specific RFC or advisory.

What is correct partial success response code

I have an API which returns a file to the client based on start & end date filters it sends to the server.
The server has a MAX_FILE_SIZE config value in order to reduce the amount of data it returns to the client.
In case the server is sending a truncated file due to its size, what is the appropriate response code?
I think HTTP 206 is the one that seems to be most fitting.
The HTTP 206 Partial Content success status response code indicates that the request has succeeded and the body contains the requested ranges of data, as described in the Range header of the request.
https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/206
https://www.restapitutorial.com/httpstatuscodes.html
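Since the quoted definition ties 206 to the Range header, a 206 response normally carries a Content-Range header saying which bytes were sent. A minimal sketch of building those headers follows; partial_content_headers is a hypothetical helper, not part of any framework.

```python
from typing import Dict


def partial_content_headers(start: int, end: int, total: int) -> Dict[str, str]:
    """Headers for a 206 response carrying bytes start..end (inclusive) of a total-byte file."""
    if not (0 <= start <= end < total):
        raise ValueError("range not satisfiable")
    return {
        "Content-Range": f"bytes {start}-{end}/{total}",
        "Content-Length": str(end - start + 1),
    }


# First 1000 bytes of a 5000-byte file.
print(partial_content_headers(0, 999, 5000))
# {'Content-Range': 'bytes 0-999/5000', 'Content-Length': '1000'}
```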

ETags and collections

Many REST APIs provide the ability to search for resources.
For example, resources of type A may be fetched using the following HTTP request:
GET /A?prop1={value1}&prop2={value2}
I'm using optimistic locking and therefore would like to return a version for every returned resource of type A. Until now, I used the ETag header when fetching only one resource using its ID.
Is there an HTTP way for returning version for multiple resources in the same response? If not, should I include the versions in the body?
Thanks,
Mickael
EDIT: I found on the web that the ETag is often generated by computing a hash of part of the reply. This approach fits well with my case since a hash of the returned collection will be computed. However, if the client decides to update one of the elements in the collection, which ETag should he put in the If-Match header? I'm thinking that including the ETags of the individual elements is the only solution...
I would adopt one of these options:
Make ETags weak by default and generate them from the resource's current state, not from the resource representation in the HTTP response payload. With that, I can return a valid ETag for each resource in the collection query response body, besides the ETag for the whole collection in the response header.
Forget If-Match and ETags for this case and use If-Unmodified-Since with a Last-Modified supplied as a property of each resource. By doing that I can preserve the strong ETags, but clients can still make updates to one item based on the collection response without the need for another request to the resource itself.
Allow updates via PATCH on the collection resource itself, using the If-Match header with the ETag for the whole collection. This probably won't work very well if there's a lot of concurrent changes, but it's a reasonable approach.
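The first option could look roughly like the sketch below. Everything here is an assumption for illustration: the helper names, the choice of SHA-256 over a canonical JSON dump, and the shape of the collection body. Note that If-Match is defined with strong comparison, so a server taking this route has to decide how it treats these weak tags on update.

```python
import hashlib
import json
from typing import Dict, List, Tuple


def weak_etag(resource: dict) -> str:
    """Weak ETag derived from the resource's state, not from a particular response encoding."""
    canonical = json.dumps(resource, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return 'W/"' + hashlib.sha256(canonical).hexdigest()[:16] + '"'


def collection_response(resources: List[dict]) -> Tuple[Dict[str, str], str]:
    """Return (headers, body): one ETag per item in the body, one for the whole collection."""
    items = [{"etag": weak_etag(r), "data": r} for r in resources]
    collection_tag = weak_etag({"items": [i["etag"] for i in items]})
    return {"ETag": collection_tag}, json.dumps(items)


headers, body = collection_response([{"id": 1, "version": 7}, {"id": 2, "version": 3}])
print(headers["ETag"])
print(body)
```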
I think it depends a little bit on the amount of resources, data and requests to reduce bandwidth. But a solution could be to separate the resources in sub-requests.
Assume that the group call of GET /images?car=mustang&viewangle=front returns 5 images. Now you could include all images as binary data and the GET-request itself has a unique ETag:
GET /images?car=mustang&viewangle=front
...
HTTP/1.1 200 OK
ETag: "aaaaaa"
data:image/png;base64,a123456....
data:image/png;base64,b123456....
data:image/png;base64,c123456....
data:image/png;base64,d123456....
data:image/png;base64,e123456....
The problem now is that one added image changes the ETag of the group call, and you need to transfer the complete set again although only one image has changed:
GET /images?car=mustang&viewangle=front
If-None-Match: "aaaaaa"
...
HTTP/1.1 200 OK
ETag: "bbbbbb"
data:image/png;base64,a123456....
data:image/png;base64,b123456....
data:image/png;base64,c123456....
data:image/png;base64,d123456....
data:image/png;base64,e123456....
data:image/png;base64,f123456....
In this case the best solution would be that you separate the resources data from the group call. So the response includes only information for sub-requests:
GET /images?car=mustang&viewangle=front
...
HTTP/1.1 200 OK
ETag: "aaaaaa"
a.jpg
b.jpg
c.jpg
d.jpg
e.jpg
That way, every sub-request can be cached separately:
GET /image/?src=a.jpg
If-None-Match: "Akj5odjr"
...
HTTP/1.1 304 Not Modified
Statistics
- First request = 6x 200 OK
- Future requests if group unchanged = 1x 304 Not Modified
- Future requests if one new resource has been added = 2x 200 OK, 5x 304 Not Modified
Now I would tune the API documentation. This means the requester must check if a cache of a sub-request is available before making a call to it. This could be done by providing the ETags (or other hash) in the group request:
GET /images?car=mustang&viewangle=front
...
HTTP/1.1 200 OK
...
ETag: "aaaaaa"
a.jpg;AfewrKJD
b.jpg;Bgnweidk
c.jpg;Ckirewof
d.jpg;Dt34gsd0
e.jpg;Egk29dds
f.jpg;F498wdn4
Now the requester checks the cache and finds out that a.jpg has a new ETag (AfewrKJD, where the cached copy had Akj5odjr) and that f.jpg;F498wdn4 is a new entry. That reduces the future requests:
Statistics
- First request = 6x 200 OK
- Future requests if group unchanged = 1x 304 Not Modified
- Future requests if one new resource has been added = 2x 200 OK
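The cache check the requester has to do against the name;ETag listing can be sketched like this. plan_requests and the in-memory cache dict are assumptions for illustration; a real client would keep its ETags wherever it keeps the cached files.

```python
from typing import Dict, List


def plan_requests(manifest: str, cache: Dict[str, str]) -> List[str]:
    """Return the names whose ETag is missing from, or differs from, the local cache."""
    to_fetch = []
    for line in manifest.splitlines():
        line = line.strip()
        if not line:
            continue
        name, _, etag = line.partition(";")
        if cache.get(name) != etag:
            to_fetch.append(name)
    return to_fetch


manifest = """a.jpg;AfewrKJD
b.jpg;Bgnweidk
c.jpg;Ckirewof
d.jpg;Dt34gsd0
e.jpg;Egk29dds
f.jpg;F498wdn4"""

cache = {
    "a.jpg": "Akj5odjr",  # stale: the listing now says AfewrKJD
    "b.jpg": "Bgnweidk",
    "c.jpg": "Ckirewof",
    "d.jpg": "Dt34gsd0",
    "e.jpg": "Egk29dds",
}

print(plan_requests(manifest, cache))  # ['a.jpg', 'f.jpg']
```

Only the two names returned need a sub-request; the other four are served from the cache without touching the server.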
Conclusion
Finally, you need to think about whether your resources are big enough to be worth putting in sub-requests and how often a requester repeats its group request (so that the cache is actually used). If not, you should include them in the group call, and there is no room for optimization.
P.S. you need to monitor all requesters to be sure all of them use caches. A possible solution would be to ban requesters calling an API URL two or more times without sending an ETag.