How exactly does the HTTP protocol work? [duplicate] - sockets

I want to know how HTTP works. When you type "www.youtube.com" in a browser, the following steps occur:
- DNS look-up for "www.youtube.com" (suppose you get 1.1.1.1)
- Open a socket to 1.1.1.1 on port 80 and send an HTTP GET request on it.
- Receive a response on that socket.
Am I right, or are there any other steps?

You're correct, it's that simple though not dead-on in syntax.
Resolve domain if not an IP (DNS query)
Open a TCP connection to port 80 by default (443 for HTTPS), unless a port is given after a colon (http://host:port/)
Send request (#1) for http://host/uri/here?other=stuff&too
Receive response (#2)
Example request (#1). Lines are separated by CRLF (carriage return + line feed), and the header block must end with two consecutive CRLFs, i.e. an empty line:
GET /uri/here?other=stuff&too HTTP/1.1
Host: host
Other: Headers, too. Such as cookies
Header: Value
Example response: (#2)
HTTP/1.1 200 OK
Other: Headers, too. Such as cookies
Header: Value
<html>Actual HTTP payload is here, could be HTML data, downloaded file data, etc.
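The steps above can be sketched in Python with a plain socket. This is only a rough sketch, not a real client: the helper names (build_request, split_response, http_get) are made up for illustration, and it assumes plain HTTP (no TLS) and a server that honours Connection: close. It ignores redirects and chunked encoding.

```python
import socket


def build_request(host: str, path: str = "/") -> bytes:
    """Build a bare HTTP/1.1 GET request (hypothetical helper)."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",  # ask the server to close when it is done
    ]
    # Lines are joined with CRLF; the header block ends with an empty line.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def split_response(raw: bytes):
    """Split a raw response into (header text, body bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    return head.decode("iso-8859-1"), body


def http_get(host: str, path: str = "/", port: int = 80):
    # create_connection resolves the name (DNS) and opens the TCP socket.
    with socket.create_connection((host, port), timeout=10) as sock:
        # Send the request (#1).
        sock.sendall(build_request(host, path))
        # Receive the response (#2) until the server closes the connection.
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return split_response(b"".join(chunks))
```

Calling http_get("example.com") needs network access and returns the header text and the body separately; the first line of the header text is the status line.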


Identical HTTP GET has status 200 on first request but 304 after that [closed]

I'm confused by status code 304. The description says "The HTTP 304 Not Modified client redirection response code indicates that there is no need to retransmit the requested resources". What does "retransmit the requested resources" refer to?
Context:
I'm using JSON Server to mock the back end.
The first time I request the URL, the response is fine (200).
Subsequent requests result in status 304.
The description of HTTP 304 sounds very vague to me: why would it say there's no need to retransmit the requested resources when I didn't cache the data originating from db.json in the first place?
Client code:
fetch(`http://localhost:4000/profiles`)
.then(res => res.json())
.then(data => setProfiles(data));
Server log:
GET /profiles 200 1.970ms - - # The first request is status 200
GET /profiles 304 2.609ms - - # The subsequent requests are status 304
GET /profiles 304 2.951ms - -
GET /profiles 304 2.903ms - -
# ...
To rephrase "the HTTP 304 Not Modified client redirection response code indicates that there is no need to retransmit the requested resources":
You request a resource (the data from /profiles) once, and the server sends you that data with a 200 OK status. Now when you request the same resource (/profiles) and its data hasn't changed from the first request, the server will respond with a 304 Not Modified since it assumes you already have the data — it hasn't been modified since the request that was 200 OK.
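You say you didn't cache the data from the first request, but the server assumes you did (as the MDN docs for 304 Not Modified say, "It is an implicit redirection to a cached resource") and is trying to save resources. A server only sends 304 in reply to a conditional request, that is, one carrying If-None-Match or If-Modified-Since, so something on the client side, typically the browser's HTTP cache behind fetch, kept the first response and sent that header for you.
You'll have to determine whether it's best in your specific situation to
cache the first response,
read a possible Expires header and only re-request after that moment in time, or
omit the If-None-Match header (which carries the ETag) to tell the server that you don't have a cached version of the resource to use.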
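To make the 200-then-304 sequence concrete, here is a minimal sketch of the decision a server makes on a conditional GET. It is not JSON Server's actual implementation: the names make_etag and handle_get, and the choice to derive the ETag from a SHA-256 hash of the body, are assumptions for illustration only.

```python
import hashlib


def make_etag(body: bytes) -> str:
    # Assumption: derive the validator from a hash of the body.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def handle_get(request_headers: dict, body: bytes):
    """Return (status, response headers, response body) for a GET."""
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        # The client already holds this exact version: send no body.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


data = b'[{"id": 1, "name": "Ann"}]'

# First request: no validator, so the full body is sent.
status, headers, _ = handle_get({}, data)
print(status)  # 200

# Second request: the client (or its HTTP cache) echoes the ETag back.
status, _, sent = handle_get({"If-None-Match": headers["ETag"]}, data)
print(status, len(sent))  # 304 0

# A request without If-None-Match gets the full body again.
print(handle_get({}, data)[0])  # 200
```

The third call corresponds to the last option in the list: without the validator header the server has no reason to assume a cached copy exists.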

REST API Design: Respond with 406 or 404 if a resource is not available in a requested representation

We have a REST API to fetch binary files from the server.
The requests look like
GET /documents/e62dd3f6-18b0-4661-92c6-51c7258f9550 HTTP/1.1
Accept: application/octet-stream
For every response indicating an error, we'd like to give a reason in JSON.
The problem is that the error response then has a different content type from the one the client requested.
But what kind of response should the server produce?
Currently, it responds with a
HTTP/1.1 406 Not Acceptable
Content-Type: application/json
{
"reason": "blabla",
...
}
Which seems wrong to me, as the underlying issue is that the resource does not exist, not that the client requested the wrong content type.
But what would be the right way to deal with such situations?
Is it OK to respond with 404 + application/json although application/octet-stream was requested?
Is it OK to respond with 406 + application/json, as the client did not specify application/json as an acceptable type?
Should the spec be extended so that the client uses the q parameter, for example application/octet-stream, application/json;q=0.1?
Other options?
If no representation can be found for the requested resource (because it doesn't exist or because the server wishes to "hide" its existence), the server should return 404.
If the client requests a particular representation in the Accept header and the server is not able to provide such a representation, the server could either:
Return 406 along with a list of the available representations. (see note** below)
Simply ignore the Accept header and return a default representation of the resource.
See the following quote from RFC 7231, the document that defines the content and semantics of the HTTP/1.1 protocol:
A request without any Accept header field implies that the user agent will accept any media type in response. If the header field is present in a request and none of the available representations for the response have a media type that is listed as acceptable, the origin server can either honor the header field by sending a 406 (Not Acceptable) response or disregard the header field by treating the response as if it is not subject to content negotiation.
Mozilla also recommends the following regarding 406:
In practice, this error is very rarely used. Instead of responding using this error code, which would be cryptic for the end user and difficult to fix, servers ignore the relevant header and serve an actual page to the user. It is assumed that even if the user won't be completely happy, they will prefer this to an error code.
** Regarding the list of available representations, see this answer.
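The decision described above can be sketched roughly as follows. This is a simplified illustration, not part of any framework: choose_status, the in-memory documents dict and the strict flag are hypothetical, and the Accept parsing is deliberately naive (q-values and partial wildcards such as application/* are ignored). Sending the error body as application/json follows the question's own setup.

```python
def accepted_types(accept_header):
    """Very naive Accept parsing: drop parameters such as ;q=0.1."""
    if not accept_header:
        return ["*/*"]  # no Accept header means any media type is fine
    return [part.split(";")[0].strip() for part in accept_header.split(",")]


def choose_status(doc_id, accept_header, documents, strict=False):
    """Return (status, content type) for GET /documents/<doc_id>."""
    if doc_id not in documents:
        # The resource does not exist: 404, whatever the Accept header
        # says. The JSON body describes the error, not the resource.
        return 404, "application/json"
    available = "application/octet-stream"
    wanted = accepted_types(accept_header)
    if available in wanted or "*/*" in wanted:
        return 200, available
    if strict:
        # Option 1: honour the Accept header and refuse.
        return 406, "application/json"
    # Option 2: disregard the header and serve the default representation.
    return 200, available
```

With this shape, 406 is reserved for the case where the document exists but cannot be served in any acceptable type, which is the distinction the question was missing.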

Accessing Docker daemon with Rust doesn't work [duplicate]

I'm trying to issue a GET command to my local server using netcat by doing the following:
echo -e "GET / HTTP/1.1\nHost: localhost" | nc localhost 80
Unfortunately, I get an HTTP/1.1 400 Bad Request response for this. What, at the very minimum, is required for an HTTP request?
If the request is "GET / HTTP/1.0\r\n\r\n", then the response contains header as well as body, and the connection closes after the response.
If the request is "GET / HTTP/1.1\r\nHost: host:port\r\nConnection: close\r\n\r\n", then the response contains header as well as body, and the connection closes after the response.
If the request is "GET / HTTP/1.1\r\nHost: host:port\r\n\r\n", then the response contains header as well as body, and the connection will not close even after the response.
If the request is "GET /\r\n\r\n", then the response contains no header and only body, and the connection closes after the response.
If the request is "HEAD / HTTP/1.0\r\n\r\n", then the response contains only header and no body, and the connection closes after the response.
If the request is "HEAD / HTTP/1.1\r\nHost: host:port\r\nConnection: close\r\n\r\n", then the response contains only header and no body, and the connection closes after the response.
If the request is "HEAD / HTTP/1.1\r\nHost: host:port\r\n\r\n", then the response contains only header and no body, and the connection will not close after the response.
It must use CRLF line endings, and it must end in \r\n\r\n, i.e. a blank line. This is what I use:
printf 'GET / HTTP/1.1\r\nHost: www.example.com\r\nConnection: close\r\n\r\n' |
nc www.example.com 80
Additionally, I prefer printf over echo, and I add an extra header to have the server close the connection, but those aren’t needed.
See Wiki: HTTP Client Request (Example).
Note the following:
A client request (consisting in this case of the request line and only one header) is followed by a blank line, so that the request ends with a double newline, each in the form of a carriage return followed by a line feed. The "Host" header distinguishes between various DNS names sharing a single IP address, allowing name-based virtual hosting. While optional in HTTP/1.0, it is mandatory in HTTP/1.1.
The absolute minimum (if removing the Host is allowed ;-) is then GET / HTTP/1.0\r\n\r\n.
Happy coding
I was able to get a response from my Apache server with only the requested document, no response header, with just
GET /\r\n
If you want response headers, including the status code, you need one of the other answers here though.
A 400 Bad Request response does not by itself imply that your request violates HTTP. The server could very well be giving this response for another reason.
As far as I know the absolute minimum valid HTTP request is:
GET / HTTP/1.0\r\n\r\n
Please, please, please, do not implement your own HTTP client without first reading the relevant specs. Please read and make sure that you've fully understood at least RFC 2616. (And if you're ambitious, RFC 7230 through 7235).
While HTTP looks like an easy protocol, there are actually a number of subtle points about it. Anyone who has written an HTTP server will tell you about the workarounds they had to implement in order to deal with incorrect but widely deployed clients. Unless you're into reading specifications, please use a well-established client library; Curl is a good choice, but I'm sure there are others.
If you're going to implement your own:
do not use HTTP/0.9;
HTTP/1.0 requires the request line and the empty line;
in HTTP/1.1, the Host: header is compulsory in addition to the above.
Omitting the Host: header in HTTP/1.1 is the most common cause of 400 errors.
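The rules in this list can be turned into a small sanity check. The following is only a sketch: check_request is a hypothetical helper, and it covers nothing beyond the points mentioned in this thread (CRLF line endings, the terminating empty line, a three-part request line, and Host for HTTP/1.1).

```python
def check_request(raw: bytes) -> list:
    """Return a list of problems found in a raw HTTP request (may be empty)."""
    problems = []
    if not raw.endswith(b"\r\n\r\n"):
        problems.append("request does not end with an empty line (CRLF CRLF)")
    # Any LF left after removing every CRLF pair is a bare line feed.
    if b"\n" in raw.replace(b"\r\n", b""):
        problems.append("bare LF line ending found; use CRLF")
    lines = raw.replace(b"\r\n", b"\n").split(b"\n")
    parts = lines[0].split(b" ")
    if len(parts) != 3:
        # Also rejects the HTTP/0.9 form "GET /".
        problems.append("request line should be: METHOD target HTTP/x.y")
    elif parts[2] == b"HTTP/1.1":
        names = [line.split(b":")[0].strip().lower() for line in lines[1:] if line]
        if b"host" not in names:
            problems.append("HTTP/1.1 request has no Host header")
    return problems


# What echo -e "GET / HTTP/1.1\nHost: localhost" produces (with its newline):
print(len(check_request(b"GET / HTTP/1.1\nHost: localhost\n")))  # 2
print(check_request(b"GET / HTTP/1.0\r\n\r\n"))  # []
print(check_request(b"GET / HTTP/1.1\r\n\r\n"))
# ['HTTP/1.1 request has no Host header']
```

The first call reports both problems with the netcat command from the question: LF instead of CRLF, and no terminating empty line.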
You should add an empty line: \r\n\r\n
http://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol#Client_request
The really REALLY BARE minimum, is not using netcat, but using bash itself:
user@localhost:~$ exec 3<>/dev/tcp/127.0.0.1/80
user@localhost:~$ echo -e "GET / HTTP/1.1\n" >&3
user@localhost:~$ cat <&3
HTTP/1.0 200 OK
Server: SimpleHTTP/0.6 Python/2.7.6
Date: Mon, 13 Oct 2014 17:55:55 GMT
Content-type: text/html; charset=UTF-8
Content-Length: 514
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"><html>
<title>Directory listing for /</title>
<body>
<h2>Directory listing for /</h2>
<hr>
<ul>
</ul>
<hr>
</body>
</html>
user@localhost:~$

Watson Speech-to-Text register_callback returns only 400s

The Watson Speech-to-Text asynchronous HTTP interface allows one to register a callback url through a call to register_callback. This call is clearly not working; for illustration, please see these six lines of code.
# Illustration of how I can't get the Watson Speech-to-Text
# register_callback call to work.
r = requests.post(
"https://stream.watsonplatform.net/speech-to-text/api/v1/register_callback?{0}".format(
urllib.urlencode({ "callback_url": callback_url })),
auth=(watson_username, watson_password),
data="{}")
print(r.status_code)
print(pprint.pformat(r.json()))
# This outputs:
# 400
# {u'code': 400,
# u'code_description': u'Bad Request',
# u'error': u"unable to verify callback url 'https://xuyv2beqpj.execute-api.us-east-1.amazonaws.com/prod/SpeechToTextCallback' , server responded with status code: 400"}
# and no http call is logged on the server.
r = requests.get(
callback_url, params=dict(challenge_string="what does redacted mean?"))
print(r.status_code)
print(r.text)
# This outputs:
# 200
# what does redacted mean?
# and an HTTP GET is logged on the server.
I first call register_callback with a perfectly valid callback_url parameter, in exactly the way the documentation describes. This call returns a 400 and, according to my callback URL server logs, the callback URL never receives an HTTP request. Then I GET the callback URL myself with a challenge_string. Not only does the callback URL respond with the right output, but a log entry appears on my server indicating that the URL received an HTTP request. I conclude that register_callback is not working.
Answer:
We identified the issue on our end: the server that makes the outbound calls to your URL did not support the SSL encryption method that your callback server uses. We have fixed that and we are in the process of pushing to the production environment very soon.
Also FYI:
The error message with 400 indicates that the callback URL does not meet
the requirements or does not exist. Please refer to the details in the
Speech-to-Text service API documentation,
http://www.ibm.com/watson/developercloud/speech-to-text/api/v1/?curl#register_callback
If the service does not receive a response with a response code of 200
and a body that echoes a random alphanumeric challenge string from the
callback URL within 5 seconds, it does not whitelist the URL; it
sends response code 400 in response to the registration request.
We just fixed the issue you reported. The problem was on our end: the servers responsible for making the callback to the server you set up did not support the cipher suites needed for establishing the SSL connection. We just updated the servers, and we are happy to learn that it is now working for you :)
Dani
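For reference, the verification step quoted above (reply 200 and echo the challenge string within 5 seconds) amounts to something like the sketch below. It is written for Python 3, unlike the question's Python 2 snippet. The handler name handle_callback_get, the text/plain content type and the 400 for a missing parameter are assumptions; only the challenge_string parameter name and the echo-with-200 behaviour come from this thread. It would not have helped with the cipher suite problem, which was on the service's side.

```python
from urllib.parse import parse_qs


def handle_callback_get(query_string: str):
    """Return (status, content type, body) for the verification GET."""
    params = parse_qs(query_string)
    challenge = params.get("challenge_string", [None])[0]
    if challenge is None:
        # Assumption: reject requests that carry no challenge.
        return 400, "text/plain", "missing challenge_string"
    # Echo the challenge back verbatim with a 200, and do it quickly:
    # the quoted documentation allows the callback 5 seconds.
    return 200, "text/plain", challenge


print(handle_callback_get("challenge_string=abc123"))
# (200, 'text/plain', 'abc123')
```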

How to C - windows socket reading textfile content

I am having problems reading the content of a text file via Winsock in C. Does anyone have any idea how it should work? When I try to GET the HTTP headers from Google it works, but when I try it against my XAMPP machine,
it just gives me 400 Bad Request.
HTTP/1.1 400 Bad Request
char *message = "GET / HTTP/1.1\r\n\r\n";
OK, the reason I was receiving 400 Bad Request from my localhost via Winsock was my HTTP request. I just changed the 1.1 to 1.0 and it worked! What I want now is to print only the content of the text file, not the whole banner. :)
Read RFC 2616, in particular sections 5.2 and 14.23. An HTTP 1.1 request is required to include a Host header, and an HTTP 1.1 server is required to send a 400 reply if the header is missing and no host is specified in the request line.
char *message = "GET / HTTP/1.1\r\nHost: hostnamehere\r\n\r\n";
As for the text content, you need to read from the socket until you encounter a \r\n\r\n sequence (which terminates the response headers), then process the headers, then read the text content accordingly. The response headers tell you how to read the raw bytes of the text content and when to stop reading (refer to RFC 2616 section 4.4 for details). Once you have the raw bytes, the Content-Type header tells you how to interpret the raw bytes (data type, charset, etc).
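The reading procedure described above looks roughly like this. The sketch is in Python rather than C to keep it short, but the same loop maps directly onto recv() in Winsock. read_response is a hypothetical helper; it only handles Content-Length and read-until-close, not Transfer-Encoding: chunked or the other cases in RFC 2616 section 4.4.

```python
def read_response(recv, bufsize=4096):
    """Read one HTTP response using recv(n) -> bytes (e.g. sock.recv).

    Returns (status line, header dict, body bytes).
    """
    buf = b""
    # 1. Read until the blank line that terminates the headers.
    while b"\r\n\r\n" not in buf:
        chunk = recv(bufsize)
        if not chunk:
            raise ConnectionError("connection closed before end of headers")
        buf += chunk
    head, _, body = buf.partition(b"\r\n\r\n")

    # 2. Process the headers.
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    # 3. Read the body as the headers dictate.
    if "content-length" in headers:
        length = int(headers["content-length"])
        while len(body) < length:
            chunk = recv(bufsize)
            if not chunk:
                break  # peer closed early; return what arrived
            body += chunk
        body = body[:length]
    else:
        # No Content-Length: this sketch reads until the peer closes.
        while True:
            chunk = recv(bufsize)
            if not chunk:
                break
            body += chunk
    return status_line, headers, body
```

With a connected socket this would be called as read_response(sock.recv). The returned body is the text file's content without the "banner", and headers["content-type"] says how to decode it.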