How is HTTP Keep Alive implemented? Does it internally use TCP Keep Alive? If not, how does the server detect if the client is dead or alive?
I know this is an old question, but still:
HTTP Keep-Alive is a feature that allows HTTP client (usually browser) and server (webserver) to send multiple request/response pairs over the same TCP connection. This decreases latency for 2nd, 3rd,... HTTP request, decreases network traffic and similar.
TCP keepalive is a totally different beast. It keeps TCP connection opened by sending small packets. Additionally, when the packet is sent this serves as a check so the sender is notified as soon as connection drops (note that this is NOT the case otherwise - until we try to communicate through TCP connection we have no idea if it is ok or not).
To answer your questions about HTTP Keep-Alive:
How is HTTP Keep Alive implemented?
To put it simply, the HTTP server doesn't close the TCP connection after each response but waits some time if some other HTTP request will come over it too. After some timeout it closes it anyway.
Does it internally use TCP Keep Alive?
No, at least I see no point in it.
If not, how does the server detect if the client is dead or alive?
It doesn't - it doesn't need to. If a client sends a request, it will get the response. If the client doesn't send anything over TCP connection (maybe because the connection is dead) then a timeout will close the connection; client will of course notice this and will send request through another TCP connection if needed.
HTTP Keep-Alive is a feature of HTTP protocol. The web-server, implementing Keep-Alive Feature, has to check the connection/socket periodically (for incoming HTTP request) for the time span since it sent the last HTTP response (in case there was corresponding HTTP Request). If no HTTP request is received by the time of the configured keep-alive time (seconds) the web server closes the connection. No further HTTP request will be possible after the 'close' done by Web Server. On the other hand, TCP Keep-Alive is managed by OS in the TCP layer. HTTP Keep-Alive and TCP Keep-Alive is totally unrelated things.
HTTP keep-alive, a.k.a., HTTP persistent connection, is an instruction that allows a single TCP connection to remain open for multiple HTTP requests/responses.
By default, HTTP connections close after each request. When someone visits your site, their browser needs to create new connections to request each of the files that make up your web pages (e.g. images, Javascript, and CSS stylesheets), a process that can lead to high page load times.
Enabling the keep-alive header allows you to serve all web page resources over a single connection. Keep-alive also reduces both CPU and memory usage on your server.
Source: https://www.imperva.com/learn/performance/http-keep-alive/
http keep-alive is just making tcp living longer in order to transfer multi http request.After keep-alive timeout, the tcp connection will be closed.
tcp keep-alive is just a mechanism keeping the tcp connection,or check the tcp connection is not closed
Related
To establish a WebSocket connection, the client sends a WebSocket handshake request, for which the server returns a WebSocket handshake response, as shown in the example below.[30]
Client request (just like in HTTP, each line ends with \r\n and there must be an extra blank line at the end):
GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw==
Sec-WebSocket-Protocol: chat, superchat
Sec-WebSocket-Version: 13
Origin: http://example.com
Server response:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: HSmrc0sMlYUkAGmm5OPpG2HaGWk=
Sec-WebSocket-Protocol: chat
I can't understand it. To initialize the connection HTTP GET request is sent. But what if I don't host HTTP server but I just host WebSocket server? How does it handle HTTP requests then?
By design, the WebSocket protocol handshake uses HTTP so WebSockets can be utilized in existing HTTP servers and other HTTP-based technologies:
The WebSocket Protocol is designed to supersede existing bidirectional communication technologies that use HTTP as a transport layer to benefit from existing infrastructure (proxies, filtering, authentication). Such technologies were implemented as trade-offs between efficiency and reliability because HTTP was not initially meant to be used for bidirectional communication (see [RFC6202] for further discussion). The WebSocket Protocol attempts to address the goals of existing bidirectional HTTP technologies in the context of the existing HTTP infrastructure; as such, it is designed to work over HTTP ports 80 and 443 as well as to support HTTP proxies and intermediaries, even if this implies some complexity specific to the current environment. However, the design does not limit WebSocket to HTTP, and future implementations could use a simpler handshake over a dedicated port without reinventing the entire protocol. This last point is important because the traffic patterns of interactive messaging do not closely match standard HTTP traffic and can induce unusual loads on some components.
But, once the WebSocket handshake is finished, only the WebSocket protocol is used, not HTTP anymore.
So, it doesn't matter if you use an HTTP server with WebSocket support, or a dedicated WebSocket server. ANY WebSocket implementation MUST use HTTP (and all semantics therein, including authentication, redirection, etc) for the initial handshake. It is mandated by the WebSocket protocol specification, RFC 6455.
The opening handshake is intended to be compatible with HTTP-based server-side software and intermediaries, so that a single port can be used by both HTTP clients talking to that server and WebSocket clients talking to that server. To this end, the WebSocket client's handshake is an HTTP Upgrade request
So, a dedicated WebSockets server must be able to handle HTTP requests during the handshake phase, at least. It is not that hard to implement.
Do websocket implementations use http protocol internally?
Yes, initially, then they switch to the webSocket protocol. All webSocket connections start with an HTTP request with a header that requests an upgrade to the webSocket protocol. If the receiving server agrees, then the two sides switch protocols from HTTP to webSocket and from then on the connection uses the webSocket protocol.
So, ALL webSocket servers MUST support that initial HTTP request because that's how all webSocket connections are started.
The webSocket protocol is designed this way for a couple reasons:
So a single host and port can be used for both regular HTTP and webSocket connections. If you want, you can use a single server process to handle both types of connections. If they were on different ports, you would need two separate server processes.
So that webSockets can run on port 80 or 443 (along with an HTTP server) for maximum compatibility with various, already deployed network infrastructure such as corporate proxies, firewalls, etc.... that may only permit traffic on regular HTTP ports.
As you've seen, a webSocket request starts with an HTTP request like this:
GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw==
Sec-WebSocket-Protocol: chat, superchat
Sec-WebSocket-Version: 13
Origin: http://example.com
Note the Upgrade: websocket header. That tells the receiving HTTP server that this is a request for a webSocket connection. If the server wants to accept that connection, it returns a 101 response which tells the client that they can now switch protocols:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: HSmrc0sMlYUkAGmm5OPpG2HaGWk=
Sec-WebSocket-Protocol: chat
After this response, both client and server switch protocols (on the same TCP connection) and from then on they only speak the webSocket protocol and the TCP connection remains open until client or server explicitly close it (typically a long lived connection).
I can't understand it. To initialize the connection HTTP GET request is sent. But what if I don't host HTTP server but I just host WebSocket server?
All webSocket servers have to accept HTTP requests because all webSocket connections start with an HTTP request. So, there is no such thing as a webSocket server that doesn't accept an HTTP request. A pure webSocket only server can only accept HTTP requests that have the Upgrade: webSocket header and fail any other HTTP requests.
How does it handle HTTP requests then?
All webSocket servers expect incoming, new connections to start with HTTP so they must have a simple HTTP server built-in (can parse the initial HTTP headers).
I have the following queries:
1) Does TCP guarantee delivery of packets and thus is thus application level re-transmission ever required if transport protocol used is TCP. Lets say I have established a TCP connection between a client and server, and server sends a message to the client. However the client goes offline and comes back only after say 10 hours, so will TCP stack handle re-transmission and delivering message to the client or will the application running on the server need to handle it?
2) Related to the above question, is application level ACK needed if transport protocol is TCP. One reason for application ACK would be that without it, the application would not know when the remote end received the message. Is there any reason other than that? Meaning is the delivery of the message itself guaranteed?
Does TCP guarantee delivery of packets and thus is thus application level re-transmission ever required if transport protocol used is TCP
TCP guarantees delivery of message stream bytes to the TCP layer on the other end of the TCP connection. So an application shouldn't have to bother with the nuances of retransmission. However, read the rest of my answer before taking that as an absolute.
However the client goes offline and comes back only after say 10 hours, so will TCP stack handle re-transmission and delivering message to the client or will the application running on the server need to handle it?
No, not really. Even though TCP has some degree of retry logic for individual TCP packets, it can not perform reconnections if the remote endpoint is disconnected. In other words, it will eventually "time out" waiting to get a TCP ACK from the remote side and do a few retries. But will eventually give up and notify the application through the socket interface that the remote endpoint connection is in a dead or closed state. Typical pattern is that when a client application detects that it lost the socket connection to the server, it either reports an error to the user interface of the application or retries the connection. Either way, it's application level decision on how to handle a failed TCP connection.
is application level ACK needed if transport protocol is TCP
Yes, absolutely. Most client-server protocols has some notion of a request/response pair of messages. A TCP socket can only indicate to the application if data "sent" by the application is successfully queued to the kernel's network stack. It provides no guarantees that the application on top of the socket on the remote end actually "got it" or "processed it". Your protocol on top of TCP should provide some sort of response indication when ever a message is processed. Use HTTP as a good example here. Imagine if an application would send an HTTP POST message to the server, but there was not acknowledgement (e.g. 200 OK) from the server. How would the client know the server processed it?
In a world of Network Address Translators (NATs) and proxy servers, TCP connections that are idle (no data between each other) can fail as the NAT or proxy closes the connection on behalf of the actual endpoint because it perceives a lack of data being sent. The solution is to have some sort of periodic "ping" and "pong" protocol by which the applications can keep the TCP connection alive in the absences of having no data to send.
Since HTTP is an application layer protocol using TCP, if I request to download a big file via HTTP here is what happens:
My HTTP request is going to be fragmented into TCP packets, and TCP is going to do a 3-way handshake and send my request packets to server. My question is the response from server ( the file) going to pass through old TCP connection, or server initiates another Transport layer connection with my browser and another 3-way handshake in order to send me the file?
The file transfer will use the existing connection. that will however make the connection busy until the file is transferred.
So if the user clicks on a link while the file is downloaded the connection is then busy. The web browser will therefore have to open an additional connection to be able to request the clicked url.
In HTTP/1.1 existing connections will be used if idle (idle connections will be closed when a period of time have passed).
I'm trying to implement an HTTP proxy for learning and debug purpose.
The support of plain HTTP transactions was pretty straightforward to implement and now I'm looking to implement support for SSL/TLS tunnels.
From RFC 7230:
A "tunnel" acts as a blind relay between two connections without
changing the messages. Once active, a tunnel is not considered a party
to the HTTP communication, though the tunnel might have been initiated
by an HTTP request.
It's not very clear whether I shall build the TLS socket from the socket on which the HTTP CONNECT transaction took place. I assume it is the case, since HTTP is stateless, but I just want to be sure.
When a client connects to an HTTP proxy, CONNECT is used to have the proxy establish a persistent TCP connection with the target TCP server. Then the proxy blindly passes data as-is back and forth between the two TCP connections until either the client or server disconnects, then the proxy disconnects the other party. This allows the client to send data to the server and vice versa, such as TLS packets. This is important so the TLS server can verify the client's identity during the TLS handshake.
So, to answer your question - yes, the client must establish a TLS session with the target server using the same TCP socket that it used to issue the CONNECT request on. Once the CONNECT request has succeeded, the client can treat the existing TCP connection as if it had connected to the server directly. The proxy is transparent at that point, neither party needs to care that it is present.
I am exploring HTTP 1.1 persistent connection over single TCP socket for multiple HTTP request from client side. One thing I observed in wireshark is that, after each request-response my client sends an ACK to server. Is this ACK message call right according to protocol standard? Is there any way I can skip this ACK call. I compared the communication behaviour of my client with browser's communication pattern. I think browser does not send any tcp messages to server once tcp handshake is completed to establish connection.
ACK is part of TCP. You can't have a TCP connection without ACK, that's how it works. Data that is received is ACK'ed so the sender does not retransmit it.
HTTP is not dependent on TCP, you could implement HTTP on other protocols. The two protocols should be seen as separate layers, and should not influence each others.