I have a website being served from a custom webserver, and it loads and works fine when loaded from a laptop/desktop browser, but loads inconsistently on mobile browsers. (In my case I tested specifically Samsung Internet and Chrome on Android)
(The exact behaviour is: load the web page, refresh, and then after a couple of refreshes it will sometimes not be able to load a background image, or any resource on the page at all - but only on mobile browsers)
In case this was just some cached data issue, I've cleared all browser data, restarted my phone, asked friends to try on their devices etc, but I've only been able to reproduce this on mobile devices.
My web server is written using liburing, with nginx as a reverse proxy, though I doubt that is the issue.
I read Can Anyone Explain These Long Network Stalled Times? and it occurred to me that the issue could be that I use a separate HTTP request for each resource (I've not implemented Connection: Keep-Alive). However, I also get this issue on WiFi, and I get it even when loading a single asset (such as a background image).
Additional possibly relevant info:
I was initially having a similar issue on desktop as well, and I fixed it by using shutdown() before calling close() on the HTTP requests
I'm using the following response headers:
Keep-Alive: timeout=0, max=0
Connection: close
Cache-Control: no-cache
I'm using the following socket options:
SO_REUSEADDR (mainly for debug convenience)
SO_REUSEPORT (sockets in multiple threads bind to and listen on the same port)
SO_KEEPALIVE, TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT (to kill off inactive clients)
Oddly enough though I think this disappears for a while after restarting my phone
I have tried not using nginx, instead using WolfSSL for TLS, and I get the same issue
I am inclined to think that this could be an issue with what headers I'm setting in responses (or possibly some HTTPS specific detail I'm missing?), but I'm not sure
And here's the actual site if anyone wants to verify the issue https://servertest.erewhon.xyz/
It looks to me like your server does not do a proper TLS shutdown, but is simply shutting down the underlying TCP connection. This causes your server to send a RST (packet 28) when the client is doing the proper TLS shutdown by sending the appropriate close notify TLS alert (packet 27).
This RST will result in a connection close on the client side. Depending on how fast the client has processed the incoming data this can result in abandoning still unread data in the TCP socket buffer, thus causing the problems you see.
The difference in behavior between mobile and desktop might just be caused by the performance of the systems, and maybe by the underlying TCP stack. But even if the desktop works fine, your web server behaves incorrectly.
For details on how the connection close should happen at the HTTP level see RFC 7230 section 6.6. Note especially the following parts of this section:
If a server performs an immediate close of a TCP connection, there is
a significant risk that the client will not be able to read the last
HTTP response. If the server receives additional data from the
client on a fully closed connection, such as another request that was
sent by the client before receiving the server's response, the
server's TCP stack will send a reset packet to the client;
unfortunately, the reset packet might erase the client's
unacknowledged input buffers before they can be read and interpreted
by the client's HTTP parser.
To avoid the TCP reset problem, servers typically close a connection
in stages. First, the server performs a half-close by closing only
the write side of the read/write connection. The server then
continues to read from the connection until it receives a
corresponding close by the client, or until the server is reasonably
certain that its own TCP stack has received the client's
acknowledgement of the packet(s) containing the server's last
response. Finally, the server fully closes the connection.
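The staged close described in the RFC passage above can be sketched with POSIX sockets roughly as follows. This is a minimal sketch, not the asker's code: the name lingering_close and the per-wait timeout are assumptions made for illustration, and on a TLS connection the TLS library's own shutdown (sending close_notify) would have to happen before this step, which is not shown here.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: close fd in stages.
 * Returns 0 if the peer closed its side first, 1 if we gave up after
 * timeout_ms of silence, -1 on error. The descriptor is always closed.
 * Note: timeout_ms applies to each wait, not to the whole drain, so a
 * client that keeps sending can prolong it; a real server would also
 * enforce an overall deadline. */
int lingering_close(int fd, int timeout_ms)
{
    char discard[4096];
    int result;

    assert(fd >= 0 && timeout_ms >= 0);

    /* Stage 1: half-close. Our FIN goes out, but we can still read. */
    if (shutdown(fd, SHUT_WR) < 0) {
        close(fd);
        return -1;
    }

    /* Stage 2: read and discard until the peer closes or we time out. */
    for (;;) {
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        int ready = poll(&pfd, 1, timeout_ms);
        if (ready < 0) {
            if (errno == EINTR)
                continue;
            result = -1;
            break;
        }
        if (ready == 0) {          /* nothing arrived within timeout_ms */
            result = 1;
            break;
        }
        ssize_t n = recv(fd, discard, sizeof discard, 0);
        if (n == 0) {              /* peer closed its side */
            result = 0;
            break;
        }
        if (n < 0 && errno != EINTR) {
            result = -1;
            break;
        }
    }

    /* Stage 3: full close. */
    close(fd);
    return result;
}
```

A server would call something like lingering_close(client_fd, 2000) after writing the last response, in place of a bare close(). The 2000 ms value is only a placeholder.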
When establishing a WebSocket connection between a client and a server, the connection can close unexpectedly for several reasons:
Inactivity on the TCP channel.
A server issue that causes the connection to close.
A client crash or reload/refresh.
I am looking for a way of dealing with such situations or, at least, of knowing that they occurred.
When reading about WebSocket close, I understood that the WebSocket protocol supports server-initiated ping/pong, which the server can use to find out whether a client has crashed (client-initiated ping/pong is not supported). Is this the best way to deal with a client crash?
Also, I see in the spec that on the client side we can listen to the onClose event, and that there are several codes that explain why the connection was closed.
When the server crashes, is the onClose event always called?
Both server and clients can send pings, however there is no method in the Javascript API to send such control frames.
A common approach is to send protocol level pings from the server to client regularly. If the server does not get a pong frame in a given time, the server disconnects the client.
However, clients should also know if the server is unreachable or the connection is half open, so an application-level ping/pong (i.e. some user data representing a ping or pong) would allow both server and client to figure out whether the other end is no longer reachable. As before, the server can send pings and expect pongs within a given time, but the client can also expect pings within a given time, or otherwise consider itself disconnected and try to reconnect.
For close reasons, it is worth checking: getting the reason why websockets closed
If the server crashes and the connection remains half open, you will have to detect this situation yourself and call .close() on the WebSocket object, and then the onclose event will be called.
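The timing logic behind such an application-level ping/pong is independent of the transport, so it can be sketched in plain C. This is only an illustration under assumptions: the heartbeat struct and its function names are hypothetical, timestamps are whole seconds supplied by the caller (for example from a monotonic clock), and actually sending the ping message is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping for one connection. All times are in seconds. */
struct heartbeat {
    long last_heard;     /* when anything was last received from the peer */
    long last_ping;      /* when we last sent a ping */
    long ping_interval;  /* how often to ping */
    long peer_timeout;   /* silence longer than this means "disconnected" */
};

void heartbeat_init(struct heartbeat *hb, long now,
                    long ping_interval, long peer_timeout)
{
    assert(hb != 0 && ping_interval > 0 && peer_timeout >= ping_interval);
    hb->last_heard = now;
    hb->last_ping = now;
    hb->ping_interval = ping_interval;
    hb->peer_timeout = peer_timeout;
}

/* Call whenever a pong (or any other message) arrives from the peer. */
void heartbeat_on_message(struct heartbeat *hb, long now)
{
    hb->last_heard = now;
}

/* True when it is time to send the next ping. */
bool heartbeat_should_ping(const struct heartbeat *hb, long now)
{
    return now - hb->last_ping >= hb->ping_interval;
}

/* Call after a ping has been written to the connection. */
void heartbeat_ping_sent(struct heartbeat *hb, long now)
{
    hb->last_ping = now;
}

/* True when the peer has been silent for longer than peer_timeout. */
bool heartbeat_peer_dead(const struct heartbeat *hb, long now)
{
    return now - hb->last_heard > hb->peer_timeout;
}
```

Either side can run the same check from a periodic timer: send a ping when heartbeat_should_ping() is true, and close the connection (then reconnect, on the client) when heartbeat_peer_dead() is true.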
RFC 7230 defines HTTP/1.1 protocol and it has an interesting passage in 6.6, "Connection management. Tear-down":
To avoid the TCP reset problem, servers typically close a connection
in stages. First, the server performs a half-close by closing only the
write side of the read/write connection. The server then continues to
read from the connection until it receives a corresponding close by
the client, or until the server is reasonably certain that its own TCP
stack has received the client's acknowledgement of the packet(s)
containing the server's last response. Finally, the server fully
closes the connection.
Basically it boils down to the following:
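shutdown(s, SD_SEND);  /* half-close: stop sending, keep receiving */
/* read and discard until the peer closes (0) or an error occurs (SOCKET_ERROR) */
while (recv(s, throwaway_buffer, throwaway_buffer_len, 0) > 0);
closesocket(s);        /* full close */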
which is the standard way of doing the graceful shutdown. However, it also acknowledges that a misbehaving client may exist (that keeps sending requests even after receiving a response with Connection: close header), and that the server has to cope with it by resetting the connection after it's sure the client has received the last response.
However, the socket interface doesn't seem to provide the functionality to learn whether all data passed to send have been actually sent and ACK'd by the remote host. Is it actually there? Without it, all I can think about is to set up a timer of sorts, and call recv until either it signals that the remote host has closed connection or the time is out, whichever comes first. But what would be the appropriate timeout? Is 60 seconds okay?
The sockets interface provides a means for this in the little-used and poorly understood SO_LINGER option. Among other things, it lets you define a timeout during which close(), and possibly shutdown(), will block while pending data is being sent. It is of little practical use and, as I've said, it is rarely used, or at least rarely used correctly.
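For reference, setting SO_LINGER on a POSIX socket looks roughly like this. It is a minimal sketch: set_linger is a hypothetical helper name, and how close() behaves while lingering (especially on non-blocking sockets) varies between platforms, so the comments describe the intent rather than a guarantee.

```c
#include <assert.h>
#include <sys/socket.h>

/* Hypothetical helper: ask close() to linger for up to `seconds`
 * while pending data is being sent.
 * Returns 0 on success, -1 on error (errno is set by setsockopt).
 * Caution: l_onoff = 1 with l_linger = 0 requests an abortive close
 * (the connection is reset instead of being closed gracefully). */
int set_linger(int fd, int seconds)
{
    struct linger lg;

    assert(fd >= 0 && seconds >= 0);
    lg.l_onoff = 1;          /* enable lingering */
    lg.l_linger = seconds;   /* timeout in seconds */
    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof lg);
}
```

The option would be set once on the connected socket, before the final close().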
Keep-alives were added to HTTP to basically reduce the significant
overhead of rapidly creating and closing socket connections for each
new request. The following is a summary of how it works within HTTP
1.0 and 1.1:
HTTP 1.0 The HTTP 1.0 specification does not really delve into how
Keep-Alive should work. Basically, browsers that support Keep-Alive
appended an additional header to the request as [edited for clarity] explained below:
When the server processes the request and
generates a response, it also adds a header to the response:
Connection: Keep-Alive
When this is done, the socket connection is
not closed as before, but kept open after sending the response. When
the client sends another request, it reuses the same connection. The
connection will continue to be reused until either the client or
the server decides that the conversation is over, and one of them drops the connection.
The above explanation comes from here. But I don't understand one thing
When this is done, the socket connection is not closed as before, but
kept open after sending the response.
As I understand it, we just send TCP packets to make requests and responses. How does this socket connection help, and how does it work? We still have to send packets, so how can it establish a persistent connection? It seems so unreal.
There is overhead in establishing a new TCP connection (DNS lookups, TCP handshake, SSL/TLS handshake, etc). Without a keep-alive, every HTTP request has to establish a new TCP connection, and then close the connection once the response has been sent/received. A keep-alive allows an existing TCP connection to be re-used for multiple requests/responses, thus avoiding all of that overhead. That is what makes the connection "persistent".
In HTTP 0.9 and 1.0, by default the server closes its end of a TCP connection after sending a response to a client. The client must close its end of the TCP connection after receiving the response. In HTTP 1.0 (but not in 0.9), a client can explicitly ask the server not to close its end of the connection by including a Connection: keep-alive header in the request. If the server agrees, it includes a Connection: keep-alive header in the response, and does not close its end of the connection. The client may then re-use the same TCP connection to send its next request.
In HTTP 1.1, keep-alive is the default behavior, unless the client explicitly asks the server to close the connection by including a Connection: close header in its request, or the server decides to include a Connection: close header in its response.
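The rules above amount to a small decision a server makes for each request. A minimal sketch, assuming a hypothetical function should_keep_alive and a simplified Connection header that holds a single token (real headers may carry a comma-separated list, which this does not parse):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* http_minor: 0 for HTTP/1.0, 1 for HTTP/1.1.
 * connection: value of the request's Connection header, or NULL if absent.
 * Returns true if the server may keep the TCP connection open. */
bool should_keep_alive(int http_minor, const char *connection)
{
    assert(http_minor == 0 || http_minor == 1);

    if (connection != NULL) {
        /* An explicit header overrides the version's default. */
        if (strcasecmp(connection, "close") == 0)
            return false;
        if (strcasecmp(connection, "keep-alive") == 0)
            return true;
    }
    /* Defaults: HTTP/1.0 closes, HTTP/1.1 keeps the connection open. */
    return http_minor == 1;
}
```

The server can still decide to close anyway, in which case it announces that with Connection: close in its response.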
Let's make an analogy. HTTP consists of sending a request and getting a response. This is similar to asking someone a question and receiving an answer.
The problem is that the question and the answer need to go through the network. To communicate through the network, TCP (sockets) is used. That's similar to using the phone to ask a question to someone and having this person answer.
With HTTP 1.0, loading a page that contains two images, for example, consists of the following:
make a phone call
ask for the page
get the page
end the phone call
make a phone call
ask for the first image
get the first image
end the phone call
make a phone call
ask for the second image
get the second image
end the phone call
Making a phone call and ending it takes time and resources. Control data (like the phone number) must transit over the network. It would be more efficient to make a single phone call to get the page and the two images. That's what keep-alive allows doing. With keep-alive, the above becomes
make a phone call
ask for the page
get the page
ask for the first image
get the first image
ask for the second image
get the second image
end the phone call
This is indeed a networking question, but it may be appropriate here after all.
The confusion arises from distinction between packet-oriented and stream-oriented connections.
The Internet is often called a "TCP/IP" network. At the low level (IP, Internet Protocol) the Internet is packet-oriented. Hosts send packets to other hosts.
However, on top of IP we have TCP (Transmission Control Protocol). The entire purpose of this layer of the internet is to hide the packet-oriented nature of the underlying medium and to present the connection between two hosts (hosts and ports, to be more correct) as a stream of data, similar to a file or a pipe. We can then open a socket in the OS API to represent that connection, and we can treat that socket as a file descriptor (literally an FD in Unix, very similar to file HANDLE in Windows).
Most of the other Internet client-server protocols (HTTP, Telnet, SSH, SMTP) are layered on top of TCP. Thus a client opens a connection (a socket), writes its request to the socket (the request is transmitted as one or more packets in the underlying IP), reads the response from the socket (the response can contain data from multiple IP packets as well) and then... Then the choice is to keep the connection open for the next request or to close it. Pre-KeepAlive HTTP always closed the connection. Newer clients and servers can keep it open.
The advantage of KeepAlive is that establishing a connection is expensive. For short requests and responses it may take more packets than the actual data exchange.
The slight disadvantage may be that the server now has to tell the client where the response ends. The server cannot simply send the response and close the connection. It has to tell the client: "read 20KB and that will be the end of my response". Thus the size of the response has to be known in advance by the server and communicated to the client as part of higher-level protocol (e.g. Content-Length: in HTTP). Alternatively, the server may send a delimiter to specify the end of the response - it all depends on the protocol above TCP.
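To illustrate the Content-Length approach, here is a minimal sketch of a server framing a response so that the client knows where it ends without the connection being closed. The function name build_response, the fixed 200 status, and the text/plain type are assumptions chosen for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write a complete HTTP/1.1 response (headers plus
 * body) into out. Returns the number of bytes written, or -1 if out is
 * too small. The result is raw bytes, not a NUL-terminated string. */
int build_response(char *out, size_t out_size,
                   const char *body, size_t body_len)
{
    assert(out != NULL && body != NULL);

    /* The body length is announced up front, so the client can tell
     * where this response ends and the next one begins. */
    int head = snprintf(out, out_size,
                        "HTTP/1.1 200 OK\r\n"
                        "Content-Type: text/plain\r\n"
                        "Content-Length: %zu\r\n"
                        "Connection: keep-alive\r\n"
                        "\r\n",
                        body_len);
    if (head < 0 || (size_t)head >= out_size ||
        body_len > out_size - (size_t)head)
        return -1;

    memcpy(out + head, body, body_len);
    return head + (int)body_len;
}
```

The client reads the headers, then exactly Content-Length bytes of body, and can send its next request on the same connection.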
You can understand it this way:
HTTP uses TCP as its transport. To send and receive data over TCP:
The client sends a connection request
The server responds
The data transfer takes place
The connection is closed
However, if we use the keep-alive feature, the connection is not closed after the data is received. The connection stays active.
This improves performance because subsequent calls skip connection establishment: the connection to the server is already there, so they take less time. Although the time taken to connect is small, it makes a big difference in systems where every millisecond counts.
I am trying to write a programme that will 'listen' to an application that is running on a port over TCP/IP.
When I point my browser to localhost:30003 , I get the output stream from the application printed to the screen. It would appear that the browser successfully 'listens' to the port.
What is happening here? Is my browser polling the application or is the application pushing tcp data which the browser picks up?
I am not sure whether to get this data I need to create a client or server instance.
One of the best ways to find out what is actually happening is to fire up Wireshark and follow the tcp stream.
http://www.wireshark.org/
Alternately, you can use something like TCP mon if you only care about the text, and none of the networking details.
http://ws.apache.org/commons/tcpmon/download.cgi
Based on the limited information in your question, the most likely thing is that the browser makes the TCP connection and you send back a malformed response. The browser assumes you are a broken site and does its best to adjust. If you aren't sending the correct HTTP header, it doesn't know what else to do, so it probably just puts the text on the screen.
Best way to know the details is with wireshark or tcpmon
Pointing the browser to localhost:30003 will cause it to open a connection to port 30003 on the localhost and send an HTTP request (beginning with "GET /") to request a web page from what it thinks is a web host. Whatever text your app sends upon receiving a connection is simply displayed by the web browser, as if it had received the contents of a text file from a web server.
When you write "localhost:30003" in your browser, a connection is established to whatever program is listening on port 30003 on your computer. The scheme prefix of the URL (HTTP by default) determines the protocol used by server and client. In this case the browser is the client, and your PC is the server it connects to.
If you want to do the same with your program you can set up a socket connection to your localhost using the same port 30003. Your program then becomes the client. Depending on the program (which you don't mention anything about) you may have more protocol options and would need to handle the protocol in your program.
An alternative is to use telnet to connect to your program but it depends on available protocols.
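A client of the kind described above could be sketched in C along these lines. The names connect_tcp and dump_stream are hypothetical, and the sketch assumes the application simply writes data to whoever connects and expects no request first, which the question does not confirm.

```c
#define _POSIX_C_SOURCE 200809L  /* for getaddrinfo under strict compilers */
#include <assert.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open a TCP connection to host:port.
 * Returns a connected socket descriptor, or -1 on failure. */
int connect_tcp(const char *host, unsigned short port)
{
    char service[6];                     /* "65535" plus terminating NUL */
    struct addrinfo hints, *res, *ai;
    int fd = -1;

    assert(host != NULL);
    snprintf(service, sizeof service, "%u", (unsigned)port);

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;         /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;     /* TCP */
    if (getaddrinfo(host, service, &hints, &res) != 0)
        return -1;

    /* Try each resolved address until one connects. */
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    return fd;
}

/* Copy whatever the application sends to stdout until it closes. */
void dump_stream(int fd)
{
    char buf[4096];
    ssize_t n;

    while ((n = recv(fd, buf, sizeof buf, 0)) > 0)
        fwrite(buf, 1, (size_t)n, stdout);
}
```

A main function would call connect_tcp("localhost", 30003), pass the descriptor to dump_stream() if it is not -1, and then close() it.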
I'm writing a simple HTTP server and learning about TIME_WAIT. How do real web servers in heavy environments handle requests from thousands of users without all the sockets getting stuck in TIME_WAIT after a request is handled? (Not asking about keep-alive -- that would help for a single client, but not for thousands of different clients coming through).
I've read that you try and get the client to close first, so that all the TIME_WAITs get spread out among all the clients, instead of concentrated on the server.
How is this done? At some point the server has to call close/closesocket.
The peer that initiates the active close is the one that goes into TIME_WAIT. So as long as the client closes the connection the client gets the TIME_WAIT and not the server. I go into this all in a little more detail in this blog posting. If you are unable to reach that link then the wayback machine has it.