Per-socket data transfer limitations. Are download managers still useful? - sockets

The idea behind breaking a download into multiple segments with different byte ranges is to increase download speed. This works if the server imposes a per-connection limit; a server without that limitation should, in theory, serve the same bytes at the same overall rate whether one connection or several are used.
My question is whether download managers still speed up downloads from such a server, or whether splitting is just wasted effort. In other words, are there any per-TCP-connection limitations by default?

No. There are no limitations per socket. Most OSes will try to share the bandwidth equally between all sockets unless QoS is specified.

While a server could throttle bandwidth usage per connection, most don't bother. If a response is big enough to be worth throttling, then letting a fast download simply complete sooner has about the same impact on slower clients.
Splitting a download into pieces may actually hurt your client's performance because of the way TCP operates -- it has a "slow start" mechanism that reduces throughput on new connections.
Websites that implement throttling will typically do so between their various virtual hosts (so that the download site doesn't starve out a more interactive one) or will do so based on the remote IP address.
By far the primary benefit of a download manager is that it will simply continue the download if the connection gets broken.
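To make that concrete, here is a minimal sketch (mine, not from the answer above) of how a download manager can resume an interrupted transfer with a single HTTP Range request. The URL and filename are placeholders, and the server must support byte ranges (i.e. reply 206 Partial Content):

    import os
    import urllib.request

    def resume_download(url, dest, chunk_size=64 * 1024):
        """Resume a partial download with an HTTP Range request."""
        offset = os.path.getsize(dest) if os.path.exists(dest) else 0
        req = urllib.request.Request(url)
        if offset:
            # Ask only for the bytes we don't have yet.
            req.add_header("Range", f"bytes={offset}-")
        with urllib.request.urlopen(req) as resp:
            # 206 = the server honoured the range; anything else means it is
            # sending the whole body again, so overwrite rather than append.
            mode = "ab" if offset and resp.status == 206 else "wb"
            with open(dest, mode) as out:
                while True:
                    chunk = resp.read(chunk_size)
                    if not chunk:
                        break
                    out.write(chunk)

    # Hypothetical usage:
    # resume_download("https://example.com/big.iso", "big.iso")

No second connection is needed for this; a single connection picking up where the last one stopped gets the same job done.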

Does listen() backlog affect established TCP connections?

Would it be naive to create a TCP socket with a listen backlog set to minimum as a way of rate limiting new incoming connections? The server workload in question doesn't expect many new connections at any time but spends a lot of time servicing long open persistent connections. It appears that new incoming connections shouldn't affect established connections, though I've been unable to find any definitive answer in any text. Is it possible for failed new incoming connections to create some kind of TCP traffic congestion on the server with the packets it's receiving or are they dropped fast enough that it has no effect on any buffers or other part of the network stack?
Specifically the platform in use is Linux, and although it may be handled differently in different OSs, I expect them to all behave roughly the same.
EDIT What I mean by the "same" is that the backlog doesn't affect established connections, though I do understand that when the backlog is full, Linux discards new connection attempts while Windows sends a reset.
Does listen() backlog affect established TCP connections?
It affects established connections that the server hasn't accepted yet via accept(), only in the sense that it limits the number of such connections that can exist.
Would it be naive to create a TCP socket with a listen backlog set to minimum as a way of rate limiting new incoming connections?
All it would accomplish is to fail some connecting clients unnecessarily. They won't get any service until your server gets around to them, and once the backlog queue fills they are effectively rate-limited by your service code anyway, so there is no particular reason why shortening the queue would have any beneficial effect. The other problem with the idea is that it isn't readily possible to determine what the minimum actually is, or whether the value you asked for was actually applied as the backlog queue length.
It appears that new incoming connections shouldn't affect established connections, though I've been unable to find any definitive answer in any text.
That is correct. There is no reason why it should affect them, which is why you won't find it written down anywhere, any more than you'll find it written down that the phase of the moon doesn't affect them either.
Is it possible for failed new incoming connections to create some kind of TCP traffic congestion on the server with the packets it's receiving
No.
or are they dropped fast enough that it has no effect on any buffers or other part of the network stack?
They're not dropped. They simply aren't even created if they won't fit on the backlog queue. Ergo their resource consumption at the server is zero.
Specifically the platform in use is Linux, and although it may be handled differently in different OSs, I expect them to all behave roughly the same.
They don't. On Windows, an incoming connection when the backlog queue is full causes an RST to be issued. On other platforms it is simply ignored.
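To make the backlog point concrete, here is a small Linux-flavoured sketch (mine, not from the answer; the port is arbitrary). The value passed to listen() is only a hint: the kernel caps it at net.core.somaxconn and never reports back what it actually applied, which is why you can't reliably "set it to minimum" or verify the result:

    import socket

    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("0.0.0.0", 8080))   # port chosen arbitrarily for the sketch
    srv.listen(1)                 # "minimum" backlog -- still only a hint

    # The effective ceiling lives in the kernel, not in the socket:
    with open("/proc/sys/net/core/somaxconn") as f:
        print("net.core.somaxconn =", f.read().strip())

    # Connections beyond the backlog are not serviced more slowly; they are
    # simply not completed until accept() drains the queue (or, on Windows,
    # they are reset).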
What you describe covers several types of attack, such as flooding, SYN attacks and other tricks that result in denial of service.
This topic is not easy, because protection has to be implemented at every layer, including TCP: against a SYN attack, against fiddling with the sequence numbers, ... . By the time such a packet can be inspected it has already come a long way through the Ethernet and IP layers, so it is already consuming resources. If your system is under attack, the attacking packets sit in your data stream just like the good ones, and the faster you can detect that a packet is bogus and drop it, the better. In my experience, a system that is under attack will usually run slower.
Some attacks try to put your system into a permanently faulty state by exploiting bugs. For instance, TCP has a receive queue: if packets keep arriving out of order they are stored in that queue, and if the missing packet never arrives the queue can keep growing. Without a proper defense, this would eventually exhaust the system's resources.
There are specialised tools (Codenomicon, for instance) for checking the vulnerability of a TCP stack implementation, and you can assume that the Linux stack has been properly tested with similar tools.
An attack can also occur at the application layer. If you have a TCP server that allows only a limited number of sessions, a malicious user can take all of them simply by establishing every connection and then doing nothing with them, so you have to build in some defense there as well. Whether you set the limit very low or very high does not change a thing: a malicious user will try anything to bring your system down, and you need to build in defenses regardless. You can connect to a web server (HTTP) using plain telnet; if you then send nothing, the server's defense comes into play and closes the connection.
So bringing the number of possible connections down to a low value and thinking that this in itself is a form of protection is indeed naive.
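As a rough illustration of the application-layer defense just mentioned, here is a sketch (my own, assuming a plain TCP server; the port and timeout are arbitrary) that simply drops connections which have sent nothing for a while:

    import selectors
    import socket
    import time

    IDLE_TIMEOUT = 10.0   # seconds; arbitrary value for the sketch

    sel = selectors.DefaultSelector()
    last_seen = {}        # socket -> time of last received data

    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("0.0.0.0", 8081))
    srv.listen(128)
    srv.setblocking(False)
    sel.register(srv, selectors.EVENT_READ)

    while True:
        for key, _ in sel.select(timeout=1.0):
            if key.fileobj is srv:
                conn, _addr = srv.accept()
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ)
                last_seen[conn] = time.monotonic()
            else:
                conn = key.fileobj
                data = conn.recv(4096)
                if data:
                    last_seen[conn] = time.monotonic()
                    # ... handle the request data here ...
                else:                      # peer closed the connection
                    sel.unregister(conn)
                    conn.close()
                    last_seen.pop(conn, None)
        # Reap connections that have sent nothing for too long.
        now = time.monotonic()
        for conn in [c for c, t in last_seen.items() if now - t > IDLE_TIMEOUT]:
            sel.unregister(conn)
            conn.close()
            del last_seen[conn]

A telnet session that connects and then stays silent is closed after IDLE_TIMEOUT seconds, so it cannot hold a session slot indefinitely.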
Is it possible for failed new incoming connections to create some kind of TCP traffic congestion on the server with the packets it's receiving or are they dropped fast enough that it has no effect on any buffers or other part of the network stack?
They use resources on your machine and will make your system run slower.
It appears that new incoming connections shouldn't affect established connections, though I've been unable to find any definitive answer in any text.
If it is a normal user trying to establish a connection, even continuously and retrying upon failure, the influence will be minimal, close to nothing. But a malicious user flooding connection attempts will affect system performance, because the system has to spend time identifying those flawed packets and dropping them as soon as possible.

UDP packet loss at the OS level: Windows and Linux, but only with many clients running

I've been investigating an issue where we have a single UDP server sending to multiple clients. The server is sending data out on a multicast channel and port. The clients are running on the same machine and each client opens a socket to the same port as every other client.
We stagger the start of the clients. When we reach a certain number of clients, say 10, we start seeing packet drops. We've eliminated the NIC as the issue using various monitoring tools and the socket buffer size is several times larger than the message size. Our sending interval is quite large (five seconds) and the clients do nothing with the data so the rate of consumption is a non-factor. As the title says we've reproduced the issue on both Windows Server 2008 and Linux (not sure about the version).
Our current theory is that the 10th client puts too much load on the OS which is copying all this data to each socket. The thing is we're only sending 500,000 bytes every five seconds, which doesn't seem like much at all.
Mostly I'm posting here in the hopes that someone has seen a similar problem. I was pointed to this hotfix in my search but it did not solve the issue. Any resources for investigating the details of the OS internals which handle network traffic would be appreciated. Unfortunately I lack this kind of domain knowledge and it has been difficult to find good and detailed reading material on the subject.
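For whoever hits something similar, here is a minimal receiver sketch (not the poster's code; group and port are made up) showing the settings usually worth double-checking in this kind of setup: SO_REUSEADDR so several clients on one host can bind the same port, the multicast group join, and reading back the receive buffer the kernel actually granted, since the OS may silently cap the requested size (net.core.rmem_max on Linux):

    import socket
    import struct

    GROUP, PORT = "239.1.1.1", 5000   # hypothetical multicast group/port

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Several receivers on the same host can bind the same port only with
    # SO_REUSEADDR (and/or SO_REUSEPORT where available).
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

    # Ask for a large kernel receive buffer, then read back what was granted.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 4 * 1024 * 1024)
    print("effective SO_RCVBUF:",
          sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))

    sock.bind(("", PORT))

    # Join the multicast group on all interfaces (struct ip_mreq).
    mreq = struct.pack("4s4s", socket.inet_aton(GROUP), socket.inet_aton("0.0.0.0"))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

    while True:
        data, addr = sock.recvfrom(65535)
        print(len(data), "bytes from", addr)

If the effective SO_RCVBUF that gets printed is much smaller than what was requested, the per-socket buffer cap is one place where drops that appear only past a certain client count can come from.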

How do I write my own production web server?

I am making a Unix SSL server/client. So far I have implemented FD_SET with select() to handle all connections concurrently in one master server process. However, due to __FD_SETSIZE, the number of clients is capped at 1024. I need to increase both the number of clients and the efficiency of the server. Changing __FD_SETSIZE has potential problems (apparently?), so I am stuck.
So far the networking code includes: errno.h error checking, signal detection -> atomic handling, fd_set -> select(), and working stream-socket-based communication.
I would really appreciate it if someone could tell me what I should do. Do I fork() after 1024 clients (which presents its own problems, if it's even doable)? Do I use threads to handle each client request, or just the client data, or both?
What is the best network architecture in your opinion? Keep in mind it's a stream-socket-based connection that is meant to take as much punishment as possible and allow as many clients as possible to connect to the server.
Don't write your own production web server.
There are plenty of open source servers out there, all written by people who know more about high connectivity and SSL than you do. They also have the advantage of having been tested to a degree that you'd never be able to match with a homebrew server.
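That said, the 1024 ceiling in the question comes from select()'s fixed-size fd_set, not from the OS; readiness APIs such as epoll (Linux) or kqueue (BSD) have no compile-time cap, and the real per-process limit is the number of file descriptors. A rough sketch of that idea (written in Python here for brevity, with an arbitrary port; a real server would also handle partial sends and TLS):

    import resource
    import select
    import socket

    # Raise the soft file-descriptor limit up to the hard limit.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))

    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("0.0.0.0", 8443))
    srv.listen(1024)
    srv.setblocking(False)

    ep = select.epoll()                 # no FD_SETSIZE-style limit
    ep.register(srv.fileno(), select.EPOLLIN)
    conns = {}

    while True:
        for fd, events in ep.poll():
            if fd == srv.fileno():
                conn, _ = srv.accept()
                conn.setblocking(False)
                conns[conn.fileno()] = conn
                ep.register(conn.fileno(), select.EPOLLIN)
            else:
                conn = conns[fd]
                data = conn.recv(4096)
                if data:
                    conn.send(data)     # naive echo; ignores partial sends
                else:
                    ep.unregister(fd)
                    conn.close()
                    del conns[fd]

But the advice above stands: the existing open source servers already do this, plus everything else you haven't thought of yet.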

server push for millions of concurrent connections

I am building a distributed system that consists of potentially millions of clients which all need to keep an open (preferably HTTP) connection to wait for a command from the server (which is running somewhere else). The load of messages / commands will not be very high, maybe one message / sec / 1000 clients, which means 1000 msg/sec @ 1 million clients. => it's basically about the concurrent connections.
The requirements are simple too: one-way messaging (server->client), only 1 client per "channel".
I am pretty open in terms of technology (xmpp / websockets / comet / ...). I am using Google App Engine as the server, but their "channels" won't work for me unfortunately (the quotas are too low and there is no Java client). XMPP was an option but is quite expensive. So far I have been using URL Fetch & pubnub, but they just started charging for connections (big time).
So:
Does anyone know of a service out there that can do that for me in an affordable way? Most that I have found either restrict connections or charge heavily for them.
Any experience with implementing such a server yourself? I have actually done that already and it works pretty well (based on Tomcat & NIO), but I haven't had the time yet to set up a large load-test environment (partially because this is still a fallback solution; I'd prefer a battle-hardened msg server). Any experience of how many users you get per GB? Any hard limits?
My architecture also allows me to split the msg servers into shards, but I'd like to maximize the concurrent connections per server because the CPU overhead of message processing is minimal.
I have meanwhile implemented my own message server using netty.io. Netty makes use of Java NIO and scales extremely well. For idle connections I get a memory footprint of 500 bytes per connection. I am doing only very simple message forwarding (no caching, storage or other fancy stuff), but with that I am easily getting 1000-1500 msg/sec (each half a KB) on a small Amazon instance (1 ECU / 1.6 GB).
Otherwise, if you are looking for a (paid) service, I can recommend spire.io (they do not charge for connections but have a higher price per message) or pubnub (they do charge for connections but are cheaper per message).
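This is not the netty server described above, just a small asyncio sketch (port and push rate made up) of the same pattern: hold many mostly-idle connections and push the occasional message from the server side:

    import asyncio

    clients = set()

    async def handle(reader, writer):
        clients.add(writer)
        try:
            await reader.read()          # block until the client disconnects
        finally:
            clients.discard(writer)
            writer.close()

    async def push_loop():
        while True:
            await asyncio.sleep(1)       # roughly one push per second
            for w in list(clients):      # snapshot: the set may change mid-loop
                try:
                    w.write(b"ping\n")
                    await w.drain()
                except ConnectionError:
                    clients.discard(w)

    async def main():
        server = await asyncio.start_server(handle, "0.0.0.0", 9000)
        pusher = asyncio.create_task(push_loop())  # keep a reference alive
        async with server:
            await server.serve_forever()

    asyncio.run(main())

The per-connection footprint is dominated by the reader/writer objects and kernel socket buffers rather than threads, which is what makes this sort of design scale to large numbers of idle connections.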
You have to look more at the architecture of such an environment.
First of all, if you are going to write the socket management yourself, don't use a thread per client socket. Use asynchronous methods for receiving and sending data.
WebSockets might be too heavy if your messages are small, because the protocol implements framing, which has to be applied to each message on each socket individually (caching can be used for the different versions of the WebSocket protocol). That makes them slower to process in both directions, receive and send, especially because of data masking.
It is possible to create millions of sockets, but only the most advanced technologies are capable of doing so. Erlang, for example, can handle millions of connections and is pretty scalable.
If you would like to have millions of connections using other, higher-level technologies, then you need to think about clustering whatever you are trying to accomplish.
For example, use a gateway server that keeps track of all the processing servers and holds their data (IP, port, load); if it is all one internal network, firewalling and port forwarding might be handy here.
The client software connects to that gateway server, the gateway picks the least-loaded worker and sends its IP and port back, and the client then creates a connection directly to that worker using the provided address.
That way you have a gateway that can also handle authorization and won't hold connections for long, so one gateway might be enough, plus many workers that publish the data and keep the connections open.
This depends very much on your needs and might not be suitable for your solution.
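A toy sketch of that gateway idea (my own interpretation; addresses, port and load numbers are made up): the gateway tracks worker load, answers each client with the least-loaded worker's address, and closes the connection straight away so it never holds sockets for long:

    import json
    import socket

    workers = {
        ("10.0.0.11", 9000): 120,   # worker address -> current connection count
        ("10.0.0.12", 9000): 87,
        ("10.0.0.13", 9000): 143,
    }

    def least_loaded():
        return min(workers, key=workers.get)

    gw = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    gw.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    gw.bind(("0.0.0.0", 8000))
    gw.listen(128)

    while True:
        conn, _ = gw.accept()
        host, port = least_loaded()
        # Hand back a worker address and drop the connection immediately;
        # the client reconnects directly to the worker it was given.
        conn.sendall(json.dumps({"host": host, "port": port}).encode() + b"\n")
        conn.close()
        workers[(host, port)] += 1   # real code would use load reports from workers

In a real deployment the gateway would also handle authorization and receive periodic load reports back from the workers, as described above.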

What is the practical / hard limit on socket connections per server

I have a number of client devices that open socket connections to a service running on a Windows 2008 R2 server. I'm wondering what the hard limit is on the number of concurrent client connections.
According to this article, one hard limit is (was) 16,777,214. The practical limit also depends on your application: for example, if you create a thread per connection, then the practical limit comes more from the limit on the number of threads than from the network stack. There is also a limit on the number of handles any process may have, and so on.
Assuming you select a sensible architecture for your server, the limit will be memory and CPU related. IMHO you'll never reach the hard limit that Martin mentions :)
So, rather than worrying about a theoretical limit that you'll never hit, you should, IMHO, be thinking about how you will design your application and how you will test it to determine the current maximum number of client connections that you can maintain for your application on given hardware. The important thing for me is to run your perf tests from Day 0 (see here for a blog posting where I explain this). Modern operating systems and hardware allow you to build very scalable systems, but simple day-to-day coding and design mistakes can easily squander that scalability, so you simply MUST run perf tests all the time so that you know when you are building road blocks into your performance. You simply cannot go back and fix these kinds of mistakes at the end of the project.
As an aside, I ran some tests on Windows 2003 Server with a low-spec VM and easily achieved more than 70,000 concurrent and active connections with a simple server using an overlapped I/O (I/O completion port) design. See this answer for more details.
My personal approach would be to get a shell of a server put together quickly using whatever technology you decide on (I favour unmanaged C++ using I/O completion ports and minimal threads), see this blog posting for more details. Then build a client or series of clients that can stress test the application, and keep updating and running the test clients as you implement your server logic. You would expect to see a gradually declining curve of maximum concurrent clients as you add more complexity to your server; large drops in scalability should cause you to examine the latest check-ins for unfortunate design decisions.
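In the spirit of testing from Day 0, here is a bare-bones load-test client sketch (mine, not from the answer; host, port and target count are placeholders). It opens as many concurrent connections to the server under test as it can and holds them idle. Note that a single client box eventually runs out of ephemeral ports to one destination, so real tests are usually spread over several client machines:

    import socket
    import time

    HOST, PORT, TARGET = "192.168.1.50", 9000, 50_000   # placeholders

    conns = []
    try:
        for _ in range(TARGET):
            conns.append(socket.create_connection((HOST, PORT), timeout=5))
            if len(conns) % 1000 == 0:
                print(len(conns), "connections open")
    except OSError as exc:
        # Typically EMFILE (client fd limit), ephemeral port exhaustion, or
        # refused/timed-out connects once the server stops keeping up.
        print("stopped at", len(conns), "connections:", exc)

    print("holding", len(conns), "connections; Ctrl+C to release them")
    try:
        while True:
            time.sleep(60)
    except KeyboardInterrupt:
        for s in conns:
            s.close()

Rerunning a client like this after every significant check-in is a cheap way to spot the scalability regressions described above as soon as they appear.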