I am working on a client/server application. I have ready many articles for this and found a very common statement that "Creation/deletion of socket is very expensive process in terms of using system resources". But no where it is explained how it is consumes so much resources.
Can anybody give glimpse view on this?
Creating socket is cheap. Connecting it actually creates the connection, which is more or less as expensive as creating the underlying connection, specially TCP connection. TCP connection establish requires the three-way TCP handshake steps. Keeping connections live costs mainly memory and connections. Network connections are a resource limited by the operation systems (for example number of sockets on a port).
If you are using thread model additional thread creation resources needed.
I could find a useful like to your answer "Network Programming: to maintain sockets or not?" on Stackoverflow. And a useful article Boost socket performance on Linux
I think helpful to you.
Related
Would it be naive to create a TCP socket with a listen backlog set to minimum as a way of rate limiting new incoming connections? The server workload in question doesn't expect many new connections at any time but spends a lot of time servicing long open persistent connections. It appears that new incoming connections shouldn't affect established connections, though I've been unable to find any definitive answer in any text. Is it possible for failed new incoming connections to create some kind of TCP traffic congestion on the server with the packets it's receiving or are they dropped fast enough that it has no effect on any buffers or other part of the network stack?
Specifically the platform in use is Linux, and although it may be handled differently in different OSs, I expect them to all behave roughly the same.
EDIT What I mean by the "same" is that backlog doesn't affect established connections, though I do understand Linux discards them while Windows sends a reset.
Does listen() backlog affect established TCP connections?
It affects established connections that the server hasn't accepted yet via accept(), only in the sense that it limits the number of such connections that can exist.
Would it be naive to create a TCP socket with a listen backlog set to minimum as a way of rate limiting new incoming connections?
All it would accomplish would be to unnecessarily fail some connecting clients. They won't get any service until your server gets around to it anyway, and once the backlog queue fills they are rate-limited by your service code anyway. There is no particular reason why shortening the queue would have any beneficial effect. The other problem with the idea is that it isn't readily possible to determine what the minimum actually is, or whether you succeeded in setting it as the backlog queue length.
It appears that new incoming connections shouldn't affect established connections, though I've been unable to find any definitive answer in any text.
That is correct. There is no reason why it should affect them: that's why you won't find it written down anywhere, any more than the fact that the phase of the moon doesn't affect it either.
Is it possible for failed new incoming connections to create some kind of TCP traffic congestion on the server with the packets it's receiving
No.
or are they dropped fast enough that it has no effect on any buffers or other part of the network stack?
They're not dropped. They simply aren't even created if they won't fit on the backlog queue. Ergo their resource consumption at the server is zero.
Specifically the platform in use is Linux, and although it may be handled differently in different OSs, I expect them to all behave roughly the same.
They don't. On Windows, an incoming connection when the backlog queue is full causes an RST to be issued. On other platforms it is simply ignored.
What you describe are several types of attacks like flooding, syn attacks and other goodies resulting in denial of service.
This topic is not easy, because protection has to be implemented in all the layers, including TCP. For instance a SYN attack, fiddling with the sequence numbers, ... . At that point the packet in question already came a long way, through the ethernet layer and ip layer, bottom line it is taking resources. So if your system is under attack, the attacking packets are in your data stream just like the good ones are. The faster you can detect a packet is faulty and drop it, the better. Usually a system that is under attack will be slower. Well at least the systems that I have worked with.
Some attacks try to bring your system in a faulty state permanently, this by exploiting bugs. For instance TCP has a receive queue, if packets are constantly arriving out of order they will be stored in that receive queue. If the missing packet never arrives, then this receive queue could keep on growing and growing. Without the proper defense , this would lead to the system going completely out of resources.
There are specialised tools (codenumicon for instance) to check the vulnerability of a TCP stack implementation. You can assume that the one on linux has been properly tested using similar tools.
An attack can also occur on the application layer. If you have a TCP server and it allows only a limited amount of sessions. A malicious user can simply take all the connections simply by establishing all the connections and then not doing anything with it. So you have to create some defense as well. Weather or not you set this limit very low or high does not change a thing. A malicious user will try anything to bring your system down. You need to built in defense anyway. You can connect to a webserver (HTTP) simply using telnet. If you don't send anything the server's defense will come into play and close the connection.
So bringing the amount of possible connections to a low value and thinking that this in itself is a form of protection is indeed naive.
Is it possible for failed new incoming connections to create some kind of TCP traffic congestion on the server with the packets it's receiving or are they dropped fast enough that it has no effect on any buffers or other part of the network stack?
They are using resources of your machine and will make your system run slower.
It appears that new incoming connections shouldn't affect established connections, though I've been unable to find any definitive answer in any text.
If it is normal user trying to establish a connection, even if he is doing it continuously, retrying upon failure. The influence will be minimal, close to nothing. But a malicious user that is flooding connections attempts will have influence on the system performance, because the system has to spent time identifying those flawed packets and dropping them asap.
I've trying to understand the concept of channels and connections in RabbitMQ, I understand it in a high-level, a connection is a real connection implemented as TCP socket to the broker, channels are virtual connections that use the sane real connection to communicate. So channels are multiplexed through the same connection.
However in a low-level how is this implemented, TCP sockets are non-blocking? I've read that using multiple connections don't increase performance why not ? When a channel uses the connection i imagine the calls are serialized right? So wouldn't multiple connections allow me to send and receive data faster.
I know I'm missing something here so that's why I'm asking for some clarification.
Thanks.
Whether a server or client uses non-blocking sockets is an implementation detail. An implementation that needs high performance probably uses non-blocking socket; but e.g. the RabbitMQ server uses the usual lightweight Erlang process model to achieve concurrency.
You are free to use multiple AMQP connections - though in most cases you should be good with one connection and multiple channels. TCP has a relatively high overhead, and multiplexing channels on top of a TCP connection reduces this overhead.
In order to avoid the time spent on the creation of the sockets.
My node server need some "long connection"(TCP Socket) to keep communicating with the server written in C which runs in the background, and all the 'http request' could share the TCP sockets in the pool.
I wonder if there is a kind of socket pool implementation in nodejs? (something like the database connection pool)
Any help will be appreciated !
generic pool is a good one. https://github.com/coopernurse/node-pool
But as for http, there's built-in pooling mechanism .
Take a look at the official documentation for http.Agent; that's what (behind the scenes for most node developers) handles allocation of available sockets.
I am making a unix ssl server/client. So far I have implemented FD_SET with select to handle all connections concurrently in one master server process. However due to __FD_SETSIZE the number of clients can only be 1024. I need to increase the number of clients and efficiency of the server. Changing the __FD_SETSIZE has potential problems (apparently?) so I am stuck.
So far the network includes: errno.h detection, signal detection -> atomic handling, fd_set -> select(), successful stream socket based communication.
I would really appreciate it if someone can tell me what should I do? do I fork() after 1024 (which presents its own problems, if its even doable?) do I implement threads to handle each client request, or just client data or both?
What is the best network architecture in your opinion? keep in mind its a socket stream based connection that is meant to handle as much punishment as possible and allowing as many clients to the server as possible.
Don't write your own production web server.
There are too many open source servers out there all written by people who know more about high connectivity and SSL than you do. They also have the advantage of being tested to a degree that you'd never be able to accomplish with your homebrew server.
I have this question asked in the Go mailing list, but I think it is more general to get better response from SO.
When work with Java/.Net platform, I never had to manage database connection manually as the drivers handle it. Now, when try to connect to a no sql db with very basic driver support, it is my responsibility to manage the connection. The driver let connect, close, reconnect to a tcp port, but not sure how should i manage it (see the link). Do i have to create a new connection for each db request? can I use other 3rd party connection pooling libraries?
thanks.
I don't know enough about MongoDB to answer this question directly, but do you know how MongoDB handles requests over TCP? For example, one problem with a single TCP connection can be that the db will handle each request serially, potentially causing high latency even though it may be bottlenecking on a single machine and could handle a higher capacity.
Are the machines all running on a local network? If so, the cost of opening a new connection won't be too high, and might even be insignificant from a performance perspective regardless.
My two cents: Do one TCP connection per request and just profile it and see what happens. It is very easy to add pooling later if you're DoSing yourself, but it may never be a problem. That'll work right now, and you won't have to mess around with a third party library that may cause more problems than it solves.
Also, TCP programming is really easy. Don't be intimidated by it, detecting a closed socket, and reconnecting synchronously or asynchronously is simple.
Most mongodb drivers (clients) will create and use a connection pool when connecting to the server. Each socket (connection) can do one operation at a time at the server; because of how data is read off the socket you can issue many requests and server will just get them one after another and return data as each one completes.
There is a Go mongo db driver but it doesn't seem to do connection pooling. http://github.com/mikejs/gomongo
In addition to the answers here: if you find you do need to do some kind of connection pooling redis.go is a decent example of a database driver that pools connections. Specifically, look at the Client.popCon and Client.pushCon methods in the source.