.Net 4.5 TCP Server scale to thousands of connected clients - sockets

I need to build a TCP server using C# .NET 4.5+. It must comfortably handle at least 3,000 connected clients, each sending a message every 10 seconds, with message sizes from 250 to 500 bytes.
The data will be offloaded to another process or queue for batch processing and logging.
I also need to be able to select an existing client and exchange messages (greater than 500 bytes) with it from within a Windows Forms application.
I have not built an application like this before so my knowledge is based on the various questions, examples and documentation that I have found online.
My conclusion is:
1. Non-blocking async is the way to go. Stay away from creating multiple threads and blocking IO.
2. SocketAsyncEventArgs - Is complex and really only needed for very large systems, BTW what constitutes a very large system? :-)
3. BeginXXX methods will suffice (APM).
4. Using TAP I can simplify 3. by using Task.Factory.FromAsync, but it only produces the same outcome.
5. Use a global collection to keep track of the connected TCP clients.
What I am unsure about:
1. Should I use a ManualResetEvent when interacting with the TCP Client collection? I presume the async events will need to lock access to this collection.
2. Best way to detect a disconnected client after I have called BeginReceive. I've found the call is stuck waiting for a response so this needs to be cleaned up.
3. Sending messages to a specific TCP Client. I'm thinking function in custom TCP session class to send a message. Again in an async model, would I need to create a timer based process that inspects a message queue or would I create an event on a TCP Session class that has access to the TcpClient and associated stream? Really interested in opinions here.
4. I'd like to use a thread for the entire service and use non-blocking principles within, are there any things I should be mindful of, especially in the context of 1. ManualResetEvent etc.?
Thank you for reading. I am keen to hear constructive thoughts and/or links to best practices/examples. It's been a while since I've coded in C# so apologies if some of my questions are obvious. Tasks, async/await are new to me! :-)

I need to build a TCP server using C# .NET 4.5+
Well, the first thing to determine is whether it has to be bare-bones TCP/IP. If you possibly can, write one that uses a higher-level abstraction, like SignalR or WebAPI. If you can write one using WebSockets (SignalR), then do that and never look back.
Your conclusions sound pretty good. Just a few notes:
SocketAsyncEventArgs - Is complex and really only needed for very large systems, BTW what constitutes a very large system? :-)
It's not so much a "large" system in the terms of number of connections. It's more a question of how much traffic is in the system - the number of reads/writes per second.
The only thing that SocketAsyncEventArgs does is make your I/O structures reusable. The Begin*/End* (APM) APIs will create a new IAsyncResult for each I/O operation, and this can cause pressure on the garbage collector. SocketAsyncEventArgs is essentially the same as IAsyncResult, only it's reusable. Note that there are some examples on the 'net that use the SocketAsyncEventArgs APIs without reusing the SocketAsyncEventArgs structures, which is completely ridiculous.
And there are no hard guidelines here: heavier hardware will be able to use the APM APIs for much more traffic. As a general rule, you should build a bare-bones APM server and load test it first, and only move to SAEA if it doesn't work on your target server's hardware.
On to the questions:
Should I use a ManualResetEvent when interacting with the TCP Client collection? I presume the async events will need to lock access to this collection.
If you're using TAP-based wrappers, then await will resume on a captured context by default. I explain this in my blog post on async/await.
There are a couple of approaches you can take here. I have successfully written a reliable and performant single-threaded TCP/IP server; the equivalent for modern code would be to use something like my AsyncContextThread class. It provides a context that will cause await to resume on that same thread by default.
The nice thing about single-threaded servers is that there's only one thread, so no synchronization or coordination is necessary. However, I'm not sure how well a single-threaded server would scale. You may want to give that a try and see how much load it can take.
If you do find you need multiple threads, then you can just use async methods on the thread pool; await will not have a captured context and so will resume on a thread pool thread. In this case, yes, you'd need to coordinate access to any shared data structures including your TCP client collection.
Note that SignalR will handle all of this for you. :)
Best way to detect a disconnected client after I have called BeginReceive. I've found the call is stuck waiting for a response so this needs to be cleaned up.
This is the half-open problem, which I discuss in detail on my blog. The best way (IMO) to solve this is to periodically send a "noop" keepalive message to each client.
If modifying the protocol isn't possible, then the next-best solution is to just close the connection after a no-communication timeout. This is how HTTP "persistent"/"keep-alive" connections decide to close. There's another possible solution (changing the keepalive packet settings on the socket), but it's not as easy (requires P/Invoke) and has other problems (not always respected by routers, not supported by all OS TCP/IP stacks, etc).
Oh, and SignalR will handle this for you. :)
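To make the no-communication timeout concrete, here is a minimal sketch. It uses Python's asyncio purely for brevity (the idea carries over to async/await in C#); `read_until_idle` and `IDLE_TIMEOUT` are hypothetical names, and the 0.2 s value is only for illustration, a real server would more likely use tens of seconds.

```python
import asyncio

IDLE_TIMEOUT = 0.2  # seconds; illustrative value only


async def read_until_idle(reader: asyncio.StreamReader,
                          timeout: float = IDLE_TIMEOUT) -> str:
    """Read until the peer closes or stays silent for `timeout` seconds."""
    while True:
        try:
            # Every read gets its own deadline, so any traffic
            # (including a no-op keepalive message) resets the timer.
            data = await asyncio.wait_for(reader.read(1024), timeout)
        except asyncio.TimeoutError:
            return "idle-timeout"    # silent or half-open peer: caller should close
        if not data:
            return "closed-by-peer"  # orderly shutdown (EOF)
        # A real handler would process `data` here.
```

If the clients send a keepalive message more often than the timeout, a healthy connection never hits the "idle-timeout" branch, while a half-open one is cleaned up after at most one timeout period.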
Sending messages to a specific TCP Client. I'm thinking function in custom TCP session class to send a message. Again in an async model, would I need to create a timer based process that inspects a message queue or would I create an event on a TCP Session class that has access to the TcpClient and associated stream? Really interested in opinions here.
If your server can send messages to any client (i.e., it's not just a request/response protocol; any part of the server can send messages to any client without the client requesting an update), then yes, you'll need a proper queue of outgoing requests because you can't (reliably) issue multiple concurrent writes on a socket. I wouldn't have the consumer be timer-based, though; there are async-compatible producer/consumer queues available (like BufferBlock<T> from TPL Dataflow, and it's not that hard to write one if you have async-compatible locks and condition variables).
Oh, and SignalR will handle this for you. :)
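As a rough sketch of the queue-per-connection idea (in Python's asyncio for brevity, not BufferBlock<T> itself): any part of the server enqueues, and exactly one consumer per connection performs the writes, so they never overlap and no timer is needed. The `Session` class and the `write` callable are hypothetical stand-ins for your own session type and socket write.

```python
import asyncio


class Session:
    """Hypothetical per-client session: many producers, exactly one writer."""

    def __init__(self, write):
        self._write = write              # async callable doing the real socket write
        self._outgoing = asyncio.Queue()

    def send(self, message: bytes) -> None:
        # Callable from any coroutine on the loop; it never touches the socket.
        self._outgoing.put_nowait(message)

    def close(self) -> None:
        self._outgoing.put_nowait(None)  # sentinel that stops the writer

    async def run_writer(self) -> None:
        # The only coroutine that writes, so writes are strictly sequential.
        while True:
            message = await self._outgoing.get()  # waits without polling
            if message is None:
                return
            await self._write(message)
```

The writer wakes up only when something is queued, which is the async-compatible producer/consumer behaviour described above.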
I'd like to use a thread for the entire service and use non-blocking principals within, are there anythings I should be mindful of espcially in context of 1. ManualResetEvent etc..
If your entire service is single-threaded, then you shouldn't need any coordination primitives at all. However, if you do use the thread pool instead of syncing back to the main thread (for scalability reasons), then you will need to coordinate. I have a coordination primitives library that you may find useful because its types have both synchronous and asynchronous APIs. This allows, e.g., one method to block on a lock while another method wants to asynchronously block on a lock.
You may have noticed a recurring theme around SignalR. Use it if you possibly can! If you have to write a bare-bones TCP/IP server and can't use SignalR, then take your initial time estimate and triple it. Seriously. Then you can get started down the path of painful TCP with my TCP/IP FAQ blog series.

Related

What happens to messages that come to a server that implements stream processing after the source has reached its bound?

I'm learning Akka Streams, but obviously this is relevant to any streaming framework :)
Quoting the Akka documentation:
Reactive Streams is just to define a common mechanism of how to move
data across an asynchronous boundary without losses, buffering or
resource exhaustion
Now, from what I understand, before streams (take an HTTP server for example), when a request arrived while the receiver was not yet finished with the previous one, the new requests were collected in a buffer of waiting requests. The problem is that this buffer has an unknown size, and at some point, if the server is overloaded, we can lose requests that were waiting.
So then stream processing came into play and bounded this buffer to make it controllable, so we can predefine the number of messages (requests in my example) we want to have in line and take care of them one at a time.
My question: if we specify that a source in our server can hold 3 messages at most, what happens to the 4th when it arrives?
I mean, when another server calls us and we are already taking care of 3 requests, what will happen to its request?
What you're describing is not actually the main problem that Reactive Streams implementations solve.
Backpressure in terms of the number of requests is solved with regular networking tools. For example, in Java you can configure a thread pool of a networking library (for example Netty) to some parallelism level, and the library will take care of accepting as many requests as possible. Or, if you use a synchronous sockets API, it is even simpler - you can postpone calling accept() on the server socket until all of the currently connected clients are served. In either case, there is no "buffer" on either side; it's just that until the server accepts a connection, the client will be blocked (either inside a system call for blocking APIs, or in an event loop for async APIs).
What Reactive Streams implementations solve is how to handle backpressure inside a higher-level data pipeline. Reactive streams implementations (e.g. akka-streams) provide a way to construct a pipeline of data in which, when the consumer of the data is slow, the producer will slow down automatically as well, and this would work across any kind of underlying transport, be it HTTP, WebSockets, raw TCP connections or even in-process messaging.
For example, consider a simple WebSocket connection, where the client sends a continuous stream of information (e.g. data from some sensor), and the server writes this data to some database. Now suppose that the database on the server side becomes slow for some reason (networking problems, disk overload, whatever). The server now can't keep up with the data the client sends, that is, it cannot save it to the database in time before the new piece of data arrives. If you're using a reactive streams implementation throughout this pipeline, the server will signal to the client automatically that it cannot process more data, and the client will automatically tweak its rate of producing in order not to overload the server.
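The slow-consumer scenario can be imitated in-process with a bounded queue. This is only a sketch of the idea in Python's asyncio, not the akka-streams or Reactive Streams API; the names and the bound of 3 are made up for illustration. When the queue is full, the producer is suspended rather than the extra element being dropped.

```python
import asyncio


async def producer(queue: asyncio.Queue, n: int, max_seen: list) -> None:
    for i in range(n):
        # put() suspends while the queue is full: this is the backpressure.
        await queue.put(i)
        max_seen[0] = max(max_seen[0], queue.qsize())
    await queue.put(None)            # end-of-stream marker


async def slow_consumer(queue: asyncio.Queue, out: list) -> None:
    while True:
        item = await queue.get()
        if item is None:
            return
        await asyncio.sleep(0.001)   # stand-in for a slow database write
        out.append(item)


async def run_pipeline(n: int = 20, bound: int = 3):
    queue = asyncio.Queue(maxsize=bound)
    out, max_seen = [], [0]
    await asyncio.gather(producer(queue, n, max_seen),
                         slow_consumer(queue, out))
    return out, max_seen[0]
```

Running `asyncio.run(run_pipeline())` delivers every element in order while the queue never holds more than `bound` items, because the producer simply runs at the consumer's pace.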
Naturally, this can be done without any Reactive Streams implementation, e.g. by manually controlling acknowledgements. However, like with many other libraries, Reactive Streams implementations solve this problem for you. They also provide an easy way to define such pipelines, and usually they have interfaces for various external systems like databases. In particular, such libraries may implement backpressure on the lowest level, down to the TCP connection, which may be hard to do manually.
As for Reactive Streams itself, it is just a description of an API which can be implemented by a library, which defines common terms and behavior and allows such libraries to be interchangeable or to interact easily, e.g. you can connect an akka-streams pipeline to a Monix pipeline using the interfaces from the specification, and the combined pipeline will work seamlessly and support all of the backpressure features of Reactive Streams.

WebSocket/REST: Client connections?

I understand the main principles behind both. I have however a thought which I can't answer.
Benchmarks show that WebSockets can serve more messages as this website shows: http://blog.arungupta.me/rest-vs-websocket-comparison-benchmarks/
This makes sense, as it states that the connections do not have to be closed and reopened, and the HTTP headers etc. do not have to be resent.
My question is, what if the connections are always from different clients all the time (and perhaps maybe some from the same client). The benchmark suggests it's the same clients connecting from what I understand, which would make sense keeping a constant connection.
If a user only makes a request every minute or so, would it not be beneficial for the communication to run over REST instead of WebSockets, as the server frees up sockets and can handle a larger crowd, so to speak?
To fix the issue of REST you would go by vertical scaling, and WebSockets would be horizontal?
Does this make sense or am I out of it?
This is my experience so far, I am happy to discuss my conclusions about using WebSockets in big applications approached with CQRS:
Real Time Apps
Are you creating a financial application, game, chat or whatever kind of application that needs low latency, frequent, bidirectional communication? Go with WebSockets:
Well supported.
Standard.
You can use either publisher/subscriber model or request/response model (by creating a correlationId with each request and subscribing once to it).
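The request/response model from the last point can be sketched roughly as follows. This is a minimal Python asyncio sketch; `Correlator`, the `send` callable and the JSON field layout are made up for illustration and are not part of any WebSocket library.

```python
import asyncio
import itertools
import json


class Correlator:
    """Matches replies to requests on one bidirectional message channel."""

    def __init__(self, send):
        self._send = send                 # async callable that ships one text frame
        self._ids = itertools.count(1)
        self._pending = {}                # correlationId -> Future awaiting the reply

    async def request(self, payload, timeout: float = 1.0):
        cid = next(self._ids)
        fut = asyncio.get_running_loop().create_future()
        self._pending[cid] = fut          # subscribe once to this id
        try:
            await self._send(json.dumps({"correlationId": cid,
                                         "payload": payload}))
            # Raises asyncio.TimeoutError if no reply arrives in time.
            return await asyncio.wait_for(fut, timeout)
        finally:
            self._pending.pop(cid, None)

    def on_message(self, raw: str) -> None:
        # Called for every incoming frame; unknown or late ids are ignored.
        msg = json.loads(raw)
        fut = self._pending.get(msg.get("correlationId"))
        if fut is not None and not fut.done():
            fut.set_result(msg["payload"])
```

Replies may arrive in any order; each one completes only the request that carries the same id, and a request whose reply never arrives ends with a timeout.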
Small size apps
Do you need push communication and/or pub/sub in your client and your application is not too big? Go with WebSockets. Probably there is no point in complicating things further.
Regular Apps with some degree of high load expected
If you do not need to send commands very fast, and you expect to do far more reads than writes, you should expose a REST API to perform CRUD (create, read, update, delete), especially C_UD.
Not all devices prefer WebSockets. For example, mobile devices may prefer to use REST, since maintaining a WebSocket connection may prevent the device from saving battery.
You expect an outcome, even if it is a time out. Even when you can do request/response in WebSockets using a correlationId, still the response is not guaranteed. When you send a command to the system, you need to know if the system has accepted it. Yes you can implement your own logic and achieve the same effect, but what I mean, is that an HTTP request has the semantics you need to send a command.
Does your application send commands very often? You should strive for chunky communication rather than chatty, so you should probably batch those change requests.
You should then expose a WebSocket endpoint to subscribe to specific topics, and to perform low latency query-response, like filling autocomplete boxes, checking for unique items (eg: usernames) or any kind of search in your read model. Also to get notification on when a change request (write) was actually processed and completed.
What I am doing in a pet project is to place the WebSocket endpoint in the read model; on connection, the server gives a connectionID to the client via WebSocket. When the client performs an operation via REST, it includes an optional parameter that indicates "when done, notify me through this connectionID". The REST server returns saying whether the command was sent correctly to a service bus. A queue consumer processes the command, and when done (successfully or not), if the command had a notification request, another message is placed in a "web notification queue" indicating the outcome of the command and the connectionID to be notified. The read model is subscribed to this queue, gets the messages and forwards them to the appropriate WebSocket connection.
However, if your REST API is going to be consumed by non-browser clients, you may want to offer a way to check for the completion of a command using the async REST approach: https://www.adayinthelifeof.nl/2011/06/02/asynchronous-operations-in-rest/
I know it is quite appealing to have a low-latency UP channel available to send commands, but if you do, your overall architecture gets messed up. For example, if you are using a CQRS architecture, where is your WebSocket endpoint? In the read model or in the write model?
If you place it on the read model, then you can easily access your read DB to answer fast search queries, but then you have to somehow couple in the logic to process commands, making the read model responsible for sending the commands to the write model and for notifying the client if it is unable to do so.
If you place it on the write model, then it is easy to place commands, but then you need access to your read model and read DB if you want to answer search queries through the WebSocket.
By considering WebSockets part of your read model and leaving command processing to the REST interface, you keep your loose coupling between your read model and your write model.

Server-side Websocket implementations in non-event driven HTTP Server Environments

I am trying to understand implementations/options for server-side Websocket endpoints - particularly in Perl using PSGI/Plack and I have a question: Why are all server-side websocket implementations based around event-driven PSGI servers (Twiggy, Tatsumaki, etc.)?
I get that websocket communication is asynchronous, but a non-event driven PSGI server (say Starman) could spawn an asynchronous listener to handle the websocket side of things. I have seen (but not understood) PHP implementations of Websocket servers, so why can't the same be done with PSGI without having to change the server to an event driven one?
The underlying network logic to deal with sockets depends on the platform, OS and particular software implementation.
The three most common methods are:
polling - the code constantly "asks", in a blocking manner, whether the socket has some data. This method is bad, as it blocks execution of the main thread for as long as it waits for data.
thread per socket - each new connection involves creating a new thread, and the blocking wait on each socket happens within that thread, so it won't block the main thread with the logic. This method is bad because creating a thread for each connection is too expensive in memory: each thread can cost around 1 MB of RAM, depending on the OS and other criteria.
async - uses system features to "notify" your process when there is something to read. So you can react once your app is ready (in the case of a single-threaded app) or even react in a separate thread straight away. This method is very efficient, as it saves RAM and allows your app to work without waiting or asking for data. It utilises existing functionality that most OSes and platforms provide.
Taking this into account, you can indeed create a functional single-process way to deal with socket traffic. But that is not efficient at all, as explained above. That is why fully async models dominate today, as most languages and platforms support this paradigm.

Winsock: Can I call the send function at the same time for different sockets?

Let's say I have a server with many clients connected via TCP. I have a socket for every client, and I have a sending and a receiving thread for every client. Is it safe and possible to call the send function at the same time from different threads, given that it will never be called concurrently for the same socket?
If it's safe and OK, can I stream data to clients simultaneously without blocking the send function for other clients?
Thank you very much for answers.
Yes it is possible and thread-safe. You could have tested it, or worked out for yourself that IS, IIS, SQL Server etc. wouldn't work very well if it wasn't.
Assuming this is Windows from the tag of "Winsock".
This design (having a send/receive thread for every single connected client), overall, is not going to scale. Hopefully you are aware of that and you know that you have an extremely limited number of clients (even then, I wouldn't write it this way).
You don't need to have a thread pair for every single client.
You can serve tons of clients with a single thread using non-blocking IO and read/write ready notifications (either with select() or one of the varieties of Overlapped IO such as completion routines or completion ports). If you use completion ports you can set a pool of threads to handle socket IO and queue the work for your own worker thread or threads/threadpool.
Yes, you can send and receive to many sockets at once from different threads; but you shouldn't need those extra threads because you shouldn't be making blocking calls to send/recv at all. When you make a non-blocking call, the amount that could be written immediately is written and the function returns; you then note how much was sent and ask for notification when the socket is next writable.
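The "note how much was sent" step might look like this. It is a sketch using Python's socket module rather than the Winsock API itself, and `try_send` is a hypothetical helper name: one non-blocking attempt is made and whatever did not fit is handed back to the caller.

```python
import socket


def try_send(sock: socket.socket, pending: bytes) -> bytes:
    """Attempt one non-blocking send; return the bytes still waiting to go out."""
    try:
        sent = sock.send(pending)   # writes only what fits in the kernel buffer
    except BlockingIOError:
        sent = 0                    # buffer full right now; nothing was written
    return pending[sent:]
```

The caller keeps the returned remainder per connection and calls `try_send` again only once the OS reports the socket as writable (select() or a completion mechanism), instead of retrying in a loop.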
I think you might want to consider a different approach as this isn't simple stuff; if you're using .Net you might get by with building this with TcpListener or HttpListener (both of which use completion ports for you), though be aware that you can't easily disable Nagle's algorithm with those so if you need interactivity (think of the auto-complete on Google's search page) then you probably won't get the performance you want.

serving large file using select, epoll or kqueue

Nginx uses epoll, or other multiplexing techniques (select), to handle multiple clients, i.e. it does not spawn a new thread for every request, unlike Apache.
I tried to replicate the same in my own test program using select. I could accept connections from multiple clients by creating a non-blocking socket and using select to decide which client to serve. My program would simply echo their data back to them. It works fine for small data transfers (some bytes per client).
The problem occurs when I need to send a large file over a connection to the client. Since I have only one thread to serve all clients, I cannot resume serving other clients until I have finished reading the file and writing it to the socket.
Is there a known solution to this problem, or is it best to create a thread for every such request ?
When using select you should not send the whole file at once. If you are, e.g., using sendfile on a blocking socket to do this, it will block until the whole file has been sent. Instead use a small buffer, and send a little data at a time to each client. Then use select to identify when the socket is again ready to be written to and send some more until all data has been sent. This will allow you to handle multiple clients in parallel.
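A minimal sketch of that loop, written with Python's selectors module (which picks epoll, kqueue or select depending on the platform). `CHUNK`, `start_transfer` and `pump_once` are hypothetical names, and the data is held in memory here instead of being read from a file.

```python
import selectors
import socket

CHUNK = 16 * 1024   # illustrative chunk size


def start_transfer(sel, state: dict, sock: socket.socket, data: bytes) -> None:
    """Register a non-blocking socket and remember what it still has to receive."""
    sock.setblocking(False)
    sel.register(sock, selectors.EVENT_WRITE)
    state[sock] = (data, 0)              # per-connection state: payload and offset


def pump_once(sel, state: dict, timeout: float = 0.1) -> None:
    """Give every socket that is writable right now one chunk, then return."""
    for key, _ in sel.select(timeout):
        sock = key.fileobj
        data, offset = state[sock]       # look up what this client needs next
        try:
            sent = sock.send(data[offset:offset + CHUNK])
        except BlockingIOError:
            continue                     # not writable after all; try next round
        offset += sent
        if offset >= len(data):
            sel.unregister(sock)         # transfer finished
            del state[sock]
        else:
            state[sock] = (data, offset)
```

The server's main loop calls `pump_once` repeatedly alongside its accept and read handling; because no client ever gets more than one chunk per round, a large transfer cannot starve the others.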
The simplest approach is to create a thread per request, but it's certainly not the most scalable approach. I think at this time basically all high-performance web servers use various asynchronous approaches built on things like epoll (Linux), kqueue (BSD), or IOCP (Windows).
Since you don't provide any information about your performance requirements, and since all the non-threaded approaches require restructuring your application to use these often-complex asynchronous techniques (as described in the C10K article and others found from there), for now your best bet is just to use the threaded approach.
Please update your question with concrete requirements for performance and other relevant data if you need more.
For background this may be useful reading http://www.kegel.com/c10k.html
I think you are using your callback to handle a single connection. This is not how it was designed. Your callback has to handle the however-many thousand connections you are planning to serve, i.e. from the file descriptor number you get as a parameter, you have to know (by reading the global variables) what to do with that client: read(), send(), or whatever else.