Multiple service connections vs internal routing in MMO - sockets

The server consists of several services with which a user interacts: profiles, game logics, physics.
I heard that it's a bad practice to have multiple client connections to the same server.
I'm not sure whether I will use UDP or TCP.
The services are realtime, they should reply as fast as possible so I don't want to include any additional rerouting if there are no really important reasons. So are there any reasons to rerote traffic through one external endpoint service to specific internal services in my case?

This seems to be multiple questions in one package. I will try to answer the ones I can identify as separate...
UDP vs TCP: You're saying "real-time", this usually means UDP is the right choice. However, that means having to deal with lost packets and possible re-ordering of packets. But, using UDP leaves a couple of possible delay-decreasing tricks open.
Multiple connections from a single client to a single server: This consumes resources (end-points, as it were) on both the client (probably ignorable) and on the server (possibly a problem, possibly ignorable). The advantage of using separate connections for separate concerns (profiles, physics, ...) is that when you need to separate these onto separate servers (or server farms), you don't need to update the clients, they just need to connect to other end-points, using code that's already tested.
"Re-router" (or "load balancer") needed: Probably not going to be an issue initially. However, it will probably become an issue later. Depending on your overall design and server OS, using UDP may actually become an asset here. UDP packet arrives at the load balancer, dispatched to the right backend and that could then in theory send back a reply with the source IP of the load balancer.
An alternative would be to have a "session broker". The client makes an initial connection to a well-known endpoint, says "I am a client, tell me where my profile, physics, what-have0-you servers are", the broker considers the current load, possibly the location of the client and other things that may make sense and the client then connects to the relevant backends on its own. The downside of this is that it's harder (not impossible, but harder) to silently migrate an ongoing session to a new backend, when there's a load-balancer in the way, this can be done essentially-transparently.

Related

Writing a simple P2P chat application

This is my first experience with P2P and i need some help regarding the design.
I am developing a simple messenger application. I have a directory server on which every user authenticates and announces an open port on which every user is reachable. The directory server maintains the users and the ports and I can query the directory server for any specific user. This part is done. The second part is the chat which i think should be P2P. I can start a chat as well as I can be end point of a chat (client as well as server)
What is confusing me is how do I deal with P2P? Do I create two different sockets? One on which I am listening for TCP requests for incoming connections and another one from which I would send TCP requests to start chat.
In this case do I need 3 sockets, one to talk with server and two for P2P?
If you want to go P2P, you'd better use a framework, such as JXTA for example if you are coding in Java. Creating sockets may not be enough by itself, because there are more complicated issues you need to deal with such as NAT traversal if you are operating beyond your LAN.
It seems like you have a central peer (some of server). If it has a public IP address, then you could implement a TURN-like architecture (peers communicate via this central peer). If you want direct connection between peers, you are looking a STUN solutions, but you still need a central peer to facilitate the communication.
TCP Stun is not easy. UDP is not very complicated, you just need to punch a hole in your NAT. Now, keep in mind that NAT traversal is not always possible (it depends of the NAT itself). In this case, the backup solution in a STUN one.

Heartbeat Protocols/Algorithms or best practices

Recently I've added some load-balancing capabilities to a piece of software that I wrote. It is a networked application that does some data crunching based on input coming from a SQL database. Since the crunching can be pretty intensive I've added the capability to have multiple instances of this application running on different servers to split the load but as it is now the load balancing is a manual act. A user must specify which instances take which portion of the input domain.
I would like to take that to the next level and program the instances to automatically negotiate the diving up of the input data and to recognize if one of them "disappears" (has crashed or has been powered down) so that the remaining instances can take on the failed instance's workload.
In order to implement this I'm considering using a simple heartbeat protocol between the instances to determine who's online and who isn't and while this is not terribly complicated I'd like to know if there are any established heartbeat network protocols (based on UDP, TCP or both).
Obviously this happens a lot in the networking world with clustering, fail-over and high-availability technologies so I guess in the end I'd like to know if maybe there are any established protocols or algorithms that I should be aware of or implement.
EDIT
It seems, based on the answers, that either there are no well established heart-beat protocols or that nobody knows about them (which would imply that they aren't so well established after all) in which case I'm just going to roll my own.
While none of the answers offered what I was looking for specifically I'm going to vote for Matt Davis's answer since it was the closest and he pointed out a good idea to use multicast.
Thank you all for your time~
Distribued Interactive Simulation (DIS), which is defined under IEEE Standard 1278, uses a default heartbeat of 5 seconds via UDP broadcast. A DIS heartbeat is essentially an Entity State PDU, which fully defines the state, including the position, of the given entity. Due to its application within the simulation community, DIS also uses a concept referred to as dead-reckoning to provide higher frequency heartbeats when the actual position, for example, is outside a given threshold of its predicted position.
In your case, a DIS Entity State PDU would be overkill. I only mention it to make note of the fact that heartbeats can vary in frequency depending on the circumstances. I don't know that you'd need something like this for the application you described, but you never know.
For heartbeats, use UDP, not TCP. A heartbeat is, by nature, a connectionless contrivance, so it goes that UDP (connectionless) is more relevant here than TCP (connection-oriented).
The thing to keep in mind about UDP broadcasts is that a broadcast message is confined to the broadcast domain. In short, if you have computers that are separated by a layer 3 device, e.g., a router, then broadcasts are not going to work because the router will not transmit broadcast messages from one broadcast domain to another. In this case, I would recommend using multicast since it will span the broadcast domains, providing the time-to-live (TTL) value is set high enough. It's also a more automated approach than directed unicast, which would require the sender to know the IP address of the receiver in order to send the message.
Broadcast a heartbeat every t using UDP; if you haven't heard from a machine in more than k*t, then it's assumed down. Be careful that the aggregate bandwidth used isn't a drain on resources. You can use IP broadcast addresses, or keep a list of specific IPs you're doing work for.
Make sure the heartbeat includes a "reboot count" as well as "machine ID" so that you know previous server state isn't around.
I'd recommend using MapReduce if it fits. It would save a lot of work.
I'm not sure this will answer the question but you might be interested by the way Weblogic Server clustering work under the hood. From the book Mastering BEA WebLogic Server:
[...] WebLogic Server clustering provides a loose coupling of the servers in the cluster. Each server in the cluster is independent and does not rely on any other server for any fundamental operations. Even if contact with every other server is lost, each server will continue to run and be able to process the requests it receives. Each server in the cluster maintains its own list of other servers in the cluster through periodic heartbeat messages. Every 10 seconds, each server sends a heartbeat message to the other servers in the cluster to let them know it is still alive. Heartbeat messages are sent using IP multicast technology built into the JVM, making this mechanism efficient and scalable as the number of servers in the cluster gets large. Each server receives these heartbeat messages from other servers and uses them to maintain its current cluster membership list. If a server misses receiving three heartbeat messages in a row from any other server, it takes that server out of its membership list until it receives another heartbeat message from that server. This heartbeat technology allows servers to be dynamically added and dropped from the cluster with no impact on the existing servers’ configurations.
Cisco content switches are a hardware solution for this problem. They implement a virtual IP address as a front end to multiple real servers, whose real IP addresses are known to the switch. The switch periodically sends HTTP HEAD requests to the web servers, to verify they are still running (which the switch software calls a "keepalive", although this doesn't keep the server itself alive). The Cisco switch accepts traffic on the virtual IP and forwards it to the actual web servers, using configurable load balancing such as round-robin, or user-defined load balancing.
These switches retail in the $3-10K range, although my business partner picked one up on eBay for about $300 a year ago. If you can afford one, they do represent a proven hardware solution to the question of how to have a service spread transparently across multiple servers. Redhat includes a built-in port configuration so that you could implement your own Cisco switch using a cheap RedHat box. Google for "virtual ip address" and "cisco content router" for more information.
In addition to trying hardware load-balancers, you can also try a free-open-source load-balancing software application such as HAProxy, available for Linux and the BSDs.

Multiple TCP/IP servers and sharing the same "well known port" ... somehow?

I apologize for the weird question wording... here's the design problem:
I am developing a server (on Linux using C++, FWIW) that provides a service to many instances of a client application running on consumer PCs.
I want the following:
1) All clients first identify themselves to a "gatekeeper" server application. Consider this a login procedure, with credentials like a user name and password being passed in. Call the gatekeeper program "gserver". (for gatekeeper.)
2) Once each client has been validated, it is then placed into a long term connection with one of several instances of a different server application running on the same physical server box bound to the same server address. Call any of these instances "wserver" (for "working" server.)
So, what the client sees is that a "gatekeeper" application gives it passworded access to one of several "working" servers running on the same box.
Here is the "real" challenge: we want to exclusively use a "well known" port number for the inbound server connections (like port 80 or 443, say.) Or, our own "well known" port.
We would prefer not to have to make the client talk to a second port on the server for the long term connection phase with wserver(n). The problem with this, of course, is that only one server process at a time can be bound to the same port and server address.
This implies that a connection made by the client with gserver must also fill the role of the long term connection. The only way I see to accomplish this is that gserver must, after login, act like a proxy and copy traffic between itself and the client to the particular wserver(n) that the client is bound to logically.
It would be ideal if a TCP/IP connection first made between client(n) and gserver could be somehow "transported" to another application on the same server, intact, and could then be sustained by one of the wserver(n) instances for the long term connection.
I know that web servers do something like this for spreading out server loads. "Load balancing". The main difference here is that the "balancing" is the allocation of a particular user to a particular wserver(n) instance. But I also have the impression that load balancing is a kind of proxying - which I am trying to avoid (since it complicates the architecture and adds overhead as well as a single point of failure.)
This is a conceptual and design question. Don't worry about source code examples, unless they are absolutely essential to get the ideas across. If we pin down an approach, I can code it up.
Thanks!
What you are looking for is file descriptor passing. See UNP 15.7. One well-known heavy user of this facility is postfix.
I developed such an application long time ago. Since multiple servers can't listen on the same port. What you need is to have gserver listening on the well-known port. Once connection is established, pass the connection to the other servers via an Unix socket. Once the connection is passed to other server, gserver is out of picture. It can die and the other server will be still serving the connection.
I dont' know if this applies to your design, but the usual solution (as implemmented by the xinetd daemon) is to fork() and then exec() the process. For example, xinetd may serve services like rlogin, rsh, tftp, telnet, etc. which are actually served by different programs. This will not be useful to you if your wservers are processes already running in the system.

Scaleability of TCP keep-alive

Consider a large scale, heterogeneous network of various devices. These devices are providing services to others on the network in a peer-to-peer fashion. The mechanism used to track service availabilty across all nodes is currently using TCP sockets marked as keep-alive, usually for the duration the node is online. This leads to every node having a socket open with every other node (within a subnet of the peer-to-peer infrastructure).
What arguments exist regarding the scaleability of using TCP keep-alive in this way?
My alternative approach is to use a publish/subscribe model, where nodes push new services to the network as they become available, and their peers cache them for when they want to subscribe to a service. Does this sound feasable?
I interpret from what you wrote that the communication is strictly point-to-point, with considerable duration ('leases'). If this is true, it means that you will gain nothing in a publish-subscribe model. If this is not true, then yes, you should (could) change the network model to match the communications, and your idea sounds sound.
Regarding your second question, since TCP sockets and keep-alive is just a concept, there is no (or a very small) intrinsic cost of having such a keep-alive contract. In practice YMMV since different socket implementations require different resources, and other actions might be required to keep the channel open. There are however many implementations which require very little resources for open sockets (select()-type for example).
A discovery service (publish/subscribe of services) makes most sense if there are many implementers of the same type of service, and you cannot (or do not want to) predict statically where they will appear.
In short, I would say that you should only change the design if the type of communication that you have fits the current architecture badly. Your idea certainly sounds very feasible, but more information about the communication patterns would be necessary to make an estimation of the outcome.
Yeah using keep alive seems like a bad idea for any P2P network. Not only would I only have connections kept open while data is being transferred I would also keep node state updates on a different sockets altogether so as to not interfere with file transmissions.
If your TCP Keep Alive mechanism is being used only for tracking service availability (meaning, you never communicate service request/response across these connections), the use of TCP sockets is definitely an over kill. TCP sockets do take significant resources.
A more scalable method could be using a publish/subscribe model that uses UDP publish messages at regular intervals to advertise continued existence of the service. You could also use a service-down message published from a disconnecting node to gracefully declare end of service.
Going further, if you mean to get optimal with really large scale networks and, are ready to put in some time and effort -- consider a structured P2P mechanism like DHT.

Can socket connections be multiplexed?

Is it possible to multiplex sa ocket connection?
I need to establish multiple connections to yahoo messenger and i am looking for a way to do this efficiently without having to hold a socket open for each client connection.
so far i have to use one socket for each client and this does not scale well above 50,000 connections.
oh, my solution is for a TELCO, so i need to at least hit 250,000 to 500,000 connections
i'm planing to bind multiple IP addresses to a single NIC to beat the 65k port restriction per IP address.
Please i would any help, insight i can get.
**most of my other questions on this site have gone un-answered :) **
Thanks
This is an interesting question about scaling in a serious situation.
You are essentially asking, "How do I establish N connections to an internet service, where N is >= 250,000".
The only way to do this effectively and efficiently is to cluster. You cannot do this on a single host, so you will need to be able to fragment and partition your client base into a number of different servers, so that each is only handling a subset.
The idea would be for a single server to hold open as few connections as possible (spreading out the connectivity evenly) while holding enough connections to make whatever service you're hosting viable by keeping inter-server communication to a minimum level. This will mean that any two connections that are related (such as two accounts that talk to each other a lot) will have to be on the same host.
You will need servers and network infrastructure that can handle this. You will need a subnet of ip addresses, each server will have to have stateless communication with the internet (i.e. your router will not be doing any NAT in order to not have to track 250,000+ connections).
You will have to talk to AOL. There is no way that AOL will be able to handle this level of connectivity without considering cutting your connection off. Any service of this scale would have to be negotiated with AOL so both you and they would be able to handle the connectivity.
There are i/o multiplexing technologies that you should investigate. Kqueue and epoll come to mind.
In order to write this massively concurrent and teleco grade solution, I would recommend investigating erlang. Erlang is designed for situations such as these (multi-server, massively-multi-client, massively-multithreaded telecommunications grade software). It is currently used for running Ericsson telephone exchanges.
While you can listen on a socket for multiple incoming connection requests, when the connection is established, it connects a unique port on the server to a unique port on the client. In order to multiplex a connection, you need to control both ends of the pipe and have a protocol that allows you to switch contexts from one virtual connection to another or use a stateless protocol that doesn't care about the client's identity. In the former case you'd need to implement it in the application layer so that you could reuse existing connections. In the latter case you could get by using a proxy that keeps track of which server response goes to which client. Since you're connecting to Yahoo Messenger, I don't think you'll be able to do this since it requires an authenticated connection and it assumes that each connection corresponds to a single user.
You can only multiplex multiple connections over a single socket if the other end supports such an operation.
In other words it's a function protocol - sockets don't have any native support for it.
I doubt yahoo messenger protocol has any support for it.
An alternative (to multiple IPs on a single NIC) is to design your own multiplexing protocol and have satellite servers that convert from the multiplex protocol to the yahoo protocol.
I'll trow in another approach you could consider (depending on how desperate you are).
Note that operating system TCP/IP implementations need to be general purpose, but you are only interested in a very specific use-case. So it might make sense to implement a cut-down version of TCP/IP (which only handles your use-case, but does that very well) in your application code.
For example, if you are using Linux, you could route a couple of IP addresses to a tun interface and have your application handle the IP packets for that tun interface. That way you can implement TCP/IP (optimised for your use-case) entirely in your application and avoid any operating system restriction on the number of open connections.
Of course, it's quite a bit of work doing the TCP/IP yourself, but it really depends on how desperate you are - i.e. how much hardware can you afford to throw at the problem.
500,000 arbitrary yahoo messenger connections - is your telco doing this on behalf of Yahoo? It seems like whatever solution has been in place for many years now should be scalable with the help of Moore's Law - and as far as I know all the IM clients have been pretty effective for a long time, and there's no pressing increase in demand that I can think of.
Why isn't this a reasonable problem to address with hardware plus traditional solutions?