I am not sure where the distributor(s) should run when there are multiple clients and multiple servers. If I have a single distributor which all clients send to and all servers get work from, then surely it is a single point of failure. Is there a way to remove this weak point?
You'd likely run the distributor on a cluster for high availability.
That being said, you can go so far as to have a separate distributor for each message type and configure your clients to send each message type to its designated distributor. Then you can allocate servers to distributors based on the amount of resources you want to allocate per message type.
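To make that concrete, here is a minimal Python sketch of the idea (this is not NServiceBus configuration; the message types, endpoint addresses, and transport_send helper are all made up for illustration):

```python
# Route each message type to its own distributor endpoint so that server
# capacity can be allocated per message type. All names here are hypothetical.

DISTRIBUTORS = {
    "PlaceOrder":  "orders.distributor@host-a",
    "SendInvoice": "billing.distributor@host-b",
    "AuditEvent":  "audit.distributor@host-c",
}

def send(message_type: str, payload: dict) -> None:
    """Look up the distributor responsible for this message type and send to it."""
    endpoint = DISTRIBUTORS.get(message_type)
    if endpoint is None:
        raise ValueError(f"no distributor configured for {message_type}")
    transport_send(endpoint, payload)

def transport_send(endpoint: str, payload: dict) -> None:
    # Stand-in for whatever transport the bus actually uses (e.g. MSMQ).
    print(f"sending {payload} to {endpoint}")

send("PlaceOrder", {"order_id": 42})
```

Each distributor in the map can then sit on its own cluster, sized for the volume of that particular message type.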
Does that answer your question?
Assume there is a system that accepts millions of simultaneous WebSocket connections from client applications. I was wondering if there is a way to route WebSocket connections to a specific instance behind a load balancer (or IP/Domain/etc) if clients provide some form of metadata, such as hash key, instance name, etc.
For instance, let's say each WebSocket client of the above system will always belong to a group (e.g. a max group size of 100), and it will attempt to communicate with the other 99 clients using the above system as a message gateway.
So the system's responsibility is to relay messages sent from clients in a group to the other 99 clients in the same group. Clients won't ever need to communicate with clients who belong to different groups.
Of course, one way to tackle this problem is to use a Pub/Sub system: regardless of which instance clients are connected to, the server can simply publish the message to the Pub/Sub system with a group identifier, and other clients can subscribe to messages with that group identifier.
However, the Pub/Sub system can potentially encounter scaling challenges, excessive resource usage (a single message getting published to thousands of instances), management overhead, increased latency, cost, and so on.
If it were possible to guarantee that the WebSocket clients in a group will all be connected to the same instance behind the load balancer, we could skip the Pub/Sub system and make things simpler and lower latency.
Would this be something that is possible to do, and if it isn't, what would be the best option?
(I am using Kubernetes in one of the cloud service providers if that matters.)
Routing in HTTP is generally based on the hostname and/or URL path, and sometimes, to a lesser degree, on other headers like cookies. But in this case it would mean that each group should have its own unique URL.
That part is easy; what I think you're really asking is "given arbitrary URLs, how can I get consistent routing?", which is much, much more complicated. The base concept is "consistent hashing": you hash the URL and use that to pick which endpoint to talk to. But then how do you deal with adding or removing replicas without scrambling the mapping entirely? That usually means using a hash ring and assigning portions of the hash space to specific replicas. Unfortunately, this is the point where off-the-shelf tools aren't enough. These kinds of systems require deep knowledge of your protocol and system specifics, so you'll probably need to rig this up yourself.
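To illustrate the idea, here is a rough Python sketch of a hash ring (the backend names are placeholders, and a real setup would still need health checking and coordination between the load-balancer replicas):

```python
import bisect
import hashlib

class HashRing:
    """Consistent-hash ring: maps keys (e.g. group URLs) to backends so that
    adding or removing a backend only remaps a small slice of the key space."""

    def __init__(self, backends, vnodes=100):
        self._keys = []   # sorted hashes, kept parallel to self._ring
        self._ring = []   # (hash, backend) pairs in the same order
        for b in backends:
            self.add(b, vnodes)

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def add(self, backend: str, vnodes: int = 100) -> None:
        # Each backend gets many virtual nodes so keys spread evenly.
        for i in range(vnodes):
            h = self._hash(f"{backend}#{i}")
            idx = bisect.bisect(self._keys, h)
            self._keys.insert(idx, h)
            self._ring.insert(idx, (h, backend))

    def remove(self, backend: str) -> None:
        kept = [(h, b) for (h, b) in self._ring if b != backend]
        self._ring = kept
        self._keys = [h for (h, _) in kept]

    def route(self, key: str) -> str:
        # Pick the first backend clockwise from the key's hash.
        idx = bisect.bisect(self._keys, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]

ring = HashRing(["ws-1:8080", "ws-2:8080", "ws-3:8080"])
print(ring.route("/groups/abc123"))  # the same group URL always lands on the same backend
```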
I'm trying to build an instant messaging app. Clients will not only send text messages but also, quite often, audio clips. I've decided to use a WebSocket connection to communicate with clients: it is fast and allows sending binary data.
The main idea is to receive a message from client1 and notify client2 about it. But here's the thing: my app will be running on GAE. What if client1's socket is opened on server1 and client2's is opened on server2? These servers don't know about each other's clients.
I have one idea of how to solve it, but I am sure it is a poor way: use some sort of communication between servers (for example JMS, or opening another WebSocket connection between servers; the exact mechanism doesn't matter right now).
But it will surely lead to a disaster. I can't even imagine how often those servers would have to talk to each other. For each message, server1 has to notify server2, and server2 has to notify client2. Things become even worse when serverN comes into play.
Another way I could see this working is Firebase. But it restricts message size to 4 KB, so I can't send audio through it. As a workaround, I could notify the client about a new audio file and have it fetch the file from my server.
I hope I explained the problem clearly. Does anyone know how to solve it? Or maybe there are other ways to build such apps?
If you are building a messaging cluster and expect communicating clients to connect to different instances of the server then server-server communication is inevitable. Usually it's not a problem though.
First, if you don't use any load balancing, your clients will connect to the same server 50% of the time on average (in the case of 2 servers).
Second, intra-datacenter links are fast and free in all known public clouds.
Third, you can often do something smart on the frontend to make sure that two clients who are likely to communicate connect to the same server. For instance, direct all clients from the same country to the same server using DNS load balancing.
The second part of the question is about passing large media files. It's a common best practice to send them out of band: store the file on the server and only pass a reference to it. As someone suggested in the comments, save the audio on the server and just send a message like "audio is available, fetch it from here ...". You don't need to poll the server for that; just fetch it once, when the receiving client requests it.
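As a rough sketch of that flow (the storage and notification helpers below are hypothetical placeholders, not GAE or Firebase APIs):

```python
import json

def handle_incoming_audio(sender_id: str, recipient_id: str, audio_bytes: bytes) -> None:
    # 1. Store the large payload out of band (blob store, object storage, ...).
    audio_url = store_blob(audio_bytes)
    # 2. Push only a small reference message over the WebSocket / notification channel.
    notify(recipient_id, json.dumps({
        "type": "audio_available",
        "from": sender_id,
        "url": audio_url,
    }))
    # 3. The receiving client fetches the audio from audio_url when it wants it;
    #    no polling is required.

def store_blob(data: bytes) -> str:
    # Placeholder: a real implementation would upload to object storage.
    return "https://media.example.com/" + str(abs(hash(data)))

def notify(recipient_id: str, message: str) -> None:
    # Placeholder: a real implementation would push over the recipient's socket.
    print(f"notify {recipient_id}: {message}")

handle_incoming_audio("client1", "client2", b"...audio bytes...")
```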
In general, it seems like you are trying to reinvent the wheel. Just use something off the shelf.
Let all clients connect across multiple servers, and have each server keep metadata about the clients connected to it.
A centralized system like ZooKeeper stores the details of the active servers.
When a client c1 sends a message to client c2:
the message is received by a server (say s1; we can add a load balancer to distribute incoming requests)
s1 broadcasts a query to all other servers to find out which server client c2 is connected to; a better approach is to use consistent hashing, which decides which server a client connects to, so no broadcast is required (a sketch follows at the end of this answer)
the corresponding server (say s2) responds to s1
now s1 forwards the message m to s2, and s2 delivers it to client c2
Cons of the above approach:
Each server will have a connection with the other n-1 servers, creating a mesh topology
The centralized system (ZooKeeper) becomes a single point of failure (which is solvable)
Apps like WhatsApp and G-Talk use XMPP over TCP/IP.
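A minimal Python sketch of the relay flow described above, with a plain dict standing in for the ZooKeeper-backed registry and all identifiers hypothetical:

```python
registry = {}           # client_id -> server_id; kept in ZooKeeper in a real system
local_connections = {}  # client_id -> connection object on *this* server

def on_client_connect(client_id, connection, this_server_id):
    local_connections[client_id] = connection
    registry[client_id] = this_server_id        # record "c2 is connected to s2"

def on_message(sender_id, recipient_id, message, this_server_id):
    """Runs on the server (s1) that received the message from c1."""
    target_server = registry.get(recipient_id)  # lookup instead of broadcasting
    if target_server is None:
        return  # recipient offline: queue or drop, depending on requirements
    if target_server == this_server_id:
        local_connections[recipient_id].send(message)   # c2 is local, deliver directly
    else:
        forward_to_server(target_server, recipient_id, message)  # s1 -> s2 -> c2

def forward_to_server(server_id, recipient_id, message):
    # Placeholder for the s1 -> s2 hop (one connection per server pair in the mesh).
    print(f"forward to {server_id} for {recipient_id}: {message}")
```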
We are considering using MSMQ as a message processing service in our application. What we want is MSMQ on a farm of servers sitting behind a Network Load Balancer (NLB) that distributes load among the nodes. I have the following questions, for which I couldn't find any pointers online.
1) Do I need to have MSMQ installed in all nodes?
2) If so, when the sender application sends a message to MSMQ should it send to all nodes?
3) If the answer to my 2nd question is yes, then when a message is received by the client application, how should the other nodes be notified that the message has been received, so that it can subsequently be removed from them?
Would appreciate any help/pointers.
Here are some blog posts that may help:
Oil and water - MSMQ transactional messages and load balancing
Load-balancing MSMQ - a brief discussion
Q1) Do I need to have MSMQ installed in all nodes?
A1) You need MSMQ installed on every server that a message will be delivered to.
Q2) If so, when the sender application sends a message to MSMQ should it send to all nodes?
A2) That's one way to load-balance: send X copies of the message to X nodes but only read from one node, discarding the other copies.
Another method is to send to a network load balancer that forwards each message to one randomly-selected node, although MSMQ is very sticky, so it is very difficult (if not impossible) to load balance high volumes that way.
Q3) If the answer to my 2nd question is yes, then when a message is received by the client application, how should the other nodes be notified that the message has been received, so that it can subsequently be removed from them?
A3) That is the downside to the type of load balancing you're suggesting. MSMQ won't remove a message unless either your application removes it or the TimeToBeReceived timer expires. Basically, you will need coordinated client applications if you want to avoid duplicate message processing.
Cheers
John Breakwell
We have clustered MSMQ for a set of NServiceBus services, and everything runs great until it doesn't. Outgoing queues on one server start filling up, and pretty soon the whole system is hung.
More details:
We have a clustered MSMQ between servers N1 and N2. Other clustered resources are only services that operate directly on the clustered queues as local, i.e. NServiceBus distributors.
All of the worker processes live on separate servers, Services3 and Services4.
For those unfamiliar with NServiceBus, work goes into a clustered work queue managed by the distributor. Worker apps on Services3 and Services4 send "I'm Ready for Work" messages to a clustered control queue managed by the same distributor, and the distributor responds by sending a unit of work to the worker process's input queue.
At some point, this process can get completely hung. Here is a picture of the outgoing queues on the clustered MSMQ instance when the system is hung:
If I fail over the cluster to the other node, it's like the whole system gets a kick in the pants. Here is a picture of the same clustered MSMQ instance shortly after a failover:
Can anyone explain this behavior, and what I can do to avoid it, to keep the system running smoothly?
Over a year later, it seems that our issue has been resolved. The key takeaways seem to be:
Make sure you have a solid DNS system so when MSMQ needs to resolve a host, it can.
Only create one clustered instance of MSMQ on a Windows Failover Cluster.
When we set up our Windows Failover Cluster, we made the assumption that it would be bad to "waste" resources on the inactive node, and so, having two quasi-related NServiceBus clusters at the time, we made a clustered MSMQ instance for Project1, and another clustered MSMQ instance for Project2. Most of the time, we figured, we would run them on separate nodes, and during maintenance windows they would co-locate on the same node. After all, this was the setup we have for our primary and dev instances of SQL Server 2008, and that has been working quite well.
At some point I began to grow dubious about this approach, especially since failing over each MSMQ instance once or twice seemed to always get messages moving again.
I asked Udi Dahan (author of NServiceBus) about this clustered hosting strategy, and he gave me a puzzled expression and asked "Why would you want to do something like that?" In reality, the Distributor is very light-weight, so there's really not much reason to distribute them evenly among the available nodes.
After that, we decided to take everything we had learned and recreate a new Failover Cluster with only one MSMQ instance. We have not seen the issue since. Of course, making sure this problem is solved would be proving a negative, and thus impossible. It hasn't been an issue for at least 6 months, but who knows, I suppose it could fail tomorrow! Let's hope not.
Maybe your servers were cloned and thus share the same Queue Manager ID (QMId).
MSMQ uses the QMId as a key for caching the addresses of remote machines. If more than one machine in your network has the same QMId, you could end up with stuck or missing messages.
Check out the explanation and solution in this blog post: Link
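If you want to compare QMIds yourself, a small diagnostic sketch in Python (Windows only; it assumes the QMId is stored under MSMQ's MachineCache registry key, which is where cloned machines usually end up sharing it) could look like this:

```python
# Run this on each server and compare the printed values; if two machines print
# the same QMId, they were almost certainly cloned from the same image.
import winreg

with winreg.OpenKey(
    winreg.HKEY_LOCAL_MACHINE,
    r"SOFTWARE\Microsoft\MSMQ\Parameters\MachineCache",
) as key:
    qmid, _reg_type = winreg.QueryValueEx(key, "QMId")
    print("QMId:", qmid.hex() if isinstance(qmid, bytes) else qmid)
```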
How are your endpoints configured to persist their subscriptions?
What if one (or more) of your services encounters an error and is restarted by the Failover Cluster Manager? In that case, this service would never again receive one of the "I'm Ready for Work" messages from the other services.
When you fail over to the other node, I guess that all your services send these messages again and, as a result, everything starts working again.
To test this behavior, do the following:
Stop and restart all your services.
Stop only one of the services.
Restart the stopped service.
If your system does not hang, repeat this with each single service.
If your system now hangs again, check your configuration. In this scenario, at least one of your services (if not all) is losing its subscriptions between restarts. If you have not done so already, persist the subscriptions in a database.
Recently I've added some load-balancing capabilities to a piece of software that I wrote. It is a networked application that does some data crunching based on input coming from a SQL database. Since the crunching can be pretty intensive, I've added the capability to have multiple instances of this application running on different servers to split the load, but as it stands the load balancing is a manual act. A user must specify which instances take which portion of the input domain.
I would like to take that to the next level and program the instances to automatically negotiate the dividing up of the input data, and to recognize if one of them "disappears" (has crashed or has been powered down) so that the remaining instances can take on the failed instance's workload.
In order to implement this, I'm considering using a simple heartbeat protocol between the instances to determine who's online and who isn't, and while this is not terribly complicated, I'd like to know if there are any established heartbeat network protocols (based on UDP, TCP, or both).
Obviously this happens a lot in the networking world with clustering, fail-over and high-availability technologies so I guess in the end I'd like to know if maybe there are any established protocols or algorithms that I should be aware of or implement.
EDIT
It seems, based on the answers, that either there are no well-established heartbeat protocols or that nobody knows about them (which would imply that they aren't so well established after all), in which case I'm just going to roll my own.
While none of the answers offered exactly what I was looking for, I'm going to vote for Matt Davis's answer since it was the closest, and he pointed out the good idea of using multicast.
Thank you all for your time~
Distributed Interactive Simulation (DIS), which is defined under IEEE Standard 1278, uses a default heartbeat of 5 seconds via UDP broadcast. A DIS heartbeat is essentially an Entity State PDU, which fully defines the state, including the position, of the given entity. Due to its application within the simulation community, DIS also uses a concept referred to as dead reckoning to provide higher-frequency heartbeats when the actual position, for example, is outside a given threshold of its predicted position.
In your case, a DIS Entity State PDU would be overkill. I only mention it to make note of the fact that heartbeats can vary in frequency depending on the circumstances. I don't know that you'd need something like this for the application you described, but you never know.
For heartbeats, use UDP, not TCP. A heartbeat is, by nature, a connectionless contrivance, so it goes that UDP (connectionless) is more relevant here than TCP (connection-oriented).
The thing to keep in mind about UDP broadcasts is that a broadcast message is confined to the broadcast domain. In short, if you have computers that are separated by a layer 3 device, e.g., a router, then broadcasts are not going to work because the router will not transmit broadcast messages from one broadcast domain to another. In this case, I would recommend using multicast, since it will span broadcast domains, provided the time-to-live (TTL) value is set high enough. It's also a more automated approach than directed unicast, which would require the sender to know the IP address of the receiver in order to send the message.
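As a rough sketch of what a multicast heartbeat could look like in Python (the group address, port, interval, and payload format are arbitrary choices, not part of any standard):

```python
import json
import socket
import struct
import time

GROUP, PORT = "239.1.1.1", 5007   # administratively scoped multicast group (arbitrary)

def send_heartbeats(machine_id: str, interval: float = 5.0, ttl: int = 4) -> None:
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # TTL > 1 lets the heartbeat cross routers between broadcast domains.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    while True:
        payload = json.dumps({"id": machine_id, "ts": time.time()}).encode()
        sock.sendto(payload, (GROUP, PORT))
        time.sleep(interval)

def listen_for_heartbeats() -> None:
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", PORT))
    # Join the multicast group on all interfaces.
    mreq = struct.pack("4sl", socket.inet_aton(GROUP), socket.INADDR_ANY)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    while True:
        data, addr = sock.recvfrom(1024)
        print(addr, json.loads(data))
```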
Broadcast a heartbeat every t using UDP; if you haven't heard from a machine in more than k*t, then it's assumed down. Be careful that the aggregate bandwidth used isn't a drain on resources. You can use IP broadcast addresses, or keep a list of specific IPs you're doing work for.
Make sure the heartbeat includes a "reboot count" as well as "machine ID" so that you know previous server state isn't around.
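A minimal sketch of that detection logic, assuming a heartbeat period t and a tolerance of k missed beats (the names and thresholds are illustrative):

```python
import time

T, K = 5.0, 3   # heartbeat period in seconds and number of missed beats tolerated
peers = {}      # machine_id -> {"last_seen": float, "reboot_count": int}

def on_heartbeat(machine_id: str, reboot_count: int) -> None:
    prev = peers.get(machine_id)
    if prev and prev["reboot_count"] != reboot_count:
        # The peer restarted, so any state we cached about it is stale.
        print(f"{machine_id} rebooted; discarding previous state")
    peers[machine_id] = {"last_seen": time.time(), "reboot_count": reboot_count}

def down_peers() -> list:
    """Peers we haven't heard from in more than k*t seconds are assumed down."""
    now = time.time()
    return [m for m, p in peers.items() if now - p["last_seen"] > K * T]
```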
I'd recommend using MapReduce if it fits. It would save a lot of work.
I'm not sure this will answer the question, but you might be interested in how WebLogic Server clustering works under the hood. From the book Mastering BEA WebLogic Server:
[...] WebLogic Server clustering provides a loose coupling of the servers in the cluster. Each server in the cluster is independent and does not rely on any other server for any fundamental operations. Even if contact with every other server is lost, each server will continue to run and be able to process the requests it receives. Each server in the cluster maintains its own list of other servers in the cluster through periodic heartbeat messages. Every 10 seconds, each server sends a heartbeat message to the other servers in the cluster to let them know it is still alive. Heartbeat messages are sent using IP multicast technology built into the JVM, making this mechanism efficient and scalable as the number of servers in the cluster gets large. Each server receives these heartbeat messages from other servers and uses them to maintain its current cluster membership list. If a server misses receiving three heartbeat messages in a row from any other server, it takes that server out of its membership list until it receives another heartbeat message from that server. This heartbeat technology allows servers to be dynamically added and dropped from the cluster with no impact on the existing servers’ configurations.
Cisco content switches are a hardware solution for this problem. They implement a virtual IP address as a front end to multiple real servers, whose real IP addresses are known to the switch. The switch periodically sends HTTP HEAD requests to the web servers, to verify they are still running (which the switch software calls a "keepalive", although this doesn't keep the server itself alive). The Cisco switch accepts traffic on the virtual IP and forwards it to the actual web servers, using configurable load balancing such as round-robin, or user-defined load balancing.
These switches retail in the $3-10K range, although my business partner picked one up on eBay for about $300 a year ago. If you can afford one, they do represent a proven hardware solution to the question of how to spread a service transparently across multiple servers. Red Hat includes a built-in port configuration so that you could implement your own equivalent of a Cisco switch using a cheap Red Hat box. Google for "virtual ip address" and "cisco content router" for more information.
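For illustration only (the host names are made up, and this is a toy version of what the switch's keepalive does), a periodic HEAD-based health check might look like:

```python
import http.client

REAL_SERVERS = ["web1.internal", "web2.internal", "web3.internal"]  # hypothetical

def is_up(host: str, timeout: float = 2.0) -> bool:
    """Send an HTTP HEAD request and treat any non-5xx response as healthy."""
    try:
        conn = http.client.HTTPConnection(host, 80, timeout=timeout)
        conn.request("HEAD", "/")
        status = conn.getresponse().status
        conn.close()
        return status < 500
    except (OSError, http.client.HTTPException):
        return False

healthy = [h for h in REAL_SERVERS if is_up(h)]
print("forward traffic to:", healthy)
```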
In addition to hardware load balancers, you can also try free, open-source load-balancing software such as HAProxy, available for Linux and the BSDs.