We have a JBoss 5 AS cluster consisting of 2 nodes using multicast. Everything works fine and the servers are able to discover each other and form a cluster,
but the problem is that these servers generate heavy multicast traffic, which affects the network performance of other servers sharing the same network.
I am new to JBoss clustering. Is there any way to use unicast (point-to-point) instead of multicast? Or to configure the multicast so that it is not a problem for the rest of the network? Can you refer me to some documentation, blog post or similar that can help me get rid of this problem?
I didn't get any answers here, but this might be of help to someone in the future. We managed to resolve it as follows.
Set the following TTL property for JBoss in the startup script:
-Djgroups.udp.ip_ttl=1
This restricts the number of hops to 1 for the multicast messages. It will not reduce the amount of network traffic between the clustered JBoss nodes, but it will prevent it from spreading outside.
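For reference, a common place to set this so it survives restarts is the JAVA_OPTS in bin/run.conf (assuming the stock JBoss 5 layout); this is just a sketch of that approach:

# bin/run.conf -- append to the options picked up by run.sh
JAVA_OPTS="$JAVA_OPTS -Djgroups.udp.ip_ttl=1"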
If you have other servers in the same subnet that are affected by the flooding problem, then
you might have to switch to the TCP stack and use unicast instead of multicast:
-Djboss.default.jgroups.stack=tcp
There are also more clustering configuration files under the JBoss deploy directory that you should look at:
server/production/deploy/cluster/jboss-cache-manager.sar/META-INF/jboss-cache-manager-jboss-beans.xml
and the other configuration files in the JGroups config.
If multicast is not an option, or for some reason it doesn't work due to the network topology, you can use unicast.
To use unicast clustering instead of UDP multicast, open up your profile, look into the file jgroups-channelfactory-stacks.xml, and locate the stack named "tcp". That stack still uses UDP, but only for multicast discovery. If a small amount of UDP traffic is alright, you don't need to change it. If it isn't, or multicast doesn't work at all, you will need to configure the TCPPING protocol and set initial_hosts so it knows where to look for cluster members (see the sketch below).
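For illustration only, the discovery section of the "tcp" stack in jgroups-channelfactory-stacks.xml would end up with something like the following in place of the multicast discovery element; the host names and the 7600 port here are assumptions, not values from a real profile:

<TCPPING timeout="3000"
         initial_hosts="node1[7600],node2[7600]"
         port_range="1"
         num_initial_members="2"/>

Each entry in initial_hosts is host[port], and the value can also be externalized through a system property placeholder if you prefer to pass the member list at startup.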
Afterwards, you will need to tell JBoss Cache to use this stack. Open up jboss-cache-manager-jboss-beans.xml, where a stack is defined for each cache. You can either change it there from udp to tcp, or you can simply pass a property when starting the AS; just add:
-Djboss.default.jgroups.stack=tcp
I have a Linux machine with 3 network interfaces,
let's say IPs are 192.168.1.101,192.168.1.102,192.168.1.103
I want to use all 3 IPs of this single node to create a Kafka cluster with other nodes. Should all 3 IPs have their own separate brokers?
Also, NIC bonding is not recommended, and all IPs need to be utilized.
Overall, I'm not sure why you'd want to do this. If you are using separate volumes (log.dirs) for each address, then maybe you'd want separate Java processes, sure, but you'd still be sharing the same memory and having that machine be a single point of failure.
In any case, you can set one process to have advertised.listeners list each of those addresses for clients to communicate with. However, you'd still have to deal with port allocation in the OS, so you might need to set listeners like so:
listeners=PLAINTEXT_1://0.0.0.0:9092,PLAINTEXT_2://0.0.0.0:9093,PLAINTEXT_3://0.0.0.0:9094
Also make sure you have listener.security.protocol.map set up as well, using those same listener names.
Note that clients will only communicate with the leader of a topic-partition at any time, so if you have one broker JVM process and 3 addresses for it, then really only one address is going to be utilized. One optimization could be to let intra-cluster replication use a separate NIC, as in the sketch below.
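To make that concrete, here is a hedged sketch of the relevant server.properties entries for a single broker; the listener names and the choice of which NIC carries replication are assumptions based on the IPs in the question:

# one broker process, one listener per NIC
listeners=PLAINTEXT_1://0.0.0.0:9092,PLAINTEXT_2://0.0.0.0:9093,PLAINTEXT_3://0.0.0.0:9094
advertised.listeners=PLAINTEXT_1://192.168.1.101:9092,PLAINTEXT_2://192.168.1.102:9093,PLAINTEXT_3://192.168.1.103:9094
# map each custom listener name to a real security protocol
listener.security.protocol.map=PLAINTEXT_1:PLAINTEXT,PLAINTEXT_2:PLAINTEXT,PLAINTEXT_3:PLAINTEXT
# keep replication traffic on one dedicated listener/NIC
inter.broker.listener.name=PLAINTEXT_3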
I have been testing HAProxy, which does cookie-based load balancing to our streaming servers. But let's say, for example, HAProxy falls over (I know, unlikely) and
the streamer gets disconnected. Is there a way of passing on the connection without it relying on HAProxy, basically leaving the streamer connected to the destination and cutting all ties with HAProxy?
That is not possible by design.
HAProxy is a proxy (as the name suggests). As such, for each communication, you have two independent TCP-connections, one between the client and HAProxy and another between HAProxy and your backend server.
If HAProxy fails or you need to failover, the standing connections will have to be re-created. You can't pass over existing connections to another server since there is a lot of state attached to each connection that can't be transferred.
If you want to remove the load balancer from the equation after the initial connection setup, you should look at layer-3 load-balancing solutions like LVS on Linux with Direct Routing. Note that these solutions are much less flexible than HAProxy. There is no such thing as a free lunch, after all :)
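As a rough illustration of LVS with Direct Routing (the VIP, real-server addresses, and port are placeholders), the director would be configured along these lines:

# on the LVS director: one virtual service, round-robin, direct routing (-g)
ipvsadm -A -t 192.0.2.10:1935 -s rr
ipvsadm -a -t 192.0.2.10:1935 -r 10.0.0.11 -g
ipvsadm -a -t 192.0.2.10:1935 -r 10.0.0.12 -g

Each real server also has to carry the VIP on a non-ARPing interface (typically a loopback alias) so it accepts the forwarded packets; replies then go from the real server straight back to the client rather than back through the balancer.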
The following document says:
This is easier to do and does not require a sysadmin. However, it is not the preferred approach for production systems for the reasons listed above. This approach is usually used in development to try out clustering behavior.
What are the risks of this approach in a production environment? In WebLogic it is pretty common, and I have seen a few production environments running with multiple ports (managed servers).
https://community.jboss.org/wiki/ConfiguringMultipleJBossInstancesOnOnemachine
The wiki clearly answers that question. Here is the text from the wiki for your reference:
Where possible, it is advised to use a different IP address for each instance of JBoss, rather than changing the ports or using the Service Binding Manager, for the following reasons:
When you have a port conflict, it makes it very difficult to troubleshoot, given a large amount of ports and app servers.
Too many ports makes firewall rules too difficult to maintain.
Isolating the IP addresses gives you a guarantee that no other app server will be using the ports.
Each upgrade requires that you go in and reset the binding manager again. Most upgrades will upgrade the conf/jboss-service.xml file, which holds the Service Binding Manager configuration.
The configuration is much simpler. When defining new ports (either through the Service Binding Manager or by going in and changing all the ports in the configuration), it's always a headache trying to figure out which ports aren't taken already. If you use a NIC per JBoss instance, all you have to change is the IP address binding argument when executing run.sh or run.bat (-b); see the example after this list.
Once you get 3 or 4 applications using different ports, the chances really increase that you will step on another one of your applications' ports. It just gets more difficult to keep ports from conflicting.
JGroups will pick random ports within a cluster to communicate. Sometimes when clustering, if you are using the same IP address, two random ports may get picked in two different app servers (using the binding manager) that conflict. You can configure around this, but it's better not to run into this situation at all.
On the whole, having an individual IP address for each instance of an app server causes fewer problems (some of those problems are mentioned here, some aren't).
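As a simple illustration (the configuration names and addresses are made up), two instances on one box bound to their own IP addresses need nothing more than:

./run.sh -c node1 -b 192.168.0.11
./run.sh -c node2 -b 192.168.0.12

Here -c picks the server configuration directory and -b is the bind address mentioned above.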
I have posted this to ServerFault, but the Node.js community seems tiny there, so I'm hoping this bring more exposure.
I have a Node.js (0.4.9) application and am researching how to best deploy and maintain it. I want to run it in the cloud (EC2 or RackSpace) with high availability. The app should run on HTTPS. I'll worry about East/West/EU full-failover later.
I have done a lot of reading about keep-alive (Upstart, Forever), multi-core utilities (Fugue, multi-node, Cluster), and proxy/load balancers (node-http-proxy, nginx, Varnish, and Pound). However, I am unsure how to combine the various utilities available to me.
I have this setup in mind and need to iron out some questions and get feedback.
Cluster is the most actively developed and seemingly popular multi-core utility for Node.js, so use that to run 1 node "cluster" per app server on a non-privileged port (say 3000). Q1: Should Forever be used to keep the cluster alive, or is that just redundant?
Use 1 nginx per app server running on port 80, simply reverse proxying to node on port 3000. Q2: Would node-http-proxy be more suitable for this task even though it doesn't gzip or serve static files quickly?
Have a minimum of 2x servers as described above, with an independent server acting as a load balancer across these boxes. Use Pound listening on 443 to terminate HTTPS and pass HTTP to Varnish, which would round-robin load balance across the IPs of the servers above. Q3: Should nginx be used to do both instead? Q4: Should the AWS or RackSpace load balancer be considered instead (the latter doesn't terminate HTTPS)?
General Questions:
Do you see a need for (2) above at all?
Where is the best place to terminate HTTPS?
If WebSockets are needed in the future, what nginx substitutions would you make?
I'd really like to hear how people are setting up current production environments and which combination of tools they prefer. Much appreciated.
It's been several months since I asked this question, and there hasn't been much of an answer flow. Both Samyak Bhuta and nponeccop had good suggestions, but I wanted to discuss the answers I've found to my questions.
Here is what I've settled on at this point for a production system, but further improvements are always being made. I hope it helps anyone in a similar scenario.
Use Cluster to spawn as many child processes as you desire to handle incoming requests on multi-core virtual or physical machines. This binds to a single port and makes maintenance easier. My rule of thumb is n - 1 Cluster workers. You don't need Forever on this, as Cluster respawns worker processes that die. To have resiliency even at the Cluster parent level, ensure that you use an Upstart script (or equivalent) to daemonize the Node.js application, and use Monit (or equivalent) to watch the PID of the Cluster parent and respawn it if it dies. You can try using the respawn feature of Upstart, but I prefer having Monit watching things, so rather than split responsibilities, I find it's best to let Monit handle the respawn as well.
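For what it's worth, here is a minimal sketch of that layout using the cluster module that later became part of Node core (the question predates it, so treat the exact API as an assumption for older Node versions):

var cluster = require('cluster');
var http = require('http');
var os = require('os');

if (cluster.isMaster) {
  // rule of thumb from above: n - 1 workers
  var workers = Math.max(1, os.cpus().length - 1);
  for (var i = 0; i < workers; i++) cluster.fork();
  // respawn any worker that dies; Upstart/Monit watch this parent process
  cluster.on('exit', function () { cluster.fork(); });
} else {
  http.createServer(function (req, res) {
    res.end('hello\n');
  }).listen(3000); // all workers share the single port behind nginx
}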
Use 1 nginx per app server running on port 80, simply reverse proxying to your Cluster on whatever port you bound to in (1). node-http-proxy can be used, but nginx is more mature, more featureful, and faster at serving static files. Run nginx lean (don't log, don't gzip tiny files) to minimize its overhead.
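A lean server block for that looks roughly like this; the upstream port matches (1) above and the rest is an assumption, not a drop-in config:

server {
    listen 80;
    access_log off;                     # run lean, as described above
    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}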
Have a minimum of 2x servers as described above, in a minimum of 2 availability zones, and if in AWS, use an ELB that terminates HTTPS/SSL on port 443 and communicates over HTTP on port 80 with the node.js app servers. ELBs are simple and, if you desire, make it somewhat easier to auto-scale. You could run multiple nginx instances instead, either sharing an IP or round-robin balanced by your DNS provider, but I found this overkill for now. At that point, you'd remove the nginx instance on each app server.
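For reference, creating such a classic ELB from the command line looks roughly like this today (the aws CLI postdates this answer, and the name, zones, and certificate ARN are placeholders):

aws elb create-load-balancer --load-balancer-name app-elb \
  --listeners "Protocol=HTTPS,LoadBalancerPort=443,InstanceProtocol=HTTP,InstancePort=80,SSLCertificateId=arn:aws:iam::123456789012:server-certificate/app-cert" \
  --availability-zones us-east-1a us-east-1b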
I have not needed WebSockets so nginx continues to be suitable and I'll revisit this issue when WebSockets come into the picture.
Feedback is welcome.
You shouldn't worry about serving static files quickly. If your load is small, Node static file servers will do. If your load is big, it's better to use a CDN (Akamai, Limelight, CoralCDN).
Instead of forever you can use monit.
Instead of nginx you can use HAProxy. It is known to work well with websockets. Consider also proxying flash sockets as they are a good workaround until websocket support is ubiquitous (see socket.io).
HAProxy has some support for HTTPS load balancing, but not termination. You can try to use stunnel for HTTPS termination, but I think it's too slow.
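A rough sketch of the plain-HTTP side in HAProxy (backend addresses and names are placeholders); stunnel or another terminator would sit in front of it on 443:

# haproxy.cfg fragment
frontend http_in
    bind *:80
    mode http
    default_backend node_apps

backend node_apps
    mode http
    balance roundrobin
    server app1 10.0.0.11:3000 check
    server app2 10.0.0.12:3000 check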
Round-robin load (or other statistical) balancing works pretty well in practice, so there's no need to know about other servers' load in most cases.
Consider also using ZeroMQ or RabbitMQ for communications between nodes.
This is an excellent thread! Thanks to everyone that contributed useful information.
I've been dealing with the same issues the past few months setting up the infrastructure for our startup.
As people mentioned previously, we wanted a Node environment with multi-core support + web sockets + vhosts
We ended up creating a hybrid between the native cluster module and http-proxy and called it Drone - of course it's open sourced:
https://github.com/makesites/drone
We also released it as an AMI with Monit and Nginx
https://aws.amazon.com/amis/drone-server
I found this thread while researching how to add SSL support to Drone. Thanks for recommending ELB, but I wouldn't rely on a proprietary solution for something so crucial.
Instead, I extended the default proxy to handle all the SSL requests. The configuration is minimal, and the SSL requests are converted to plain HTTP, but I guess that's preferable when you're passing traffic between ports...
Feel free to look into it and let me know if it fits your needs. All feedback welcomed.
I have seen an AWS load balancer for load balancing and termination, plus node-http-proxy as a reverse proxy (if you want to run multiple services per box), plus cluster.js for multi-core support and process-level failover, doing extremely well.
forever.js on top of cluster.js could be a good option if you want to take extreme care in terms of failover, but that's hardly needed.
I'm looking into WCF/MSMQ.
Does anyone know how one handles redundancy with MSMQ? It is my understanding that the queue sits on the server, but what if the server goes down and is not recoverable? How does one prevent the messages from being lost?
Any good articles on this topic?
There is a good article on using MSMQ in the enterprise here.
Tip 8 is the one you should read.
"Using Microsoft's Windows Clustering tool, queues will failover from one machine to another if one of the queue server machines stops functioning normally. The failover process moves the queue and its contents from the failed machine to the backup machine. Microsoft's clustering works, but in my experience, it is difficult to configure correctly and malfunctions often. In addition, to run Microsoft's Cluster Server you must also run Windows Server Enterprise Edition—a costly operating system to license. Together, these problems warrant searching for a replacement.
One alternative to using Microsoft's Cluster Server is to use a third-party IP load-balancing solution, of which several are commercially available. These devices attach to your network like a standard network switch, and once configured, load balance IP sessions among the configured devices. To load-balance MSMQ, you simply need to setup a virtual IP address on the load-balancing device and configure it to load balance port 1801. To connect to an MSMQ queue, sending applications specify the virtual IP address hosted by the load-balancing device, which then distributes the load efficiently across the configured machines hosting the receiving applications. Not only does this increase the capacity of the messages you can process (by letting you just add more machines to the server farm) but it also protects you from downtime events caused by failed servers.
To use a hardware load balancer, you need to create identical queues on each of the servers configured to be used in load balancing, letting the load balancer connect the sending application to any one of the machines in the group. To add an additional layer of robustness, you can also configure all of the receiving applications to monitor the queues of all the other machines in the group, which helps prevent problems when one or more machines is unavailable. The cost for such queue-monitoring on remote machines is high (it's almost always more efficient to read messages from a local queue) but the additional level of availability may be worth the cost."
Not to be snide, but you kind of answered your own question. If the server is unrecoverable, then you can't recover the messages.
That being said, you might want to back up the message folder regularly. This TechNet article will tell you how to do it:
http://technet.microsoft.com/en-us/library/cc773213.aspx
Also, it will not back up express messages, so that is something you have to be aware of.
If you prefer, you might want to store the actual messages for processing in a database upon receipt, and have the service be the consumer in a producer/consumer pattern.