Redeploy/Failover for Glassfish cluster on EC2?

I have a Tapestry application (WAR, no EJB) that ...
... I want to deploy on 2 EC2 small instances (for failover).
... uses Spring Security
... is stateful (very small session state)
... should be deployed on Glassfish 3.1 (seems to have best cluster support?)
... and has an elastic load balancer with sticky session in front of it
How can I configure a cluster to achieve minimal ('no') interruptions for the user experience in case A) a node fails and B) I deploy a new version?

Everything is explained here: http://download.oracle.com/docs/cd/E18930_01/html/821-2426/docinfo.html#scrolltoc
Basically, you set up a DAS (= master), which controls nodes with instances on them. You could do all of this on the same machine (1 DAS, 1 node with multiple instances), although it would be a good idea to have at least two machines.
You should then have at least one load balancer in front (Apache, a hardware load balancer, whatever).
A) If a node fails, the load balancer can redirect all traffic to the other node.
B) For deploying a new version (a hedged asadmin sketch follows this list):
deploy the application, disabled, with the new version (see "application versioning")
mark server A as unavailable in the load balancer
enable the new version on server A
mark server A as available and server B as unavailable
enable the new version on server B
mark server B as available
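As a hedged sketch of the versioned-deploy step (the application name and version identifiers here are hypothetical), GlassFish 3.1 application versioning uses the name:version syntax, and enabling one version automatically disables the previously enabled one on the same target:

    asadmin deploy --name myapp:1.1 --enabled=false myapp.war
    # ... take the server out of the load balancer, then switch it over ...
    asadmin enable myapp:1.1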

Related

How to properly deploy with Akka Cluster

Scenario
Right now we only have a single node running the whole system. What we want is to make a distinction between "frontend" nodes and a single "backend" node.
"Frontend" nodes (N nodes): maintain the persistent WebSocket connections with the clients.
"Backend" node (1 node): processes all the requests coming from the frontend nodes, querying the database and handling the needed domain logic.
This distinction is needed for a couple of reasons:
Do not reach the limit of 70-100k persistent connections per frontend node
Avoid disconnecting the clients while deploying changes only affecting the backend
Work done
We have connected the actors living on the frontend nodes with the ones living on the backend. On the frontend side we obtain ActorRefs to the backend actors through akka.cluster.singleton.ClusterSingletonProxy, while the actors themselves are actually instantiated on the backend under a ClusterSingletonManager.
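For reference, a minimal sketch of that wiring, assuming Akka 2.4+, a cluster role named "backend", and a placeholder BackendActor (the names here are illustrative, not taken from the question):

    import akka.actor.{Actor, ActorSystem, PoisonPill, Props}
    import akka.cluster.singleton.{ClusterSingletonManager, ClusterSingletonManagerSettings, ClusterSingletonProxy, ClusterSingletonProxySettings}

    // Placeholder for the real backend actor that serves the frontend requests.
    class BackendActor extends Actor {
      def receive = { case msg => sender() ! msg }
    }

    object BackendWiring {
      // On the backend node: the actual actor runs as a cluster singleton,
      // supervised by the ClusterSingletonManager.
      def startSingleton(system: ActorSystem): Unit = {
        system.actorOf(
          ClusterSingletonManager.props(
            singletonProps = Props[BackendActor],
            terminationMessage = PoisonPill,
            settings = ClusterSingletonManagerSettings(system).withRole("backend")),
          name = "backend")
      }

      // On the frontend nodes: a proxy ActorRef that always resolves to the
      // current singleton instance, wherever it is running in the cluster.
      def backendProxy(system: ActorSystem) =
        system.actorOf(
          ClusterSingletonProxy.props(
            singletonManagerPath = "/user/backend",
            settings = ClusterSingletonProxySettings(system).withRole("backend")),
          name = "backendProxy")
    }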
Question
How do we do the deploy, taking into account the Akka Cluster node downing notifications?
As far as I understood from the Akka Cluster documentation about downing, and some comments on the Akka mailing list, the recommended approach for dealing with that process would be something like:
Download the akka distribution from http://akka.io/downloads/
Copy the akka-cluster bash script together with jmxsh-R5.jar into a resources/bin/ folder (for instance)
Include that folder in the distributed package (I've added the following lines to build.sbt):
    mappings in Universal ++=
      (baseDirectory.value / "resources" / "bin" * "*").get map
        (bin => bin -> ("bin/" + bin.getName))
While deploying, manually mark the node to be deployed as down by calling the bash script, like so:
Execute bin/akka-cluster %node_to_be_deployed:port% down
Deploy the new code version
Execute bin/akka-cluster %deployed_node:port% join
Doubts:
1. Is this step-by-step procedure correct?
2. If the node to be deployed will have the very same IP and port after the deploy, is the down and join still needed?
3. We're planning to set one frontend node and the backend node as the seed nodes. This way, the whole cluster could be reconstructed when deploying only the frontend nodes, or only the backend one. Is that correct?
Thanks!
To avoid downing manually, clean up when a node is terminated; see:
http://doc.akka.io/docs/akka/current/scala/cluster-usage.html#How_To_Cleanup_when_Member_is_Removed
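A minimal sketch of that kind of cleanup, loosely following the linked docs section (the 30-second timeout and the shutdown-hook wiring are assumptions, not prescriptions):

    import scala.concurrent.Await
    import scala.concurrent.duration._

    import akka.actor.ActorSystem
    import akka.cluster.Cluster

    object GracefulLeave {
      def install(system: ActorSystem): Unit = {
        val cluster = Cluster(system)

        // On JVM shutdown (e.g. the deploy script stopping the service),
        // leave the cluster gracefully instead of just disappearing.
        sys.addShutdownHook {
          cluster.leave(cluster.selfAddress)
          // Give the Leaving -> Exiting -> Removed handoff time to complete
          // before the JVM actually exits.
          Await.ready(system.whenTerminated, 30.seconds)
        }

        // Once the other members have marked this node as removed,
        // terminate the actor system so the shutdown hook above can finish.
        cluster.registerOnMemberRemoved {
          system.terminate()
        }
      }
    }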
Regarding your points:
1. You do not need this procedure when the JVM is restarted and the cleanup code is executed. Only when the cleanup code somehow fails do you need to down the node manually, as described in the procedure.
2. When the node is marked as removed by the other nodes (after the cleanup code has been executed), the same IP and port combination can be used to re-join the cluster.
3. Yes, you can just re-deploy a frontend node (see the seed-node sketch below).
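For point 3, a sketch of how the two seed nodes might be joined programmatically (the system name, hosts, and port are hypothetical; the same thing can be expressed with the akka.cluster.seed-nodes configuration setting):

    import akka.actor.{ActorSystem, AddressFromURIString}
    import akka.cluster.Cluster

    object JoinCluster extends App {
      val system = ActorSystem("ClusterSystem")

      // One frontend node and the backend node act as seed nodes, so the
      // cluster can re-form after redeploying either side.
      Cluster(system).joinSeedNodes(List(
        AddressFromURIString("akka.tcp://ClusterSystem@frontend-1:2551"),
        AddressFromURIString("akka.tcp://ClusterSystem@backend-1:2551")))
    }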
PS.:
- Coordinated shutdown will be improved in Akka 2.5, see:
https://github.com/akka/akka-meta/issues/38
- If you want to manage your cluster using an HTTP API, see: http://developer.lightbend.com/docs/akka-cluster-management/current/

Load balancing in JBoss with mod_cluster

Got a general question about the load balancing setup in JBoss (7.1.1.Final). I'm trying to set up a clustered JBoss instance with a master and a slave node, and I'm using the demo app here (https://docs.jboss.org/author/display/AS72/AS7+Cluster+Howto) to prove the load balancing/session replication. I've basically followed through to just before the 'cluster configuration' section.
I've got the app deployed to the master and slave nodes, and if I hit their individual IPs directly I can access the application fine. According to the JBoss logs and admin console, the slave has successfully connected to the master. However, if I put something in the session on the slave and then take the slave offline, the master cannot read the item that the slave put in the session.
This is where I need some help with the general setup. Do I have to have a separate Apache httpd instance sitting in front of JBoss to do the load balancing? I thought there was a load balancing capability built into JBoss that wouldn't need the separate server, or am I just completely wrong? If I don't need Apache, could you please point me in the direction of instructions for setting up the JBoss load balancing?
Thanks.
Yes, you need Apache or some other software or hardware that can load balance the HTTP requests; JBoss Application Server does not provide this functionality.
For session replication to work properly, you should check that both the server configuration and the application configuration are well defined.
The server must have the cache for session replication enabled (you can use the standalone-ha.xml or standalone-full-ha.xml file as the initial configuration).
The application is configured to replicate the HTTP session by adding the <distributable/> element to its web.xml.
You can see a full example at http://blog.akquinet.de/2012/06/21/clustering-in-jboss-as7eap-6/

Prevent deployment to entry node, only deploy to other nodes

I have a free OpenShift account with the default 3 gears. On this I have installed the WildFly 8.1 image using the OpenShift web console, and I set the minimum and maximum scaling to 3.
What happens now is that OpenShift will create 3 JBoss WildFly instances:
One on the entry node (which is also running HAProxy)
One on an auxiliary node
One on another auxiliary node
The weird thing is that the JBoss WildFly instance on the entry node is by default disabled in the load balancer config (haproxy.conf). BUT, OpenShift still deploys the war archive to it whenever I commit to the associated git repo.
What's extra problematic here is that, because of the incredibly low limit on max user processes (250 via ulimit -u), this JBoss WildFly instance on the entry node cannot even start up. During startup JBoss WildFly throws random 'java.lang.OutOfMemoryError: unable to create new native thread' errors (and no, memory is fine; it's the OS process limit).
As a result, the deployment process will hang.
So to summarize:
A JBoss WildFly instance is created on the entry node, but disabled in the load balancer
JBoss WildFly in its default configuration cannot start up on the entry node, not even with a trivial war.
The deployer process attempts to deploy to JBoss WildFly on the entry node, despite it being disabled in the load balancer
Now my question:
How can I modify the deployer process (including the gear start command) to not attempt to deploy to the JBoss WildFly instance on the entry node?
When an app scales from 2 gears to 3, HAProxy stops routing traffic to your application on the head gear and routes it to the two other gears. This ensures that HAProxy gets as much CPU as possible, since the application on your head gear (where HAProxy is running) is no longer serving requests.
The out-of-memory message you're seeing might not be an actual out-of-memory issue but a bug related to ulimit: https://bugzilla.redhat.com/show_bug.cgi?id=1090092

How to make a RESTful service truly highly available with a hardware load balancer

When we have a cluster of machines behind a load balancer (LB), hardware load balancers generally have persistent connections.
Now, when we need to deploy an update to all machines (a rolling update), the way to do it is to take one machine out of rotation and watch until no more requests are being sent to that server via the LB. Once the app reaches the no-request state, it is updated manually.
With 70-80 servers in the picture this becomes very painful.
Does someone have a better way of doing it?
70-80 servers is a very horizontally scaled implementation... good job! "Better" is a very relative term; hopefully one of these suggestions counts as "better".
1. Implement an intelligent health check for the application, with the ability to adjust the health check while the application is running (see the sketch after this list). What we do is have the health check start failing while the application is still running just fine. This allows the load balancer to automatically take the system out of rotation. Our stop scripts query the load balancer to make sure that the node is out of rotation and then shut down normally, which allows the existing connections to drain.
2. Batch multiple groups of systems together. I am assuming that you have 70 servers to handle peak load, which means you should be able to restart several at a time. A standard way to do this is to implement a simple token-granting service with a maximum of 10 tokens. Have your shutdown scripts check out a token before continuing.
3. Another way to do this is with blue/green deploys. That means you have an entire second server farm, and once the second farm is updated you switch the load balancer to point to the new farm.
4. This is an alternative to option 3. Install both versions of the app on the same servers and have an internal proxy service (like HAProxy) switch connections between the deployed versions. For example:
haproxy listening on 8080
app version 0.1 listening on 9001
app version 0.2 listening on 9002
Once you are happy with the deployment of app version 0.2, switch HAProxy to send traffic to 9002. When you release version 0.3, switch load balancing back to 9001, and so on.
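Coming back to suggestion 1, here is a minimal sketch of an adjustable health check, written in Scala with akka-http purely for illustration (the endpoint paths, port, and toggle mechanism are assumptions, not part of the original answer):

    import java.util.concurrent.atomic.AtomicBoolean

    import akka.actor.ActorSystem
    import akka.http.scaladsl.Http
    import akka.http.scaladsl.model.StatusCodes
    import akka.http.scaladsl.server.Directives._
    import akka.stream.ActorMaterializer

    object HealthCheckServer extends App {
      implicit val system = ActorSystem("app")
      implicit val materializer = ActorMaterializer()

      // Flipped to false by the stop script before the real shutdown, so the
      // load balancer takes this node out of rotation on its own.
      val inRotation = new AtomicBoolean(true)

      val routes =
        path("health") {
          get {
            if (inRotation.get) complete(StatusCodes.OK)
            else complete(StatusCodes.ServiceUnavailable)
          }
        } ~
        path("admin" / "fail-health-check") {
          post {
            inRotation.set(false)
            complete(StatusCodes.OK)
          }
        }

      Http().bindAndHandle(routes, "0.0.0.0", 8080)
    }

The stop script would POST to /admin/fail-health-check, wait for the load balancer to report the node out of rotation, and only then stop the process.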

Deploying different EAR files to different clusters in the same WebLogic Server domain

Hi, I am new to this forum as well as to WebLogic Server. My requirement is that I have an application that runs on a cluster with an admin server and three managed servers: MS1, MS2, MS3. Currently my application has two parts (or pieces of logic), both of which are in a single EAR file. Part 1 always occupies one server, say MS1, and the rest runs on the other two, MS2 and MS3. I want to split my code into two different EARs, part1 and part2, with part1_ear deployed to MS1 and part2_ear deployed to MS2 and MS3, all running under the same admin server:
ear1 deployed to -----> MS1
ear2 deployed to -----> MS2 & MS3
All running under the same admin server.
Can this be done? If not, other suggestions are also welcome, but I can only have one admin server and 3 clusters.
Yes, when you deploy your .ear files you can target specific machines in a cluster. Your deployments don't have to go to all machines in a cluster.
Also, if you really only want one server in the cluster to handle some specific event you might want to look into Singleton Services.
Have you had experience deploying applications in WebLogic before?