Thoughts on How to Hotdeploy using JBoss 5

I am trying to see if this is possible. I will first give you some background on how the application currently runs.
The application is deployed to 4 separate nodes (using the 'all' config): 2 nodes on ServerA and 2 on ServerB, named node1, node2, node3 and node4.
The application sits behind a web server running Apache with mod_jk to route traffic.
Assume that version 1.0.0 is currently deployed.
I will be trying to deploy 1.0.1 which will only have a minor change.
The goal will be to take down node4, deploy version 1.0.1 to node4 (while node1-node3 are still up and running).
All nodes will share the same database, which in theory should be fine as long as our code doesn't require any database changes.
The next step would be to direct traffic using Apache + mod_jk to load balance only node1-node3; node4 will be accessed directly.
node4 will be tested running version 1.0.1.
Apache + mod_jk will then be changed to send all traffic to node4.
Version 1.0.1 will be deployed to node1-node3.
All nodes should now be running version 1.0.1.
I know this is extremely high level and I am already facing problems (not to mention application-specific problems).
I just want to know what other ways there are of approaching this, or what JBoss-specific problems I might run into.
Should I be putting the hotdeploy node in a different cluster and have the rest join later?
Any suggestions would help. Thanks.

You can take advantage of having Apache with mod_jk in front. Imagine your configuration contains something like:
JkMount /myapp/* workerApp
JkWorkersFile /etc/httpd/conf/workerApp.properties
Instead of having a single file named workerApp.properties, use these 3 files:
workerApp-deploy1.properties: will contain the configuration to connect only to node4
workerApp-deploy2.properties: will contain the configuration to connect only to nodes 1, 2 and 3
workerApp-normal.properties: this will be your actual (full) workers file
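As a rough sketch (the worker names, hosts and ports here are hypothetical; adjust them to your own AJP connector setup), the three files could differ only in the balance_workers line:

# workerApp-normal.properties -- balance across all four nodes
worker.list=workerApp
worker.workerApp.type=lb
worker.workerApp.balance_workers=node1,node2,node3,node4

worker.node1.type=ajp13
worker.node1.host=serverA
worker.node1.port=8009
# ...node2, node3 and node4 defined the same way with their own hosts/ports

# workerApp-deploy1.properties -- identical, except:
# worker.workerApp.balance_workers=node4

# workerApp-deploy2.properties -- identical, except:
# worker.workerApp.balance_workers=node1,node2,node3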
Now workerApp.properties, instead of being a regular file, is a symlink, so under normal circumstances:
ln -s workerApp-normal.properties workerApp.properties
When you deploy a new version
rm -f workerApp.properties
ln -s workerApp-deploy2.properties workerApp.properties
reload apache
Now you can deploy the new version on node4, and all requests will be routed through nodes 1, 2 and 3. When the deployment on node4 is ready:
rm -f workerApp.properties
ln -s workerApp-deploy1.properties workerApp.properties
reload apache
In this situation all clients will be routed to node4 and you can upgrade the versions on the other nodes. When you're done:
rm -f workerApp.properties
ln -s workerApp-normal.properties workerApp.properties
reload apache
And all requests are balanced between the servers again.
This has another advantage: you can define a VirtualHost like preflighttest.yourcompany.com using a different set of workers, so you can test your new version on node4 before effectively rolling it out to production.
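A minimal sketch of such a preflight VirtualHost (the worker name preflightWorker is hypothetical and would be defined in the workers file to target only node4):

# Hypothetical vhost so testers can reach node4 directly before the switch
<VirtualHost *:80>
    ServerName preflighttest.yourcompany.com
    JkMount /myapp/* preflightWorker
</VirtualHost>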
Hope it helps.

Related

How can I fix ceph commands hanging after a reboot?

I'm pretty new to Ceph, so I've included all the steps I used to set up my cluster, since I'm not sure what is or isn't useful information for fixing my problem.
I have 4 CentOS 8 VMs in VirtualBox set up to teach myself how to bring up Ceph. 1 is a client and 3 are Ceph monitors. Each Ceph node has six 8 GB drives. Once I learned how the networking worked, it was pretty easy.
I set each VM to have a NAT (for downloading packages) and an internal network that I called "ceph-public". This network would be accessed by each VM on the 10.19.10.0/24 subnet. I then copied the ssh keys from each VM to every other VM.
I followed this documentation to install cephadm, bootstrap my first monitor, and add the other two nodes as hosts. Then I added all available devices as OSDs, created my pools, created my images, and copied my /etc/ceph folder from the bootstrapped node to my client node. On the client, I ran rbd map mypool/myimage to map the image as a block device, used mkfs to create a filesystem on it, and was able to write data and see the I/O from the bootstrapped node. All was well.
Then, as a test, I shut down and restarted the bootstrapped node. When it came back up, I ran ceph status but it just hung with no output. Every single ceph and rbd command now hangs, and I have no idea how to recover or properly reset or fix my cluster.
Has anyone ever had the ceph command hang on their cluster, and what did you do to solve it?
Let me share a similar experience. Some time ago I also tried to perform some tests on Ceph (Mimic, I think) and my VMs in VirtualBox acted very strangely, nothing comparable with actual bare-metal servers, so please bear this in mind... the tests are not quite relevant.
As regarding your problem, try to see the following:
have at least 3 monitors (an odd number). It's possible the hang is caused by the monitor election; a quick connectivity checklist is sketched below.
make sure the networking part is OK (separate VLANs for Ceph servers and clients)
DNS is resolving OK (you have added the server names to /etc/hosts)
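As a rough starting point (a sketch only; unit names, ports and sudo usage may differ on your setup), check whether the rebooted monitor is actually up and reachable before digging deeper:

# On the rebooted node: are the cephadm-deployed daemons (mon, mgr, osd) running?
sudo systemctl list-units 'ceph*'

# Are the monitor ports listening? (3300 = msgr2, 6789 = legacy msgr1)
ss -tlnp | grep -E '3300|6789'

# Give ceph a timeout instead of letting it hang forever
sudo ceph status --connect-timeout 10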
...just my 2 cents...

Offline Hyperledger Fabric setup on Red Hat 7.0

Is there any way we can get the Hyperledger Fabric binaries, or build them from the source code, as our machines are behind firewalls? I am not able to run
curl -sSL goo.gl/byy2Qj | bash -s 1.0.5
which uses the following commands:
docker pull hyperledger/fabric-$IMAGES:$FABRIC_TAG
docker tag hyperledger/fabric-$IMAGES:$FABRIC_TAG hyperledger/fabric-$IMAGES
Docker Hub is blocked and external images are not allowed to be downloaded.
I believe this is an issue for most enterprises whose systems sit behind firewalls and have restricted access to Docker as well.
Download the binaries for the orderer and peers (and configtx, etc.) directly, using this line from the script at goo.gl/byy2Qj. Browse the repository manually to find your flavor and release.
echo "===> Downloading platform binaries"
curl https://nexus.hyperledger.org/content/repositories/releases/org/hyperledger/fabric/hyperledger-fabric/${ARCH}-${VERSION}/hyperledger-fabric-${ARCH}-${VERSION}.tar.gz | tar xz
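For example, with concrete values substituted (linux-amd64 and the 1.0.5 release mentioned in the question are assumptions; browse the Nexus repository to confirm the values for your platform):

# Hypothetical concrete values; adjust ARCH and VERSION to your platform/release
ARCH=linux-amd64
VERSION=1.0.5
curl https://nexus.hyperledger.org/content/repositories/releases/org/hyperledger/fabric/hyperledger-fabric/${ARCH}-${VERSION}/hyperledger-fabric-${ARCH}-${VERSION}.tar.gz | tar xz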
You may still have to clone and install the CA server, and install CouchDB, and Postgres, and Kafka and Zookeeper, etc., depending on how you want to set things up.
And you can always clone the main Fabric repo and make the binaries yourself.
You can then run them without Docker (note: the chaincode container needs Docker available, but no images) or modify the Docker scripts and create your own containers.
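If you do need the Docker images inside the firewall, one common workaround (a sketch, assuming you have some machine with internet access and a way to transfer files inside) is to export and re-import the images manually:

# On a machine with internet access (the x86_64-1.0.5 tag is an assumption; match your release):
docker pull hyperledger/fabric-peer:x86_64-1.0.5
docker save -o fabric-peer-1.0.5.tar hyperledger/fabric-peer:x86_64-1.0.5

# Transfer the tar file behind the firewall, then:
docker load -i fabric-peer-1.0.5.tar
docker tag hyperledger/fabric-peer:x86_64-1.0.5 hyperledger/fabric-peer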
This page in the docs gives some good clues if you want to build them yourself. You really only need to make peer and orderer, but you can do make dist-clean all. Making all can take 45 minutes to an hour. You don't have to make or run any of the tests. And don't use Vagrant.
https://hyperledger-fabric.readthedocs.io/en/release/dev-setup/build.html

How to properly deploy with Akka Cluster

Scenario
Right now we only have a single node running the whole system. What we want is to make a distinction between "frontend" nodes, and a single "backend" node.
"Frontend" nodes (N nodes): Maintains a persistent connection with the clients through a WebSocket connection
"Backend" node (1 node): Processes all the requests coming form all the frontend nodes querying to database, and handling the needed domain logic.
This distinction is needed due to some reasons:
Avoid reaching the limit of 70-100k persistent connections per frontend node
Avoid disconnecting the clients while deploying changes only affecting the backend
Work done
We have connected the actors living on the frontend node with the ones living on the backend. We've done so by obtaining the backend ActorRefs from the frontend through akka.cluster.singleton.ClusterSingletonProxy, while the actual instances are created on the backend via ClusterSingletonManager.
Question
How do we deploy, taking into account the Akka cluster node downing notification?
As far as I understood from the Akka Cluster documentation about downing, and some comments on the Akka mailing list, the recommended approach for dealing with that process would be something like:
Download the akka distribution from http://akka.io/downloads/
Copy the akka-cluster bash script together with jmxsh-R5.jar into a resources/bin/ folder (for instance)
Include that folder in the distributed package (I've added the following lines to build.sbt):
mappings in Universal ++=
(baseDirectory.value / "resources" / "bin" * "*" get) map
(bin => bin -> ("bin/" + bin.getName))
While deploying, manually mark the node being deployed as down by calling the bash script:
Execute bin/akka-cluster %node_to_be_deployed:port% down
Deploy the new code version
Execute bin/akka-cluster %deployed_node:port% join
Doubts:
Is this step by step procedure correct?
If the node being deployed will have the very same IP and port after the deploy, is the down and join still needed?
We're planning to set one frontend node and the backend node as the seed nodes. This way, the whole cluster could be reconstructed when deploying only the frontend nodes, or only the backend one. Is that correct?
Thanks!
To avoid downing manually, clean up when a node is terminated; see:
http://doc.akka.io/docs/akka/current/scala/cluster-usage.html#How_To_Cleanup_when_Member_is_Removed
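A minimal sketch of that cleanup hook (the system name is a placeholder; this mirrors the pattern from the linked docs rather than your exact code) could be:

import akka.actor.ActorSystem
import akka.cluster.Cluster

val system = ActorSystem("ClusterSystem")
val cluster = Cluster(system)

// On JVM shutdown (e.g. SIGTERM during a deploy), leave the cluster gracefully
sys.addShutdownHook {
  cluster.leave(cluster.selfAddress)
}

// Once the other members have marked this node as removed, stop the ActorSystem
cluster.registerOnMemberRemoved {
  system.terminate()
}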
Regarding your points:
You do not need this procedure when the JVM is restarted and the cleanup code is executed. Only when the cleanup code somehow fails do you need to down the node manually as described in the procedure.
When the node is marked as removed by the other nodes (after the cleanup code is executed), the same IP and port combination can be used to re-join the cluster.
Yes, you can just re-deploy a frontend node.
P.S.:
- Coordinated shutdown will be improved in Akka 2.5, see:
https://github.com/akka/akka-meta/issues/38
- If you want to manage your cluster using an HTTP API, see: http://developer.lightbend.com/docs/akka-cluster-management/current/

Mesos cluster does not recover when physical hosts restart

I'm using Mesosphere on 3 hosts running Ubuntu 14.04 as follows:
one with the Mesos master
two with Mesos slaves
Everything works fine, but after restarting all the physical hosts, all scheduled jobs were lost. Is that normal? I expected that ZooKeeper would store the current jobs, so that when the system restarts, all jobs would be rescheduled after the master boots.
Update:
I'm using Marathon and Mesos on the same node, and I'm running Marathon with the --zk flag.
With marathon's --zk and --ha enabled, Marathon should be storing its state in ZK and recovering it on restart, as long as Mesos allows it to reregister with the same framework ID.
However, you'll also need to enable the Mesos registry (even for a single master), to ensure that Mesos persists information about what frameworkIds are registered in the event of master failover. This can be accomplished by setting the --registry=replicated_log (default), --quorum=1 (since you only have 1 master), and --work_dir=/path/to/registry (where to store the state).
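As a sketch (the ZooKeeper address and paths are placeholders for your own setup), the relevant flags would look something like:

# Mesos master: persist the registry so framework IDs survive a restart
mesos-master \
  --zk=zk://10.0.0.1:2181/mesos \
  --quorum=1 \
  --registry=replicated_log \
  --work_dir=/var/lib/mesos

# Marathon: store its own state in ZooKeeper and run in HA mode
marathon \
  --master zk://10.0.0.1:2181/mesos \
  --zk zk://10.0.0.1:2181/marathon \
  --ha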
I solved the problem by following these installation instructions: How To Configure a Production-Ready Mesosphere Cluster on Ubuntu 14.04.
Although you found a solution, I'd like to explain this issue a bit more. :)
From the official docs: http://mesos.apache.org/documentation/latest/slave-recovery/
Note that if the operating system on the slave is rebooted, all
executors and tasks running on the host are killed and are not
automatically restarted when the host comes back up.
So all frameworks on Mesos will be killed after a reboot. One way to restart the frameworks is to run them all on Marathon, which will manage the other frameworks and restart them as needed.
However, you then need to auto-restart Marathon when it's killed. In the DigitalOcean link you mentioned, Marathon is installed with an init.d script, so it can be restarted after a reboot. Otherwise, if you installed Marathon from source, you can use tools like supervisord to monitor it.
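A minimal supervisord entry for that (the Marathon start script path and ZooKeeper address are placeholders) might look like:

; /etc/supervisord.d/marathon.ini -- keep Marathon running across crashes
[program:marathon]
command=/opt/marathon/bin/start --master zk://10.0.0.1:2181/mesos --zk zk://10.0.0.1:2181/marathon --ha
autostart=true
autorestart=true
stdout_logfile=/var/log/marathon.out.log
stderr_logfile=/var/log/marathon.err.log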

Deployment in an IBM WebSphere 7 cluster of nodes with high availability

Environment:
Java EE web app
JDK: 1.6
AS: WebSphere Application Server 7
OS: Red Hat zLinux
I am not a WebSphere admin, and I have been asked to develop a way or a script to solve the issue below:
I have a cluster with three nodes: NodeA, NodeB and NodeC. My application runs on this cluster. I want to deploy my application to these nodes such that I don't need to bring all of them down at once. Currently the deployment is done this way: we come in at night and stop all the servers at once from the console. Then we install the application on the main node, which is on the same machine as the deployment manager, and then we synchronize and bring all the servers back up one by one.
What I am asked to do is upgrade the application or install the new EAR file without bringing everything down, as this causes downtime for the application. Is there a way to achieve this? WAS 7 is a very mature product, so I am sure there must be a way to do it.
I looked at the documentation/tutorials: we can do something like "Update", where we select the application (from Applications > WebSphere enterprise applications), select Update, then select the "Replace Entire Application" and "Local file system" radio buttons and point to the new EAR file. But in that case the doc says it will bring down all the servers as well when updating. It's the same as before: no online deployment.
I am a Java programmer, so I thought of using whatever tools I have to solve this.
Tell me if this could be an issue:
1) We bring down NodeA.
2) We remove NodeA from the cluster (by pressing the remove node button or using removeNode.sh).
3) Install the new EAR on NodeA (can we do this from the same admin console, or through a shell script or Jython, or maybe as a standalone server?).
4) We then start it back up and add it back to the cluster.
Now we have NodeA with the new application while NodeB and NodeC have the old application version.
Then we bring down NodeB,
remove NodeB from the cluster,
install the application on NodeB,
start it up again,
and add it back to the cluster.
Now we have two nodes with the new application and NodeC with the old one.
Then we try the same process for NodeC.
Will this work? Has anyone tried this? What issues can you think of that might happen?
I would really appreciate any feedback. I am sure there are experienced people on this forum. I don't think this is a rare issue; I believe this is something any organization with high-availability requirements would want.
Thanks for any help in advance.
Syed...
This is a possible duplicate of How can i do zero down time deployment on cluster environment?. Here is essentially my answer from that question:
After updating the application, you can utilize the "Rollout Update" feature. Rather than saving and synchronizing the nodes after updating, you can use this feature which automatically performs the following tasks to enable the changes to propagate to all deployment targets while maintaining high availability (assuming you have a horizontal cluster, such that cluster members exist on multiple nodes, which it sounds like you do):
Save session changes to the master configuration
For each node in the cluster (one at a time, to enable continuous availability):
Stop the cluster members on the node
Synchronize the node
Start the application servers (which automatically starts the application)
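If you would rather script the update step than click through the console, a rough wsadmin (Jython) sketch (the application name and EAR path are placeholders; verify the options against the WAS 7 AdminApp documentation) could be:

# Run against the deployment manager, e.g. wsadmin.sh -lang jython -f update_app.py
# Replace the whole application with the new EAR
AdminApp.update('MyApp', 'app',
    '[-operation update -contents /tmp/MyApp-1.0.1.ear]')

# Persist the change to the master configuration; the rollout / node
# synchronization (e.g. the console's "Rollout Update") then follows as described above
AdminConfig.save()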
Alternatively, you can follow this procedure:
1. Stop all node agents except Node A's.
2. Comment out or disable Node A in the load balancer or plugin (so traffic will not reach the node).
3. Deploy the application.
4. Changes will be synchronized only on Node A, as its node agent is up.
5. Uncomment/enable Node A in the plugin / load balancer.
6. Comment out/disable Node B in the plugin / load balancer to stop incoming traffic to the node.
7. Start the node agent of Node B so it will synchronize the file changes on the node. The EAR application will stop and restart after synchronization.
8. Uncomment/enable Node B in the plugin / load balancer.
9. Repeat steps 6, 7 and 8 for all the remaining nodes.
Regards,
Laique Ahmed