I am going through guide here:
https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-cluster-creation-for-windows-server
Section "Step 1B: Create a multi-machine cluster".
I have installed Cluster on one box and trying to use the same json (as per instructions) and trying to install it on another box so that i can have Cluster running on 2 VMs.
I am now getting this error when I run by TestConfig.ps1:
Previous Fabric installation detected on machine XXX. Please clean the machine.
Previous Fabric installation detected on machine XXX. Please clean the machine.
Data Root node Dev Box1 exists on machine XXX in \XXX\C$\ProgramData\SF\Dev Box1. This is an artifact from a previous installation - please delete the directory corresponding to this node.
First, take a look on this link. These are the requirements for each cluster node that are need to be met if you want to create the cluster.
The error is pretty obvious. You most likely have already SF installed on the machine. So either you have SF runtime or some uncleaned cluster data there.
Your first try should be running CleanFabric powershell script from the SF standalone package on each node. It should clean all SF data (cluster, runtime, registry etc.). Try this and then run TestConfiguration script once again. If this does not help, you would have to go to each node and manually delete any SF data that TestConfiguration script is complaining about.
Related
I am currently configuring a virtual machine to work as an agent within Azure (with Ubuntu as image). In which the additional configuration is running through a cloud init file.
In which, among others, I have the below 'fix' within bootcmd and multiple steps within runcmd.
However the machine already gives the state running within the azure portal, while still running the cloud configuration phase (cloud_config_modules). This has as a result pipelines see the machine as ready for usage while not everything is installed/configured yet and breaks.
I tried a couple of things which did not result in the desired effect. After which I stumbled on the following article/bug;
The proposed solution worked, however I switched to a rhel image and it stopped working.
I noticed this image is not using walinuxagent as the solution states but waagent, so I tried to replacing that like the example below without any success.
bootcmd:
- mkdir -p /etc/systemd/system/waagent.service.d
- echo "[Unit]\nAfter=cloud-final.service" > /etc/systemd/system/waagent.service.d/override.conf
- sed "s/After=multi-user.target//g" /lib/systemd/system/cloud-final.service > /etc/systemd/system/cloud-final.service
- systemctl daemon-reload
After this, also tried to set the runcmd steps to the bootcmd steps. This resulted in a boot which took ages and eventually froze.
Since I am not that familiar with rhel and Linux overall, I wanted to ask help if anyone might have some suggestions which I can additionally try.
(Apply some other configuration to ensure await on the cloud-final.service within a waagent?)
However the machine already had the state running, while still running the cloud configuration phase (cloud_config_modules).
Could you please be more specific? Where did you read the machine state?
The reason I ask is that cloud-init status will report status: running until cloud-init is done running, at which point it will report status: done
I what is the purpose of waiting until cloud-init is done? I'm not sure exactly what you are expecting to happen, but here are a couple of things that might help.
If you want to execute a script "at the end" of cloud-init initialization, you could put the script directly in runcmd, and if you want to wait for cloud-init in an external script you could do cloud-init status --wait, which will print a visual indicator and eventually return once cloud-init is complete.
On not too old Azure Linux VM images, cloud-init rather than WALinuxAgent acts as the VM provisioner. The VM is marked provisioned by the Azure cloud-init datasource module very early during cloud-init processing (source), before any cloud-init modules configurable with user data. WALinuxAgent is only responsible for provisioning Azure VM extensions. It does not appear to be possible to delay sending the 'VM ready' signal to Azure without modifying the VM image and patching the source code of cloud-init Azure datasource.
I want to understand why OpenStack needs to be there on the Director machine on the Redhat Openstack Platform?
Are we going to create VMs on the Director machines as well? I understand the Director machine is used to deploy Overcloud, but that can be achieved by some simple package and without installing the whole Openstack on that single machine.
Yeah, that's called TripleO (Openstack-on-Openstack). The Openstack on the Director is only a minimal Openstack. Basically it contains Keystone, Nova, Neutron, Glance, Heat and Ironic.
The director doesn't create VMs. Instead of VMs, the Nova on the director use Ironic to provide baremetal machines for the Overcloud. After Ironic is triggered by Nova, it used Images, which already contains prepared Openstack components, from Glance and copy these images over PXE on the new baremetal-nodes. After this, Heat is triggered to run Heat-templates and Ansible-Playbooks to configure these new deployed baremetal-nodes and create a running Openstack Overclound installation.
It is basically the intention that you only have the install one single director node by hand and all other multi-node Overclouds via the TripleO automation. Beside this it should make the Deployments more scalable, because you could add new Compute-nodes to the Overcloud in the same way like when you create a normal new VM within your Openstack deployment and it should make a version-upgrade of the Overcloud easier.
I have setup an ESB cluster using jdbc connections to ms sql databases for local and remotely mounted config and gov registries. 1x mgt and 2xworker
Our .car file contains some ws-security policy artifacts which go to config. When I deploy to mgt it deploys OK. I have SVN dep sync setup to the cluster and when it picks up the .car it starts to deploy on the worker but fails when loading the policy files into conf. It is trying to duplicate the policy in the shared conf and fails - of course that is right but; how should I deploy these 'shared' artifacts when a .car file is distributed by svn? I need to be able to control the deploy properly. The only way I can see is via the dev studio which is terrible for our change management practice.
Thanks for you help.
I can recommend multiple solutions. You can decide what to choose from them.
Since you have only 2 worker nodes, you can get rid of (disable) deployment synchronization and deploy the car files to all the nodes. I believe you have some automated process, so it wont be a problem to deploy to all nodes. While doing so, modify your project to bundle the policies to a separate car file and the services to another. When deploying, you deploy the policies only to management node and the services to all nodes.
Second option is to, add the policies to local registry. i.e. Not the config registry, not the governance registry. Then, when you deploy the car to the management node, it will add the policies to local registry of the management node. When the car file is dep-synced, worker nodes will deploy them and they will add the policies to their local registry. This will avoid the worker nodes trying to add the policies to the same location.
By going through the question, I felt you have external databases to the local registry too. But, its not necessary. You can use the internal H2 database for the local registry. H2 databases sometimes get corrupted. If such a thing happens, all you have to do is, delete the H2 database and restart the server with -Dsetup option. Having an external DB is fine. But, thats an overkill.
I have configured a Soa Cluster with one admin node and two managed nodes and all server nodes configured in three different machines. once I deploy a Bpell to one managed node it automatically deploys in the other managed nodes as well(default behavior). once you go to soa enterprise manager those deployed Bpels can be viewed under [Soa -> managed node -> Defult ->..]. It is the same place where we deploy new Bpels. I accidentally undeploy all bpels (you can do it by right clicking a managed node and choosing un-deploy option).
Now I'm having a hard time to get back to previous state, how to deploy all those projects again to a specific managed node. I tried to restart the node hoping it would sync again, yet the managed server went to "admin" state (not the ok state).
is there anything needs to be done !!
Thanks, Hemal
You'll need to start the server from command line, it will work.
For managing 'managed servers' from EM or WLS console, there's one additional step that's needed during instalation process.
Please modify the nodemanager.properties of WLS and set the property startscriptenabled=true.
http://download.oracle.com/docs/cd/E12839_01/core.1111/e10105/start.htm#CIHBACFI
Environment :
Java EE webApp
JDK: 1.6,
AS: Websphere app server 7,
OS:redhatzLinux
I am not a websphere admin and I am asked to develop a way or a script to solve the issue below:
I have a cluster with three nodes NodeA NodeB and NodeC. My application runs on these clusters. I want to deploy my application on these nodes such that i dont need to bring all of them down at once. These days the deployments is done this way : we come at night to stop all the servers all at once from console. Then we install the application on the main node which is on the same machine as the deployment manager and then we synchronize and bring all the servers back up one by one.
What I am asked to do is that we upgrade the application or install the new ear file by not bringing everything down as this is causing downtime to the application. Is there a way to acheive this. WAS 7 is a very mature product i am sure there must be a way to do it.
I looked at the documentation/tutorial we can do something like "Update" where we select the application (from Apllications> websphere enterprise application)and select update and then select radio button "Replace Entire Application" and radio button"local file system" and point to the new ear file. But in that case the doc says that it will bring down all the servers as well when updating. its the same as before. no online deployment.
I am a java programmer so I thought of using what tools I have to solve this
Tell me if this is can be an issue :
1) We bring down NODEA
2) We remove the NODEA from the cluster (by pressing remove node button or using the removeNode.sh)
3) Install the new Ear on the NODEA (can we do this in the same admin console? or through shell script or jython or may be like a standalone server)
3) We then start it up back again and then add it to cluster.
NOW we have NODEA with new applicaition while NODE B and NODEC are with old application versions.
Then we bring down NODEB
remove NODEB from cluster
install applciation on NODEB
start it up again
Add it back to cluster
NOW we have two nodes with new application and NODEC with old
we try the same process for NODEC.
Will this work. Has any one tried this. what issues can you think of that can happen.
I will so appreciate any feedback from here. I am sure there are experienced ppl on this forum. I dont think this is a rare issue,i believe this is something any organization would want with High Availability requirements.
Thanks for any help in advance.
Syed...
This is a possible duplicate of How can i do zero down time deployment on cluster environment?. Here is essentially my answer from that question:
After updating the application, you can utilize the "Rollout Update" feature. Rather than saving and synchronizing the nodes after updating, you can use this feature which automatically performs the following tasks to enable the changes to propagate to all deployment targets while maintaining high availability (assuming you have a horizontal cluster, such that cluster members exist on multiple nodes, which it sounds like you do):
Save session changes to the master configuration
For each node in the cluster (one at a time, to enable continuous availability):
Stop the cluster members on the node
Synchronize the node
Start the application servers (which automatically starts the application)
Alternatively, you can follow the following procedure.
Stop all nodeagents except Node A.
Comment out or disable the Node A from Load Balancer or Plugin (So the traffic will not come to the node)
Deploy the application.
Changes will be synchronized only on Node A as its nodeagent is up.
Uncomment/enable the Node A from plugin / load balancer.
Comment/disable Node B from plugin/load balancer to stop incomming traffic on the node.
Start the nodeagent of Node B so it will synchronize the file changes on the Node. The ear application will stop and start after synchronization.
Uncomment/enable the Node B from plugin / load balancer.
Repeat steps 6,7,8 for all the remaining nodes.
Regards,
Laique Ahmed