I can't start Service Fabric, and this is in the logs:
PS>TerminatingError(Connect-ServiceFabricCluster): "No cluster endpoint is reachable, please check if there is connectivity/firewall/DNS issue."
I'm running on a local dev machine. What exactly should I check in the firewall/DNS?
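One way to narrow this down is to first check whether the client connection endpoint accepts TCP connections at all. Below is a minimal Python sketch, assuming the default client connection port 19000 (your cluster manifest may define a different one). The helper name endpoint_reachable is made up for illustration and is not part of any Service Fabric tooling.

```python
import socket


def endpoint_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration only. A DNS failure, a refused
    connection, and a timeout all surface as OSError and yield False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumption: 19000 is the client connection endpoint of the local cluster.
    print(endpoint_reachable("localhost", 19000))
```

If this returns False for localhost, nothing is listening on that port and the cluster is probably not running, so the firewall is unlikely to be the cause. If it returns True locally but False from another machine, then name resolution or a firewall in between is the more likely suspect.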
My OKD 4.8 single-node deployment has been up for more than a month. Today it started acting up (pods were not being created), so I decided to reboot the node. I shut it down via the AWS console and then started it again.
However, after the restart, it is not responding. The node is running but OKD is not accessible. Neither the console nor the API can be reached. Any oc command results in "The connection to the server api.api1.hostname.info:6443 was refused - did you specify the right host or port?"
The domain name and all zones are hosted by AWS.
Any troubleshooting ideas?
We've just shipped a standalone Service Fabric cluster to a customer site with a misconfiguration. Our setup:
Service Fabric 6.4
2 Windows servers, each running 3 Hyper-V virtual machines that host the cluster
We configured the cluster locally using static IP addresses for the nodes. When the servers arrived, the IP addresses of the Hyper-V machines were changed to conform to the customer's available IP addresses. Now we can't connect to the cluster, since every IP in the clusterConfig is wrong. Is there any way we can recover from this without re-installing the cluster? We'd prefer to keep the new IPs assigned to the VMs if possible.
I've tested this only in my test environment (I've never done this in production, so do it at your own risk), but since you can't connect to the cluster anyway, I think it is worth trying.
Connect to each virtual machine that is part of the cluster and do the following steps:
1. Locate the Service Fabric cluster files (usually C:\ProgramData\SF\{nodeName}\Fabric).
2. Copy the ClusterManifest.current.xml file to a temp folder (for example C:\temp).
3. Go to the Fabric.Data subfolder and copy the InfrastructureManifest.xml file to the same temp folder.
4. In each file you have copied, change the IP addresses of the nodes to the correct values.
5. Stop the FabricHostSvc service by running net stop FabricHostSvc in PowerShell.
6. After it has stopped, run this PowerShell command (in admin mode) to update the node's cluster configuration:
   New-ServiceFabricNodeConfiguration -ClusterManifestPath C:\temp\ClusterManifest.current.xml -InfrastructureManifestPath C:\temp\InfrastructureManifest.xml
7. Once the configuration is updated, start FabricHostSvc again: net start FabricHostSvc
Do this for each node and pray for the best.
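The address-editing step (changing the IPs inside the copied manifest files) is easy to get wrong by hand when there are several nodes. Here is a rough Python sketch, assuming you have a mapping from each old address to its new one. The function name remap_ips and the addresses in the example are made up for illustration. It treats the manifest as plain text rather than parsing the XML, so review the result before running New-ServiceFabricNodeConfiguration.

```python
import re


def remap_ips(text: str, ip_map: dict) -> str:
    """Replace every old IPv4 address found in text with its new value.

    Hypothetical helper for illustration. The lookarounds make sure that
    only whole addresses match, so "10.0.0.1" is not replaced inside
    "10.0.0.10". All replacements happen in one pass, so a new address
    is never rewritten a second time.
    """
    if not ip_map:
        return text
    alternatives = "|".join(re.escape(old) for old in ip_map)
    pattern = re.compile(r"(?<![\d.])(" + alternatives + r")(?![\d.])")
    return pattern.sub(lambda m: ip_map[m.group(1)], text)


if __name__ == "__main__":
    # Example addresses are assumptions, not taken from a real cluster.
    mapping = {"10.0.0.1": "192.168.5.11", "10.0.0.10": "192.168.5.12"}
    sample = 'IPAddressOrFQDN="10.0.0.1" ... IPAddressOrFQDN="10.0.0.10"'
    print(remap_ips(sample, mapping))
```

To apply it, read each copied file from the temp folder, run it through remap_ips, and write the result back. Keep the original files untouched in case you need to roll back.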
I am deploying a Hyperledger network to an OpenShift (Kubernetes) infrastructure. I've already started the CA, the orderer, and peer0, but when I use the same YAML configuration that I used for peer0 (with the obvious changes) to launch peer1, the pod never launches. Checking the peer1 logs, I can see the message:
panic: Error while trying to open DB: resource temporarily unavailable.
Any idea why this could be happening? There's a related question here, Hyperledger Fabric "panic: Error while trying to open DB: resource temporarily unavailable" during starting a peer, but the suggestion does not apply to my case, because I'm not running the network on a local machine but in an OpenShift environment running Kubernetes in the background, and peer0 and peer1 are in different pods.
I'm trying to run the peers with LevelDB (the default for HLF).
Versions:
Hyperledger Fabric 1.1
Openshift 3.5.5.31.66
Kubernetes 1.5.2
Update: Problem solved thanks to Gari Singh's comment. Peer1 was using a production volume mount pointing to the same directory as Peer0's.
Thanks
That error typically occurs when the peer cannot get a lock on DB files. Make sure that peer0 and peer1 are not mounting the same shared volume.
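A quick way to check for this is to compare the volume sources of the two pods. The following Python sketch assumes the pod specs have already been loaded as dictionaries (for example from the "spec" section of the output of oc get pod <name> -o json). The function name shared_volume_sources is made up for illustration, and it only looks at persistentVolumeClaim and hostPath volumes, so other volume types would need extra cases.

```python
from collections import defaultdict


def shared_volume_sources(pods: dict) -> dict:
    """Map each volume source used by more than one pod to those pods.

    Hypothetical helper for illustration. pods maps a pod name to its
    spec dictionary. Only persistentVolumeClaim and hostPath volumes
    are considered; every other volume type is skipped.
    """
    users = defaultdict(set)
    for pod_name, spec in pods.items():
        for volume in spec.get("volumes", []):
            if "persistentVolumeClaim" in volume:
                key = ("pvc", volume["persistentVolumeClaim"]["claimName"])
            elif "hostPath" in volume:
                key = ("hostPath", volume["hostPath"]["path"])
            else:
                continue
            users[key].add(pod_name)
    return {key: sorted(names) for key, names in users.items() if len(names) > 1}
```

An empty result means no two pods reference the same claim or host path. Note that two pods can still collide if they mount different claims that are backed by the same underlying directory, which this sketch cannot see.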
I have an on-premises, secure development cluster that I wish to upgrade. The current version is 5.7.198.9494. I've followed the steps listed here.
At the time of writing, the latest version of SF is 6.2.283.9494. However, running Get-ServiceFabricRuntimeUpgradeVersion -BaseVersion 5.7.198.9494 shows that I must first upgrade to 6.0.232.9494 before upgrading to 6.2.283.9494.
I run the following in PowerShell, and the upgrade does start:
Copy-ServiceFabricClusterPackage -Code -CodePackagePath .\MicrosoftAzureServiceFabric.6.0.232.9494.cab -ImageStoreConnectionString "fabric:ImageStore"
Register-ServiceFabricClusterPackage -Code -CodePackagePath MicrosoftAzureServiceFabric.6.0.232.9494.cab
Start-ServiceFabricClusterUpgrade -Code -CodePackageVersion 6.0.232.9494 -Monitored -FailureAction Rollback
However, after a few minutes the following happens:
The PowerShell IDE crashes
The Service Fabric Cluster becomes unreachable
Service Fabric Local Cluster Manager disappears from the task bar
Event Viewer will log the events, see below.
Quite some time later, the VM will reboot. Service Fabric Local Cluster Manager will only give options to Setup or Restart the local cluster.
Event Viewer has logs under Applications and Services Logs -> Microsoft-Service Fabric -> Operational. Most are informational entries about opening, closing, and aborting one of the upgrade domains. There are some warnings about a VM failing to open an upgrade domain, stating the error: Lease Failed.
This behavior happens consistently, and I've not yet been able to update the cluster. My guess is that we are not able to upgrade a development cluster, but I've not found an article that states that.
Am I doing something incorrectly here, or is it impossible to upgrade a development cluster?
I will assume you have a development cluster with a single node or multiple nodes in a single VM.
As described in the first section of the documentation at the same link you provided:
service-fabric-cluster-upgrade-windows-server
You can upgrade your cluster to the new version only if you're using a
production-style node configuration, where each Service Fabric node is
allocated on a separate physical or virtual machine. If you have a
development cluster, where more than one Service Fabric node is on a
single physical or virtual machine, you must re-create the cluster
with the new version.
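If you are not sure which case applies to your cluster, you can count how many nodes share each machine address. This is a small Python sketch, assuming you have collected (node name, machine address) pairs yourself, for example from your cluster configuration. The function name is_production_style and the sample values are made up for illustration.

```python
from collections import Counter


def is_production_style(nodes) -> bool:
    """Return True if every machine address hosts exactly one node.

    Hypothetical helper for illustration. nodes is an iterable of
    (node_name, machine_address) pairs. Addresses are compared as
    plain strings, so use one consistent form (IP or FQDN) throughout.
    """
    per_machine = Counter(address for _name, address in nodes)
    return all(count == 1 for count in per_machine.values())


if __name__ == "__main__":
    # Sample values are assumptions: five dev nodes on a single machine.
    dev_nodes = [("_Node_%d" % i, "localhost") for i in range(5)]
    print(is_production_style(dev_nodes))
```

If this returns False, the quoted rule applies and the cluster has to be re-created with the new version rather than upgraded in place.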
I have installed an unsecured Service Fabric development cluster on a shared, on-premises VM with the firewall turned off. I can connect to it locally (on the same VM) via PowerShell, and deploy locally via Visual Studio. However, I am unable to connect or deploy to the cluster from any other machine on our network, and I get the following error message from PowerShell:
Connect-ServiceFabricCluster : No cluster endpoint is reachable, please check if there is connectivity/firewall/DNS issue.
As I said, the firewall is turned off on the machine hosting the cluster. What am I doing wrong?
OneBox deployment of Service Fabric (installed via the SDK) does not support remote publishing.
A template for configuring a shared dev/test cluster consisting of three nodes can be found here: https://azure.microsoft.com/en-us/documentation/articles/service-fabric-cluster-creation-for-windows-server/#download-the-service-fabric-standalone-package
/Mikkel