We have a service in our Azure Service Fabric cluster which seems to freeze and lose its connection to our database about once every 48 hours. Until I get a developer to look into the issue, my workaround has been to delete the service via the Service Fabric Explorer and then immediately re-create it. This fixes the issue temporarily until it freezes up again.
My question is whether there is any way I can automate this process. It will be at least a month or two before I can get a developer to look into it, so I'm looking for a way to run the process automatically once a day.
Unfortunately, there is no scheduling mechanism in Service Fabric for this kind of operation.
One solution is to run a script that connects to Service Fabric and executes Restart Code Package, either via PowerShell or via the API.
For PowerShell you can use an Azure Automation runbook; for the API you can use an Azure Function that calls it on a schedule.
I think PowerShell is easier, but both should work.
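As a rough sketch of the PowerShell route: whatever runs on the schedule first has to open a cluster connection before any restart cmdlet will work. The function below assumes a certificate-secured cluster and that the Service Fabric PowerShell module is installed on the machine that runs the script (for Azure Automation that would most likely mean a Hybrid Runbook Worker). The function name, the endpoint and the thumbprint are placeholders of mine, not values from the question.

```powershell
# Hypothetical helper: connect to the cluster, then run the given action.
function Invoke-WithClusterConnection {
    param(
        [Parameter(Mandatory = $true)][string] $Endpoint,      # e.g. "mycluster.example.com:19000" (assumed)
        [Parameter(Mandatory = $true)][string] $Thumbprint,    # cluster/client certificate thumbprint (assumed)
        [Parameter(Mandatory = $true)][scriptblock] $Action    # what to do once connected
    )

    $connectArgs = @{
        ConnectionEndpoint   = $Endpoint
        X509Credential       = $true
        ServerCertThumbprint = $Thumbprint
        FindType             = 'FindByThumbprint'
        FindValue            = $Thumbprint
        StoreLocation        = 'CurrentUser'
        StoreName            = 'My'
    }
    Connect-ServiceFabricCluster @connectArgs | Out-Null

    # The action is invoked from inside this function so that it runs in a
    # scope where the connection made above is visible.
    & $Action
}
```

It would be called with a script block containing whichever restart command you settle on, for example `Invoke-WithClusterConnection -Endpoint $endpoint -Thumbprint $thumbprint -Action { ... }`.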
Restart-ServiceFabricDeployedCodePackage, as the name suggests, forces a shutdown and restart of a process and all replicas hosted within it. There is no need to delete and recreate the service, which risks misconfiguring it.
The documentation lists the parameter sets that can be used together. Some parameters are required when used with others, and the docs highlight the valid combinations. The result will be something like this:
Restart-ServiceFabricDeployedCodePackage -ApplicationName "fabric:/appname" -ServiceName "fabric:/appname/servicename" -PartitionId "b098c9f0-009a-458d-8b2d-8089fedcd014"
or a specific replica like this:
Restart-ServiceFabricDeployedCodePackage -ApplicationName "fabric:/repairs" -ServiceName "fabric:/repairs/web" -PartitionId "b098c9f0-009a-458d-8b2d-8089fedcd014" -ReplicaOrInstanceId 131896982398426643
Restart-ServiceFabricPartition is also useful and has the same effect:
Restart-ServiceFabricPartition -RestartPartitionMode AllReplicasOrInstances -ServiceName "fabric:/appname/service" -PartitionId "b098c9f0-009a-458d-8b2d-8089fedcd014"
Restart-ServiceFabricPartition has been marked obsolete in favor of Start-ServiceFabricPartitionRestart, which is the recommended way to restart services when reliability matters, for example stateful services, because it avoids taking down all replicas at the same time.
I haven't used Start-ServiceFabricPartitionRestart myself, but it is what the documentation recommends for stateful services.
The parameter combinations are a bit tricky, so I recommend trying different ones. In some cases the command succeeds but still shows an error; I'm not sure why.
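Since I haven't run Start-ServiceFabricPartitionRestart myself, treat the following as an untested sketch of how it could be applied to every partition of a service. It assumes an open cluster connection; the function name and the returned object shape are my own, and the default mode simply mirrors the Restart-ServiceFabricPartition example above.

```powershell
# Hypothetical helper: start a restart operation for each partition of a service.
function Restart-AllServicePartitions {
    param(
        [Parameter(Mandatory = $true)][uri] $ServiceName,
        [ValidateSet('AllReplicasOrInstances', 'OnlyActiveSecondaries')]
        [string] $Mode = 'AllReplicasOrInstances'
    )

    foreach ($partition in @(Get-ServiceFabricPartition -ServiceName $ServiceName)) {
        # Each restart is tracked by an operation id that the caller supplies.
        $operationId = [guid]::NewGuid()
        $restartArgs = @{
            OperationId          = $operationId
            RestartPartitionMode = $Mode
            ServiceName          = $ServiceName
            PartitionId          = $partition.PartitionId
        }
        Start-ServiceFabricPartitionRestart @restartArgs | Out-Null

        # Return the ids so progress can be looked up afterwards.
        [pscustomobject]@{
            PartitionId = $partition.PartitionId
            OperationId = $operationId
        }
    }
}
```

The returned operation ids can then be passed to Get-ServiceFabricPartitionRestartProgress -OperationId to see whether each restart has completed.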
Here's my script that I use to restart stateless services. It uses the Restart-ServiceFabricDeployedCodePackage mentioned above.
I've called it Restart-ServiceFabricServiceCodePackages.ps1 and you can just call it with a -ServiceName fabric:/application/service and adjust the wait time according to your service's normal start-up time.
Param (
[Parameter(Mandatory=$true)]
[uri] $ServiceName,
[int] $WaitBetweenNodesSeconds = 30
)
try {
Test-ServiceFabricClusterConnection | Out-Null
}
catch {
throw "Active connection to Service Fabric cluster required"
}
$serviceDescription = Get-ServiceFabricServiceDescription -ServiceName $ServiceName -ErrorAction SilentlyContinue
if (!$serviceDescription) {
throw "Invalid Service Fabric service name"
}
if ($serviceDescription.ServiceKind -ne "Stateless") {
throw "Unknown outcomes could occur for non-stateless services"
}
$applicationName = $serviceDescription.ApplicationName
$serviceTypeName = $serviceDescription.ServiceTypeName
$service = Get-ServiceFabricService -ServiceName $ServiceName -ApplicationName $applicationName
$application = Get-ServiceFabricApplication -ApplicationName $applicationName
$serviceType = Get-ServiceFabricServiceType -ServiceTypeName $serviceTypeName -ApplicationTypeName $application.ApplicationTypeName -ApplicationTypeVersion $application.ApplicationTypeVersion
$serviceManifestName = $serviceType.ServiceManifestName
$nodes = Get-ServiceFabricNode -StatusFilter Up
$nodes | Where-Object {
$nodeName = $_.NodeName
$hasService = $null
$hasApplication = Get-ServiceFabricDeployedApplication -NodeName $nodeName -ApplicationName $applicationName
if ($hasApplication) {
$hasService = Get-ServiceFabricDeployedServicePackage -NodeName $nodeName -ApplicationName $applicationName -ServiceManifestName $serviceManifestName -ErrorAction SilentlyContinue
}
return $hasApplication -and $hasService
} | ForEach-Object {
$nodeName = $_.NodeName
$codePackages = Get-ServiceFabricDeployedCodePackage -NodeName $nodeName -ApplicationName $applicationName -ServiceManifestName $serviceManifestName
$codePackages | ForEach-Object {
$codePackageName = $_.CodePackageName
$servicePackageActivationId = $_.ServicePackageActivationId
$codePackageInstanceId = $_.EntryPoint.CodePackageInstanceId
Write-Host "Restarting deployed package on $nodeName named $codePackageName (for service package id: $servicePackageActivationId and code package id: $codePackageInstanceId)"
$success = Restart-ServiceFabricDeployedCodePackage -NodeName $nodeName -ApplicationName $applicationName -ServiceManifestName $serviceManifestName -CodePackageName $codePackageName -CodePackageInstanceId $codePackageInstanceId -ServicePackageActivationId $servicePackageActivationId -CommandCompletionMode Invalid
if ($success) {
Write-Host "Successfully restarted deployed package on $nodeName" -ForegroundColor Green
}
Write-Host "Waiting for $WaitBetweenNodesSeconds seconds for previous node to restart before continuing"
Start-Sleep -Seconds $WaitBetweenNodesSeconds
$retries = 0
$service = Get-ServiceFabricService -ServiceName $ServiceName -ApplicationName $applicationName
while ($retries -lt 3 -and ($service.HealthState -ne "Ok" -or $service.ServiceStatus -ne "Active")) {
$service = Get-ServiceFabricService -ServiceName $ServiceName -ApplicationName $applicationName
$retries = $retries + 1
Write-Host "Waiting for an additional 15 seconds for previous node to restart before continuing because service state is not healthy"
Start-Sleep -Seconds 15
}
}
}
Related
I'm new to PowerShell. My goal is to go through a list of remote computers, check whether certain services are running on them, and start the services if they are not. What would be the best approach to creating a variable for the services on each server?
Server1.txt - 'ServiceA ServiceB ServiceC'
Server2.txt - 'ServiceD ServiceE ServiceF'
Server3.txt - 'ServiceG ServiceH'
$services = get-content .\Server1.txt
$services | ForEach {
    try {
        Write-Host "Attempting to start '$($_.DisplayName)'"
        Start-Service -Name $_.Name -ErrorAction STOP
        Write-Host "SUCCESS: '$($_.DisplayName)' has been started"
    } catch {
        Write-output "FAILED to start $($_.DisplayName)"
    }
}
Thank you.
Your input uses one text file per server, which is not advisable. There is also no computer name in your Start-Service command. Here is my sample input:
server1-serviceA,ServiceB,ServiceC
server2-serviceD,ServiceE,ServiceF
server3-serviceG,ServiceH,ServiceI
And here is the PowerShell script. Since each server has a different set of services, the Split method is needed.
$textFile = Get-Content C:\temp\servers.txt
foreach ($line in $textFile) {
$computerName = $line.split("-")[0] #Getting computername by using Split
$serviceNames = $line.split("-")[1] #Getting Service names by using split
foreach ($serviceName in $serviceNames.split(",")) {
# Again using split to handle multiple service names
try {
Write-Host " Trying to start $serviceName in $computerName"
Get-Service -ComputerName $computerName -Name $serviceName | Start-Service -ErrorAction Stop
Write-Host "SUCCESS: $serviceName has been started"
}
catch {
Write-Host "Failed to start $serviceName in $computerName"
}
}
}
I haven't tested the script for starting the service, but the loop works properly for multiple servers and their respective services. Thanks!
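The parsing step can also be pulled out into a small function, which makes it easy to check on its own without touching any remote machine. This is only a sketch: the function name is made up, and it assumes the `server-service1,service2` line format shown above. It splits on the first hyphen only, so service names may contain hyphens, but a computer name with a hyphen in it would still need a different separator.

```powershell
# Hypothetical helper: turn "server1-serviceA,ServiceB" into an object.
function ConvertFrom-ServerServiceLine {
    param(
        [Parameter(Mandatory = $true)][string] $Line
    )

    # Split into at most two parts: computer name, then the service list.
    $parts = $Line -split '-', 2
    if ($parts.Count -ne 2) {
        throw "Expected 'server-service1,service2' but got '$Line'"
    }

    [pscustomobject]@{
        ComputerName = $parts[0].Trim()
        # Split the service list on commas and drop empty entries.
        ServiceNames = @($parts[1] -split ',' |
                ForEach-Object { $_.Trim() } |
                Where-Object { $_ })
    }
}
```

Inside the outer loop you would then read `$entry = ConvertFrom-ServerServiceLine -Line $line` and iterate over `$entry.ServiceNames`.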
I have a PowerShell script that enumerates running services and their current state using Get-WmiObject Win32_Service. The initial version was based on this one and then modified for Azure. When I run the script in PowerShell (without the Azure Automation parts) on my local machine it works fine and I can connect to all the machines of interest, but when I port it to a runbook I get the following error: "Get-WmiObject : The RPC server is unavailable."
Q: Is the problem with permissions for the automation account? If so, what account should I add to the local machines to resolve the issue?
Q: Is Get-WmiObject not a valid way to initiate the connection? If not, what should I try instead?
The code I'm using is below:
[CmdletBinding(SupportsShouldProcess = $true)]
param(
# Servers to check
[Parameter(Mandatory=$true)][string[]]$ServerList,
# Services to check for
[Parameter(Mandatory=$true)][string[]]$includeService
)
# Following modifies the Write-Verbose behavior to turn the messages on globally for this session
$VerbosePreference = "Continue"
$connectionName = "AzureRunAsConnection"
# retry
$retry = 6
$syncOk = $false
$servicePrincipalConnection = Get-AutomationConnection -Name $connectionName
do
{
try
{
Add-AzureRmAccount -ServicePrincipal -TenantId $servicePrincipalConnection.TenantId -ApplicationId $servicePrincipalConnection.ApplicationId -CertificateThumbprint $servicePrincipalConnection.CertificateThumbprint
$syncOk = $true
}
catch
{
$ErrorMessage = $_.Exception.Message
$StackTrace = $_.Exception.StackTrace
Write-Warning "Error during sync: $ErrorMessage, stack: $StackTrace. Retry attempts left: $retry"
$retry = $retry - 1
Start-Sleep -s 60
}
} while (-not $syncOk -and $retry -ge 0)
Select-AzureRMSubscription -SubscriptionId $SubscriptionId -TenantId $servicePrincipalConnection.TenantId
$currentSubscription = Get-AzureRMSubscription -SubscriptionId $SubscriptionId -TenantId $servicePrincipalConnection.TenantId
Set-AzureRmContext -SubscriptionId $SubscriptionId;
$props=@()
[System.Collections.ArrayList]$unreachableServers = @()
Foreach($ServerName in ($ServerList))
{
try
{
$service = Get-WmiObject Win32_Service -ComputerName $servername
}
catch
{}
if ($Service -ne $NULL)
{
foreach ($item in $service)
{
#$item.DisplayName
Foreach($include in $includeService)
{
#write-host $include
if(($item.name).Contains($include) -eq $TRUE)
{
$props += [pscustomobject]@{
servername = $ServerName
name = $item.name
Status = $item.Status
startmode = $item.startmode
state = $item.state
serviceaccount=$item.startname
DisplayName =$item.displayname}
}
}
}
}
else
{
Write-host "Failed to contact server: "$ServerName
$unreachableServers.Add($ServerName)
}
}
$props | Format-Table Servername,Name,startmode,state,serviceaccount,displayname -AutoSize
I am assuming that you are using the Azure Automation Hybrid Worker functionality. By default it runs under the System account. However, you can run the runbook under a different account. This is documented in the Azure Automation Hybrid Worker documentation; look under the RunAs account section. Use the same account that works when you run the script directly.
Have you considered using OMS? It sounds like a better fit for this.
Anyway, to answer your questions: I would probably create a local user, create a PS configuration endpoint for that user to connect to, and connect from the Automation Account impersonating that user. But again, I wouldn't go this route; I'd rather use OMS.
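If you do go the remoting route, a sketch of the idea could look like the following. It swaps Get-WmiObject over RPC for Invoke-Command over PowerShell remoting and runs Get-CimInstance on the target itself. It assumes remoting is enabled on the target machines and that they are reachable from wherever the runbook executes; the function name is mine, and in a runbook the credential would typically come from something like Get-AutomationPSCredential. Note that it matches names with -like, which is case-insensitive, unlike the Contains call in the question.

```powershell
# Hypothetical helper: query services on remote machines over PowerShell remoting.
function Get-RemoteServiceState {
    param(
        [Parameter(Mandatory = $true)][string[]] $ServerList,
        [Parameter(Mandatory = $true)][string[]] $IncludeService,
        [pscredential] $Credential   # optional; omit to use the current identity
    )

    # Runs on the remote machine: keep services whose name contains any pattern.
    $query = {
        param($patterns)
        Get-CimInstance -ClassName Win32_Service | Where-Object {
            $name = $_.Name
            @($patterns | Where-Object { $name -like "*$_*" }).Count -gt 0
        } | Select-Object Name, DisplayName, State, StartMode, StartName
    }

    foreach ($server in $ServerList) {
        $invokeArgs = @{
            ComputerName = $server
            ScriptBlock  = $query
            ArgumentList = (, $IncludeService)   # pass the array as one argument
            ErrorAction  = 'Stop'
        }
        if ($Credential) { $invokeArgs.Credential = $Credential }

        try {
            Invoke-Command @invokeArgs
        }
        catch {
            Write-Warning "Failed to contact server: $server ($($_.Exception.Message))"
        }
    }
}
```

Results returned by Invoke-Command carry a PSComputerName property, so the output can still be formatted per server.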
I'm trying to swap servers using the Move-AzureDeployment cmdlet from PowerShell. It seems to take around 4 minutes to swap from staging to production. That's 4 minutes of downtime, which is not really acceptable. When I swap the servers manually from the Azure Portal it happens almost instantaneously.
I was wondering why it takes longer using the cmdlet and what can I do to fix this issue because I want to be able to swap my Staging and Production Servers using powershell.
Here is my powershell Script:
try
{
$ErrorActionPreference = "stop"
Write-Host "Deploying build build no. $env:build_number to $_serviceName"
#import azure cmdlets module
Write-Host "Importing azure service management modules (i.e for the old portal)"
Import-Module "C:\Program Files (x86)\Microsoft SDKs\Azure\PowerShell\ServiceManagement\Azure\Azure.psd1"
Write-Host "Started Command for Switching Slots"
#Switch slots from Staging to Production
Move-AzureDeployment -ServiceName $_serviceName
Write-Host "Finished Command for Switching Slots"
#make sure deployment is in running state
$deployment = Get-AzureDeployment -servicename $_serviceName -slot $_slotName
Write-Host "$_serviceName is in state $($deployment.status)"
$StopWatch = [System.Diagnostics.Stopwatch]::StartNew() #declare stopwatch
while (($deployment.Status -ne "running") -and ($StopWatch.Elapsed.Hours -lt 2)) #running the loop for a maximum of 2 hours
{
Write-Host "wait 5 seconds before trying again"
Start-Sleep -s 5
$deployment = Get-AzureDeployment -servicename $_serviceName -slot $_slotName
Write-Host "$_serviceName is in state $($deployment.status)"
}
#make sure all roles are in ready state
$nonReadyInstances = (Get-AzureDeployment $_serviceName -Slot $_slotName).RoleInstanceList | Where-Object { $_.InstanceStatus -ne "ReadyRole" } | ft -Property RoleName, InstanceName, InstanceStatus
$nonReadyInstances
$StopWatch = [System.Diagnostics.Stopwatch]::StartNew() #declare stopwatch
while (($nonReadyInstances -ne $null) -and ($StopWatch.Elapsed.Hours -lt 2)) #running the loop for a maximum of 2 hours
{
Write-Host "wait 5 seconds before trying again"
Start-Sleep -s 5
$nonReadyInstances = (Get-AzureDeployment $_serviceName -Slot $_slotName).RoleInstanceList | Where-Object { $_.InstanceStatus -ne "ReadyRole" }
$nonReadyInstances
}
#output deployment id
#$deploymentid = Check-Deployment -serviceName $_serviceName -slotName $_slotName
#Write-Host "Deployed to $_serviceName with deployment id $deploymentid and slot $_slotName"
exit 0
}
catch [System.Exception]
{
Write-Host $_.Exception.ToString()
exit 1
}
From my understanding, the swap between the staging slot and production is a VIP swap, as both are running. Microsoft says you shouldn't see any downtime on the site/app itself. Requests will still hit the old production server until the change is complete. Are you seeing downtime on the site during the swap?
I have prepared the script below to change the SQL Server service account, but the service does not stop and start when I run it. Any idea? Is there a better way to do this? Is there an alternative to Sleep? We don't know how long the service takes to stop and start. Is there a way to make PowerShell wait until the service has completely stopped or started?
$Services = Get-WmiObject Win32_Service -ComputerName "." | Where { $_.name -eq 'MSSQLSERVER' }
ForEach($Service in $Services)
{
$StopStatus = $Service.StopService()
Sleep 15
If ($StopStatus.ReturnValue -eq "0")
{write-host "$Service -> Service Stopped Successfully"}
$ChangeStatus = $Service.change($null,$null,$null,$null,$null,$null,$ServiceAccount,$Password,$null,$null,$null)
If ($ChangeStatus.ReturnValue -eq "0")
{write-host "$Service -> Sucessfully Changed Service Account"}
$StartStatus = $Service.StartService()
Sleep 25
If ($StartStatus.ReturnValue -eq "0")
{write-host "$Service -> Service Started Successfully"}
}
You can check in a loop whether the service has stopped (or started) and then proceed:
$sleepCounter = 1
While((Get-Service $serviceName).status -ne "Stopped" ){
Write-Host "Waiting for service to stop. Attempt $sleepCounter of 20"
sleep 1
if($sleepCounter -eq 20) { break }
$sleepCounter++
}
Also, you can do the following to get the service instead of using Where-Object:
$svc=gwmi win32_service -filter "name='$serviceName'"
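The polling loop above can be generalised into a small helper, sketched here under the assumption that any condition can be expressed as a script block returning true or false. The function and parameter names are mine. For services specifically, the object returned by Get-Service also has a WaitForStatus method that takes the desired status and a timeout, which may be simpler when a ServiceController object is available.

```powershell
# Hypothetical helper: poll a condition until it is true or a timeout expires.
function Wait-Until {
    param(
        [Parameter(Mandatory = $true)][scriptblock] $Condition,
        [int] $TimeoutSeconds = 60,
        [int] $IntervalMilliseconds = 1000
    )

    $stopwatch = [System.Diagnostics.Stopwatch]::StartNew()
    while (-not (& $Condition)) {
        # Give up once the timeout has elapsed.
        if ($stopwatch.Elapsed.TotalSeconds -ge $TimeoutSeconds) {
            return $false
        }
        Start-Sleep -Milliseconds $IntervalMilliseconds
    }
    return $true
}
```

For the service case this would be used as `Wait-Until -Condition { (Get-Service $serviceName).Status -eq 'Stopped' } -TimeoutSeconds 20`, and the account change would only go ahead if it returns true.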
I'm pretty new to Powershell. I have 2 different scripts I'm running that I would like to combine into one script.
Script 1 has 1 line
Stop-Process -ProcessName alcore.* -force
Its purpose is to end any process that begins with "alcore."
Script 2 has 1 line as well
Start-Service -displayname crk*
It starts any service that begins with crk.
How can I combine these into one script? If the processes are running I wish to stop them, if not, I wish to start the services. How can I accomplish this?
I'm trying this but it's not working
$services = Get-Process alcore.*
if($services.Count -qe 1){
Stop-Process -ProcessName alcore.* -force
} else {
Start-Service -displayname crk*
}
How can I do this correctly? Also should I wrap these up in a function and call the function? That seems a bit cleaner. Thanks for any help.
Cheers,
~ck
Use Get-Service to get the service status. The process could be running but the service might be paused:
$services = @(Get-Service alcore.*)
foreach ($service in $services)
{
if ($service.Status -eq 'Running')
{
$service | Stop-Service
}
else
{
Start-Service -DisplayName crk*
}
}
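If alcore.* really is a set of processes rather than services, the combined script from the question could be wrapped in a function along these lines. This is a sketch that keeps the question's two commands as they are; the function name and the returned strings are made up. As an aside, `-qe` in the question's attempt is not a PowerShell operator, so that comparison fails before either branch runs.

```powershell
# Hypothetical helper: stop matching processes if any are running,
# otherwise start the matching services.
function Switch-ProcessOrService {
    param(
        [string] $ProcessPattern = 'alcore.*',
        [string] $ServiceDisplayPattern = 'crk*'
    )

    # Wrap in @() so Count works for zero, one or many matches.
    $running = @(Get-Process -Name $ProcessPattern -ErrorAction SilentlyContinue)

    if ($running.Count -gt 0) {
        $running | Stop-Process -Force
        return 'StoppedProcesses'
    }

    Start-Service -DisplayName $ServiceDisplayPattern
    return 'StartedServices'
}
```

Calling `Switch-ProcessOrService` with no arguments reproduces the two one-line scripts in a single step.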