Troubleshooting ServiceFabric StatefulService Deployment

Working with a stateful service in ServiceFabric version 5.1.163.9590, I am attempting to deploy a demo application with three WebApi services that manage their own state.
Two of the three services start and create their partitions without errors, but the third emits a series of warning and error events. The error detail contains this intriguing message:
Microsoft.ServiceFabric.Replicator.LoggingReplicator : GetCopyState The parameter copyContext is null. This might be caused by deployment bug that 'hasPersistedState' attribute is false.
I can't locate any external references to this error message.
Is there a way to correct this from the application and service deployment side, or from the cluster management side?

The error indicates you have a stateful service with persisted state, but didn't tell Service Fabric about that when you deployed the service.
There's a flag that needs to be set to indicate to Service Fabric that a stateful service has persisted state (as opposed to state that is "volatile," meaning in-memory only).
In your ServiceManifest.xml, make sure you have this flag set on the service type:
<ServiceTypes>
<StatefulServiceType ServiceTypeName="MyServiceType" HasPersistedState="true" />
</ServiceTypes>
Then if you're deploying through PowerShell, make sure you set this flag when you create an instance of the service:
PS > New-ServiceFabricService -Stateful -HasPersistedState -ServiceTypeName "MyServiceType" ...
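To check what an already-created service instance was registered with, the service description can be read back from the cluster. This is a minimal sketch: the endpoint and the service name fabric:/MyApp/MyService are placeholders, and it assumes the description returned for a stateful service exposes a HasPersistedState property.

```powershell
# Placeholder endpoint; use your cluster's connection endpoint.
Connect-ServiceFabricCluster -ConnectionEndpoint localhost:19000

# "fabric:/MyApp/MyService" is a hypothetical service name.
$description = Get-ServiceFabricServiceDescription -ServiceName fabric:/MyApp/MyService

# For a stateful service, HasPersistedState should match the manifest setting.
$description | Format-List ServiceName, ServiceTypeName, HasPersistedState
```

If the value shown is False while the service type declares HasPersistedState="true", the instance was probably created without the flag.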

Related

Removing service fabric application fails

I have deployed an application to a 5 node standalone cluster. Deployment succeeded, but the application did not start because of a bug in the application.
I tried removing the application from the cluster using the Service Fabric Explorer but this fails.
The health State of the application is “Error” and the status is “Deleting”
The application has 9 services. 6 services show a Health State “Unknown” with a question mark and a Status “Unknown”. 3 services show a health state “Ok” but with a Status “Deleting”.
I have also tried to remove it using powershell:
Remove-ServiceFabricApplication -ApplicationName fabric:/appname -Force -ForceRemove
The result was an "Operation timed out" error.
I also tried the script below that I found in some other post.
Connect-ServiceFabricCluster -ConnectionEndpoint localhost:19000
$nodes = Get-ServiceFabricNode
foreach($node in $nodes)
{
$replicas = Get-ServiceFabricDeployedReplica -NodeName $node.NodeName - ApplicationName "fabric:/MyApp"
foreach ($replica in $replicas)
{
Remove-ServiceFabricReplica -ForceRemove -NodeName $node.NodeName -PartitionId $replica.Partitionid -ReplicaOrInstanceId $replica.ReplicaOrInstanceId
}
}
Also no result, the script did not find any replica to remove.
At the same time we started removing the application one of the system services also changed state.
The fabric:/System/NamingService service shows a “Warning” health state.
This is on partition 00000000-0000-0000-0000-000000001002.
The primary replica shows:
Unhealthy event: SourceId='System.NamingService', Property='Duration_PrimaryRecovery', HealthState='Warning', ConsiderWarningAsError=false.
The PrimaryRecovery started at 2016-10-06 07:55:21.252 is taking longer than 30:00.000.
I also restarted every node (one at a time) with no result.
How can I force removal of the application without recreating the cluster? Recreating it is not an option for a production environment.

Yeah this can happen if you don't allow your code to exit RunAsync or Open/Close of your ICommunicationListener.
Some background:
Your service has a lifecycle that is driven by Service Fabric. A small component in your service - you know it as FabricRuntime - drives this. For stateless service instances, it's a simple open/close lifecycle. For stateful services, it's a bit more complex. A stateful service replica opens and closes, but also changes role, between primary, secondary, and none. Lifecycle changes are initiated by Service Fabric and show up as a method call or cancellation token trigger in your code. For example, when a replica is switched to primary, we call your RunAsync method. When it switches from primary to something else, or needs to shut down, the cancellation token is triggered. Either way, the system waits for you to finish your work.
When you go delete a service, we tell your service to change role and close. If your code doesn't respond, then it will get stuck in that state.
To get out of that state, you can run Remove-ServiceFabricReplica -ForceRemove. This essentially drops the replica from the system - as far as Service Fabric is concerned, the replica is gone. But your process is still running. So you have to go in and kill the process too.
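One way to find the processes that are still running is to list the code packages deployed for the application on each node. This is a rough sketch, not a definitive procedure: the endpoint and fabric:/MyApp are placeholders, and it assumes each deployed code package reports its host process through EntryPoint.ProcessId.

```powershell
# Placeholder endpoint; use your cluster's connection endpoint.
Connect-ServiceFabricCluster -ConnectionEndpoint localhost:19000

$appName = "fabric:/MyApp"   # hypothetical application name

foreach ($node in Get-ServiceFabricNode) {
    # List the code packages still hosted for the application on this node.
    # Nodes that do not host the application may return nothing or an error.
    $packages = Get-ServiceFabricDeployedCodePackage -NodeName $node.NodeName -ApplicationName $appName
    foreach ($package in $packages) {
        # Emit one row per code package with its host process ID.
        [pscustomobject]@{
            NodeName    = $node.NodeName
            CodePackage = $package.CodePackageName
            ProcessId   = $package.EntryPoint.ProcessId
        }
    }
}
```

With the process IDs in hand, the process can be stopped on the node that hosts it, for example with Stop-Process -Id followed by the ID, and -Force if needed.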

The error in the script is the '- ApplicationName' parameter, which should be '-ApplicationName' (no space after the hyphen).
And after correcting the parameter, this DID remove the hosed up pieces and get me back in order to be able to correct and redeploy the application to the cluster.

Service Fabric: removed actors and now upgrade fails

I'm trying to upgrade a Service Fabric application with a mix of stateful and stateless actors. I did some refactoring and so removed some actors I didn't need any more. Now, when I try to upgrade the application, I get the following error:
Services must be explicitly deleted before removing their Service Types.
After thinking about it a little bit, I think I understand the trouble that could come from removed services and upgrades, but then what's the correct way to do this?

You need to remove the service instances before you can upgrade to a version that doesn't contain the removed service package. Either:
In SF Explorer, navigate to the service and click Actions > Delete Service
In PowerShell:
Connect-ServiceFabricCluster
Remove-ServiceFabricService -ServiceName fabric:/MyApp/MyService
DO BE CAREFUL - If you're deleting a stateful service you'll lose all its data. Always be sure to have a periodic backup of production data.

How to configure startup order of stateless services?

Is it possible to configure the startup order when starting up the services?
A Service1 has to be running before Service2 can be started.
Clarification:
I didn't mean microservices when I mentioned Service; I meant stateless services like a REST API (Service1) and a WebSocket service (Service2).
So when the solution is deployed, the WebSocket service (Service2) must be up and running before the REST API (Service1)?

Of course you can, because you control when services are created. It's not immediately obvious if you've only ever deployed applications through Visual Studio, because Visual Studio sets you up with Default Services. This is what you see in ApplicationManifest.xml when you create an application through Visual Studio:
<DefaultServices>
<Service Name="Stateless1">
<StatelessService ServiceTypeName="Stateless1Type" InstanceCount="[Stateless1_InstanceCount]">
<SingletonPartition />
</StatelessService>
</Service>
<Service Name="Stateful1">
<StatefulService ServiceTypeName="Stateful1Type" TargetReplicaSetSize="[Stateful1_TargetReplicaSetSize]" MinReplicaSetSize="[Stateful1_MinReplicaSetSize]">
<UniformInt64Partition PartitionCount="[Stateful1_PartitionCount]" LowKey="-9223372036854775808" HighKey="9223372036854775807" />
</StatefulService>
</Service>
</DefaultServices>
This is a nice convenience when you know you always want certain services created a certain way each time you create an application instance. You can define them declaratively here and Service Fabric will create them whenever you create an instance of the application.
But it has some drawbacks. Most notably, in your case, you have no control over the order in which the services are created.
It also hides some of the concepts around application and service types and application and service instances, which again can be convenient until you want to do something more advanced, like in your case.
When you "deploy" an application, there are actually several steps:
1. Create the application package
2. Copy the package up to the cluster
3. Register the application type and version
4. Create an instance of the registered application type and version
5. Create instances of each registered service type in that application
With Default Services, you skip step 5 because Service Fabric does it for you. Without Default Services though, you get to create your service instances yourself, so you can determine what order to do it in. You can do other things like check if a service is ready before creating the next one. All of these actions are available in Service Fabric's C# SDK and PowerShell cmdlets. Here's a quick PowerShell example:
Copy-ServiceFabricApplicationPackage -ApplicationPackagePath C:\temp\MyApp -ImageStoreConnectionString fabric:ImageStore -ApplicationPackagePathInImageStore MyApp
Register-ServiceFabricApplicationType MyApp
New-ServiceFabricApplication -ApplicationName fabric:/MyAppInstance -ApplicationTypeName MyApp -ApplicationTypeVersion 1.0
New-ServiceFabricService -ApplicationName fabric:/MyAppInstance -InstanceCount 1 -PartitionSchemeSingleton -ServiceName fabric:/MyAppInstance/MyStatelessService -ServiceTypeName MyStatelessService -Stateless
New-ServiceFabricService -ApplicationName fabric:/MyAppInstance -MinReplicaSetSize 2 -PartitionSchemeSingleton -ServiceName fabric:/MyAppInstance/MyStatefulService -ServiceTypeName MyStatefulServiceType -Stateful -TargetReplicaSetSize 3
Of course, this just applies to creating the service instances. When it comes time to upgrade your services, the "upgrade unit" is actually the application, so you can't pick the order in which services within an application get upgraded, at least not in one single upgrade. You can, however, choose which services get upgraded during an application upgrade, so if you have the same ordering dependency, you can accomplish that by doing two separate application upgrades.
So you get some level of control. But, it's really best if your services are resilient to missing dependent services, because there will likely be times when a service is unavailable for one reason or another.
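The "check if a service is ready before creating the next one" idea can be sketched roughly as follows. All names here (the application instance, Service1, Service2, and their type names) are hypothetical, and the readiness check is an assumption: it treats a service as ready once every partition reports a PartitionStatus of Ready.

```powershell
# Hypothetical names throughout; adjust to your registered application and service types.
$appName = "fabric:/MyAppInstance"
$first   = "$appName/Service2"   # the service that must be running first
$second  = "$appName/Service1"   # the service that depends on it

New-ServiceFabricService -ApplicationName $appName -ServiceName $first -ServiceTypeName Service2Type -Stateless -PartitionSchemeSingleton -InstanceCount 1

# Poll until every partition of the first service reports Ready, giving up after five minutes.
$deadline = (Get-Date).AddMinutes(5)
do {
    Start-Sleep -Seconds 2
    $partitions = @(Get-ServiceFabricPartition -ServiceName $first)
    $ready      = @($partitions | Where-Object { $_.PartitionStatus -eq 'Ready' })
    $isReady    = ($partitions.Count -gt 0) -and ($ready.Count -eq $partitions.Count)
} while (-not $isReady -and (Get-Date) -lt $deadline)

if (-not $isReady) { throw "Timed out waiting for $first to become ready." }

# Only now create the dependent service.
New-ServiceFabricService -ApplicationName $appName -ServiceName $second -ServiceTypeName Service1Type -Stateless -PartitionSchemeSingleton -InstanceCount 1
```

A Ready partition only says that Service Fabric has placed and opened the instances. It does not guarantee that the service's own listener is already accepting requests, which is one more reason to keep the dependent service tolerant of a missing dependency.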
Edit:
I showed you a lot of PowerShell but I should mention the C# APIs are also just as powerful. You have the full set of management tools at your disposal in C#. For example, you can have a service that creates and manages other services. In your case, if Service A depends on Service B, then you can have Service A create an instance of Service B before Service A itself does any work, and throughout Service A's life it can keep an eye on Service B. Here's an example of a service that creates other applications and services: https://github.com/Azure-Samples/service-fabric-dotnet-management-party-cluster/blob/master/src/PartyCluster.ApplicationDeployService/FabricClientApplicationOperator.cs

In service fabric world it's called Service Affinity (https://azure.microsoft.com/en-us/documentation/articles/service-fabric-cluster-resource-manager-advanced-placement-rules-affinity/)
Anyway, try to avoid these situations in microservices world.
Good luck.

Why isn't it possible to change placement constraints in an upgrade?

I have a stateless ASP.NET Core (RC1) service running in my Azure Service Fabric cluster. It has the following manifest:
<ServiceManifest Name="MyServicePkg" Version="1.0.2" ...>
<ServiceTypes>
<StatelessServiceType ServiceTypeName="MyServiceType" />
</ServiceTypes>
...
</ServiceManifest>
My cluster is configured with placement properties. I have 5 servers with "nodeType=Backend" and 3 servers with "nodeType=Frontend".
I would like to upgrade my Service and specify that it may only be placed on "Backend" nodes. This is my updated manifest:
<ServiceManifest Name="MyServicePkg" Version="1.0.3" ...>
<ServiceTypes>
<StatelessServiceType ServiceTypeName="MyServiceType">
<PlacementConstraints>(nodeType==Backend)</PlacementConstraints>
</StatelessServiceType>
</ServiceTypes>
...
</ServiceManifest>
However, if I now execute the upgrade, I get the following error:
Start-ServiceFabricApplicationUpgrade : Default service descriptions
must not be modified as part of upgrade. Modified default service:
fabric:/MyApp/MyService
Why isn't it possible to change the constraints with an upgrade?
Would I have to delete and re-create the service? This would seem extremely problematic to me because it would result in downtime and data loss for stateful services.

So the issue here is actually with the DefaultService part of the ApplicationManifest. When services are created as part of the DefaultService, there are things you can't change about it afterwards. You might be able to change it through the ServiceFabric explorer, but I'm not sure.
One recommendation would be to keep the DefaultServices section empty in the ApplicationManifest, and instead create your services manually. By manually I mean through PowerShell, code, or the Service Fabric Explorer.
That gives you more flexibility to change parts of the service afterwards. When it's done that way, I know you can change things like placement constraints after the service is running.
To create Services with PowerShell you can use the New-ServiceFabricService command.
To create it from code, you can use FabricClient to do it. A sample of that can be found here: Azure Service Fabric Multi-Tenancy
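For a service that was created outside DefaultServices, changing the placement constraints on a running service might look like the sketch below. The endpoint and service name are placeholders, and the constraint expression simply reuses the nodeType property from the question.

```powershell
# Placeholder endpoint; use your cluster's connection endpoint.
Connect-ServiceFabricCluster -ConnectionEndpoint localhost:19000

# Hypothetical service name; -Force skips the confirmation prompt.
Update-ServiceFabricService -Stateless -ServiceName fabric:/MyApp/MyService -PlacementConstraints "nodeType == Backend" -Force

# Read the description back to confirm the new constraint was recorded.
Get-ServiceFabricServiceDescription -ServiceName fabric:/MyApp/MyService
```

For a stateful service, the -Stateful switch would be used in place of -Stateless.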

There's actually a fairly easy way to do this without having to write a bunch of code to manually define the application on the fabric cluster.
While you can declare the placement constraints in the service manifest, you can also declare them in the application manifest. Anything declared in the application manifest will override what's in the service manifest. And with the setting in the application manifest, you can then use parameters to alter the values based on the parameter file you want to use for a specific deployment.
I've just written up a blog post that discusses this approach in greater detail. I hope you find it useful. :)
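As a rough illustration of the parameter approach, the value can be supplied when the application instance is created. This sketch assumes the application manifest declares a parameter named MyService_PlacementConstraints and references it as [MyService_PlacementConstraints] inside the service's PlacementConstraints element. The parameter name, application name, type name, and version are all hypothetical.

```powershell
# Assumption: ApplicationManifest.xml declares the parameter MyService_PlacementConstraints
# and uses [MyService_PlacementConstraints] as the service's placement constraint value.
New-ServiceFabricApplication -ApplicationName fabric:/MyApp -ApplicationTypeName MyAppType -ApplicationTypeVersion 1.0.0 -ApplicationParameter @{ MyService_PlacementConstraints = "(nodeType==Backend)" }
```

A different deployment could pass a different value, or an empty string, without touching either manifest.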

Rolling Deployment of Java application in Weblogic

I am trying to understand if rolling deployment of application is possible in Weblogic. Weblogic version is 12.1.2.0.0.
By rolling deployment I mean deploying the new version to a single node or a child cluster, after removing that node or child cluster from the targets of the existing deployment. This makes sure that the current version on the existing cluster keeps functioning, probably with degraded performance because a node or child cluster has been removed.
The operations team can then verify that the intended change has worked. Once verified, the targets for the deployment can be updated to add the rest of the child cluster(s).
I am aware of the -redeploy option available in Weblogic, which means no outage, but it deploys to the same target as the original deployment.
java weblogic.Deployer -adminurl http://localhost:8802
-username weblogic -password weblogic -name VersionedApp
-targets adminServer -redeploy -source
C:/tmp/VersionedApp2 -appversion version2
However, I'm not sure how it will behave if there is an active DB in the backend.
Any insight on this is highly appreciated.

You should look at the -adminmode option for deploying. In the Oracle docs: http://docs.oracle.com/middleware/1213/wls/DEPGD/wldeployer.htm#DEPGD318
You first need to enable the administration port. An application deployed in admin mode can then be accessed only through the administration port (its context is visible on the admin port, not on the production one).
Once tests are OK, you can promote the application from the "admin" state to "active" by using the "-start" parameter of weblogic.Deployer.
