Stateful service secondary replicas RunAsync - azure-service-fabric

For a stateful service, does the process in RunAsync also run/execute on the secondary replicas?

RunAsync only runs on the primary replica of each partition for stateful services.
https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-reliable-services-advanced-usage#stateful-service-replica-lifecycle
The RunAsync method in a stateful service is executed only when the stateful service replica is primary. The RunAsync method is cancelled when a primary replica's role changes away from primary, as well as during the close and abort events.
For stateless services, it runs "when the instance is about to be used". https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-reliable-services-advanced-usage#stateless-service-instance-lifecycle
So for a stateful service you can have one RunAsync running per partition (on the primary replica), and for a stateless service one per instance.
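
For illustration, here is a minimal sketch of a stateful service's RunAsync (the class name and the work inside the loop are made up, not from the question). Service Fabric calls it only once the replica has become primary, and signals the cancellation token when the replica is demoted, closed, or aborted:

using System;
using System.Fabric;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.ServiceFabric.Services.Runtime;

// Illustrative stateful service: RunAsync is invoked only on the primary replica.
internal sealed class MyStatefulService : StatefulService
{
    public MyStatefulService(StatefulServiceContext context)
        : base(context)
    {
    }

    protected override async Task RunAsync(CancellationToken cancellationToken)
    {
        // This loop runs only while this replica is the primary of its partition.
        while (true)
        {
            // Stop cooperatively when the replica is demoted, closed, or aborted.
            cancellationToken.ThrowIfCancellationRequested();

            // ... do background work against the reliable state here ...

            await Task.Delay(TimeSpan.FromSeconds(1), cancellationToken);
        }
    }
}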

Airflow fault tolerance

I have 2 questions:
First, what does it mean that the Kubernetes executor is fault tolerant? In other words, what happens if a worker node goes down?
Second, is it possible that the whole Airflow server goes down? If yes, is there a backup that runs automatically to continue the work?
Note: I have started learning Airflow recently.
Thanks in advance.
This is a theoretical question I ran into while learning Apache Airflow. I have read the documentation, but it does not mention how fault tolerance is handled.
What does it mean that the Kubernetes executor is fault tolerant?
The Airflow scheduler uses a Kubernetes API watcher to watch the state of the worker pods (tasks) on each change, in order to detect failed pods. When a worker pod goes down, the scheduler detects the failure and updates the state of the failed tasks in the metadata database; those tasks can then be rescheduled and executed based on their retry configuration.
Is it possible that the whole Airflow server goes down?
Yes, it is possible for different reasons, and there are different solutions/tips for each one:
A problem with the metadata database: the metadata database is the most important part of Airflow; it is the central point used to communicate between the different schedulers and workers, to save the state of all DAG runs and tasks, to share messages between tasks, and to store variables and connections, so when it goes down everything fails:
you can use a managed service (AWS RDS or Aurora, GCP Cloud SQL or Cloud Spanner, ...)
you can deploy it on your K8S cluster, but in HA mode (doc for postgresql)
A problem with the scheduler: the scheduler runs as a pod, and there is a possibility of losing it depending on how you deploy it:
Try to request enough resources (especially memory) to avoid OOM problems
Avoid running it on spot/preemptible VMs
Create multiple replicas (minimum 3) of the scheduler to enable HA mode; in this case, if one scheduler goes down, the other schedulers stay up
A problem with the webserver pod: this doesn't affect your workload, but you will not be able to access the UI/API during the downtime:
Try to request enough resources (especially memory) to avoid OOM problems
It's a stateless service, so you can create multiple replicas without any problem; if one goes down, you can access the UI/API through the other replicas

Stateless Worker service in Service Fabric restarted in the same process

I have a stateless service that pulls messages from an Azure queue and processes them. The service also starts some threads in charge of cleanup operations. We recently ran into an issue where these threads, which ideally should have been killed when the service shuts down, remained active (definitely a bug in our service shutdown process).
Looking further at the logs, it seemed that the RunAsync method's cancellation token received a cancellation request, and later, within the same process, a new instance of the stateless service registered via ServiceRuntime.RegisterServiceAsync was created.
Is it expected behavior that Service Fabric can reuse the same process to start a new instance of the stateless service after shutting down the current instance?
The Service Fabric documentation https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-hosting-model does suggest that different services can be hosted in the same process, but I could not find the behavior above mentioned there.
In the shared process model, there is one process per ServicePackage instance on every node. Adding or replacing services reuses the existing process.
This is done to save some (process-level) overhead on the node that runs the service, so hosting is more cost-efficient. It also enables port sharing between services.
You can change this (default) behavior by configuring 'exclusive process' mode in the application manifest, so that every replica/instance runs in a separate process:
<Service Name="VotingWeb" ServicePackageActivationMode="ExclusiveProcess">
As you mentioned, you can monitor the CancellationToken to abort the separate worker threads, so they can stop when the service stops.
More info about the app model here and configuring process activation mode.
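
As a rough sketch of that advice (not the asker's actual code; the service class and cleanup loop are hypothetical), you can tie the cleanup workers to RunAsync's cancellation token with a linked CancellationTokenSource, so they stop before the process is reused for a new instance:

using System;
using System.Fabric;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.ServiceFabric.Services.Runtime;

// Sketch of a stateless service whose cleanup worker is tied to RunAsync's token.
internal sealed class QueueProcessorService : StatelessService
{
    public QueueProcessorService(StatelessServiceContext context)
        : base(context)
    {
    }

    protected override async Task RunAsync(CancellationToken cancellationToken)
    {
        // Link the cleanup worker to the service's cancellation token so it stops
        // when Service Fabric cancels RunAsync, even if the process is later reused
        // to host a new instance of the service.
        using (var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken))
        {
            Task cleanupLoop = CleanupLoopAsync(linkedCts.Token);
            try
            {
                while (!cancellationToken.IsCancellationRequested)
                {
                    // ... pull a message from the Azure queue and process it ...
                    await Task.Delay(TimeSpan.FromSeconds(1), cancellationToken);
                }
            }
            finally
            {
                linkedCts.Cancel();                // signal the worker to stop
                try { await cleanupLoop; }         // and wait for it before returning
                catch (OperationCanceledException) { }
            }
        }
    }

    // Hypothetical cleanup worker that honors cancellation.
    private static async Task CleanupLoopAsync(CancellationToken token)
    {
        while (!token.IsCancellationRequested)
        {
            // ... cleanup work ...
            await Task.Delay(TimeSpan.FromSeconds(30), token);
        }
    }
}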

Which Kubernetes mode to choose

I have a situation where each message in a message queue has to be processed by a separate instance (one pod can process one message at a time). Many messages can be processed at once, but there is a limit on parallel executions; once it's reached, no new messages are pulled from the queue. Message processing takes about 30 minutes. No state needs to be stored on the pods between calls (all data is read from a database when a pod starts processing a message). A new message should spawn a new pod; once the processing finishes, the pod should die.
Should I use Deployments, ReplicaSets, StatefulSets, or Services? (We use Kubernetes on Azure.)
I've tried ReplicaSets, but in a situation where three messages are being processed and one finishes, scaling down a ReplicaSet can kill a working pod, which is definitely not what I need.
I would say that since you do not need to handle state, you can rule out StatefulSets. On the other hand, a Deployment is a higher-level concept than a ReplicaSet, so you should use a Deployment rather than a raw ReplicaSet, as it takes care of the ReplicaSet for you. Lastly, since your processing is on demand, I would consider using Jobs: once a Job completes its task, it frees its resources and dies. This would require extra code to create the Jobs from some helper, but could be very handy.

Is there downtime when a partition is moved to a new node?

Service Fabric offers the capability to rebalance partitions whenever a node is removed or added to the cluster. The Service Fabric Cluster Resource Manager will move one or more partitions to this node so more work can be done.
Imagine a reliable actor service which has thousands of actors running who are distributed across multiple partitions. If the Resource Manager decides to move one or more partitions, will this cause any downtime? Or does rebalancing partitions work the same as upgrading a service?
They act pretty much the same way. The main difference I can point out is that upgrades affect only the services being upgraded, while re-balancing might affect multiple services at once. During an upgrade, the cluster might also re-balance services to fit the new service instances on a node.
Adding or removing nodes I would compare more to node failures. In either case, services are rebalanced because the cluster capacity changed, not because of service metric/load changes.
The main difference between a node failure and cluster scaling (adding/removing a node) is that rebalancing takes the service state into account during the process: when an infrastructure notification comes in saying that a node is being shut down (for updates, maintenance, or scaling down), Service Fabric asks the infrastructure to wait so it can prepare for this announced 'failure', and then starts re-balancing the services.
Even though re-balancing takes service state into account for a scale-down, it should not be considered more reliable than a node failure. The infrastructure only waits for a limited time before shutting down the node (how long it can wait depends on the durability tier you defined for your cluster) while Service Fabric checks whether the services meet health conditions: shutting down old replicas, creating new ones, and checking that they run without errors. If this process takes too long, the services might be killed once the timeout is reached and the infrastructure proceeds with its changes. Also, the new instances of the services might fail on the new nodes, forcing the services to move again.
When you design your services, it is safer to treat re-balancing as a node failure, because in the end it is not much different: your services will move around, data stored in memory will be lost if not persisted, the service address will change, and so on. Services should replicate their data, and clients should always use retry logic and refresh the service location to reduce downtime.
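
To make that last point concrete, here is a rough client-side sketch (the service URI and partition key are made up) that resolves the service endpoint with ServicePartitionResolver and re-resolves it on failure before retrying, which covers the case where a partition has just moved to another node:

using System;
using System.Fabric;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.ServiceFabric.Services.Client;

internal static class ResilientClient
{
    // Sketch: resolve the service's current endpoint, call it, and on failure
    // re-resolve (the partition may have moved to another node) before retrying.
    public static async Task<string> CallWithRetryAsync(CancellationToken cancellationToken)
    {
        ServicePartitionResolver resolver = ServicePartitionResolver.GetDefault();
        var serviceUri = new Uri("fabric:/MyApp/MyStatefulService"); // hypothetical service
        var partitionKey = new ServicePartitionKey(0);               // hypothetical partition key

        ResolvedServicePartition resolved =
            await resolver.ResolveAsync(serviceUri, partitionKey, cancellationToken);

        for (int attempt = 0; ; attempt++)
        {
            try
            {
                string address = resolved.GetEndpoint().Address;
                // ... call the service at 'address' (HTTP, remoting, ...) and return the result ...
                return address;
            }
            catch (Exception) when (attempt < 5)
            {
                // The replica may have just moved: refresh the cached resolution and retry.
                resolved = await resolver.ResolveAsync(resolved, cancellationToken);
                await Task.Delay(TimeSpan.FromSeconds(1), cancellationToken);
            }
        }
    }
}
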
The main difference between a service upgrade and service rebalancing is that during an upgrade all replicas from all partitions get turned off on a particular node. According to the documentation here, balancing is done on a per-replica basis, i.e. only some replicas from some partitions will be moved, so there shouldn't be any outage.

Removing service fabric application fails

I have deployed an application to a 5-node standalone cluster. The deployment succeeded, but the application did not start because of a bug in the application.
I tried removing the application from the cluster using Service Fabric Explorer, but this failed.
The health state of the application is “Error” and its status is “Deleting”.
The application has 9 services. 6 services show a health state of “Unknown” with a question mark and a status of “Unknown”; 3 services show a health state of “Ok” but a status of “Deleting”.
I have also tried to remove it using PowerShell:
Remove-ServiceFabricApplication -ApplicationName fabric:/appname -Force -ForceRemove
The result was an 'Operation timed out' error.
I also tried the script below that I found in some other post.
Connect-ServiceFabricCluster -ConnectionEndpoint localhost:19000
$nodes = Get-ServiceFabricNode
foreach ($node in $nodes)
{
    $replicas = Get-ServiceFabricDeployedReplica -NodeName $node.NodeName - ApplicationName "fabric:/MyApp"
    foreach ($replica in $replicas)
    {
        Remove-ServiceFabricReplica -ForceRemove -NodeName $node.NodeName -PartitionId $replica.Partitionid -ReplicaOrInstanceId $replica.ReplicaOrInstanceId
    }
}
Again no result; the script did not find any replicas to remove.
At the same time as we started removing the application, one of the system services also changed state.
The fabric:/System/NamingService service shows a “Warning” health state.
This is on partition 00000000-0000-0000-0000-000000001002.
The primary replica shows:
Unhealthy event: SourceId='System.NamingService', Property='Duration_PrimaryRecovery', HealthState='Warning', ConsiderWarningAsError=false.
The PrimaryRecovery started at 2016-10-06 07:55:21.252 is taking longer than 30:00.000.
I also restarted every node (one at a time) with no result.
How can I force removal of the application without recreating the cluster? Recreating it is not an option for a production environment.
Yes, this can happen if you don't allow your code to exit RunAsync or the Open/Close methods of your ICommunicationListener.
Some background:
Your service has a lifecycle that is driven by Service Fabric. A small component in your service - you know it as FabricRuntime - drives this. For stateless service instances, it's a simple open/close lifecycle. For stateful services, it's a bit more complex. A stateful service replica opens and closes, but also changes role between primary, secondary, and none. Lifecycle changes are initiated by Service Fabric and show up as a method call or cancellation token trigger in your code. For example, when a replica is switched to primary, we call your RunAsync method. When it switches from primary to something else, or needs to shut down, the cancellation token is triggered. Either way, the system waits for you to finish your work.
When you go to delete a service, we tell your service to change role and close. If your code doesn't respond, it will get stuck in that state.
To get out of that state, you can run Remove-ServiceFabricReplica -ForceRemove. This essentially drops the replica from the system - as far as Service Fabric is concerned, the replica is gone. But your process is still running, so you have to go in and kill the process too.
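
For reference, the Open/Close methods mentioned above belong to ICommunicationListener. A listener whose OpenAsync and CloseAsync complete promptly (a bare-bones sketch; the actual hosting code is elided) lets the replica finish closing, so a delete does not hang:

using System.Fabric;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.ServiceFabric.Services.Communication.Runtime;

// Minimal sketch of a listener whose Open/Close complete promptly, so the
// replica can finish closing when the service (or application) is deleted.
internal sealed class SketchCommunicationListener : ICommunicationListener
{
    private readonly ServiceContext context;

    public SketchCommunicationListener(ServiceContext context)
    {
        this.context = context;
    }

    public Task<string> OpenAsync(CancellationToken cancellationToken)
    {
        // ... start your web host / socket listener here ...
        // Return the address that gets published for this listener.
        string address = this.context.ListenAddress; // illustrative
        return Task.FromResult(address);
    }

    public Task CloseAsync(CancellationToken cancellationToken)
    {
        // ... stop the listener here; do not block indefinitely ...
        return Task.CompletedTask;
    }

    public void Abort()
    {
        // Best-effort, synchronous teardown.
    }
}
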
The error in the script was with '- ApplicationName'; it should be '-ApplicationName' (no space).
After correcting the parameter, this DID remove the hosed-up pieces and got me back to a state where I could fix and redeploy the application to the cluster.