Adding Desired State Configuration extension to a service fabric VMSS - azure-service-fabric

We recently needed to add the Microsoft.Powershell.DSC extension to our VMSS that contain our service fabric cluster. We redeployed the cluster using our ARM template, with the addition of the new extension for DSC. During the deployment we observed that as many as 4 out of 5 scale set instances were in the restarting stage at a given time. The services in our cluster were also unresponsive during that time. The outage was only a few minutes long, but this seems like something that should not happen.
Reliability Level: Silver
Durability Level: Bronze

This is caused by the selected durability level 'bronze'.
The durability tier is used to indicate to the system the privileges
that your VMs have with the underlying Azure infrastructure. In the
primary node type, this privilege allows Service Fabric to pause any
VM level infrastructure request (such as a VM reboot, VM reimage, or
VM migration) that impact the quorum requirements for the system
services and your stateful services. In the non-primary node types,
this privilege allows Service Fabric to pause any VM level
infrastructure requests like VM reboot, VM reimage, VM migration etc.,
that impact the quorum requirements for your stateful services running
in it.
Bronze - No privileges. This is the default and is recommended if you are only > running stateless workloads in your cluster.

I suggest reading this article. Its a MS employee blog. I'll copy out the relevant part:
If you don’t mind all your VMs being rebooted at the same time, you can set upgradePolicy to “Automatic”. Otherwise set it to “Manual” and take care of applying changes to the scale set model to individual VMs yourself. It is fairly easy to script rolling out the update to VMs while maintaining application uptime. See https://learn.microsoft.com/en-us/azure/virtual-machine-scale-sets/virtual-machine-scale-sets-upgrade-scale-set for more details.
If your scale set is in a Service Fabric cluster, certain updates like changing OS version are blocked (currently – that will change in future), and it is recommended that upgradePolicy be set to “Automatic”, as Service Fabric takes care of safely applying model changes (like updated extension settings) while maintaining availability.

Related

Airflow fault tolerance

I have 2 questions:
first, what does it mean that the Kubernetes executor is fault tolerance, in other words, what happens if one worker nodes gets down?
Second question, is it possible that the whole Airflow server gets down? if yes, is there a backup that runs automatically to continue the work?
Note: I have started learning airflow recently.
Thanks in advance
This is a theoretical question that faced me while learning apache airflow, I have read the documentation
but it did not mention how fault tolerance is handled
what does it mean that the Kubernetes executor is fault tolerance?
Airflow scheduler use a Kubernetes API watcher to watch the state of the workers (tasks) on each change in order to discover failed pods. When a worker pod gets down, the scheduler detect this failure and change the state of the failed tasks in the Metadata, then these tasks can be rescheduled and executed based on the retry configurations.
is it possible that the whole Airflow server gets down?
yes it is possible for different reasons, and you have some different solutions/tips for each one:
problem in the Metadata: the most important part in Airflow is the Metadata where it's the central point used to communicate between the different schedulers and workers, and it is used to save the state of all the dag runs and tasks, and to share messages between tasks, and to store variables and connections, so when it gets down, everything will fail:
you can use a managed service (AWS RDS or Aurora, GCP Cloud SQL or Cloud Spanner, ...)
you can deploy it on your K8S cluster but in HA mode (doc for postgresql)
problem with the scheduler: the scheduler is running as a pod, and the is a possibility to lose depending on how you deploy it:
Try to request enough resources (especially memory) to avoid OOM problem
Avoid running it on spot/preemptible VMs
Create multiple replicas (minimum 3) for the scheduler to activate HA mode, in this case if a scheduler gets down, there will be other schedulers up
problem with webserver pod: it doesn't affect your workload, but you will not be able to access the UI/API during the downtime:
Try to request enough resources (especially memory) to avoid OOM problem
It's a stateless service, so you can create multiple replicas without any problem, if one gets down, you will access the UI/API using the other replicas

Does it make sense Service Fabric in a single machine?

Service Fabric looks great but right now, I do not have enough demand to hire 5 machines (I think it is the minimum number of nodes of a cluster).
I was thinking to install Service Fabric SDK on a single Azure Virtual Machine.
I know that I will not have the main benefits of a service fabric application: reliability and scalability, but I will be developing in a framework that I can easily can hire more machines and to scale if it is necessary in the future without changing anything.
Right now, I have 15 microservices and I plan to add 10 more. At the present I am using IIS and deployment and maintenance is not too fast. It seems that Service Fabric could solve it, plus it would be easily scalabe
Does it make sense to use Service Fabric in a single machine? or better to keep under IIS.
Technically it is possible, though it doesn't make much sense. The one node cluster, runs with a special configuration and so, scale out of that cluster is not supported. You can use a single node cluster for testing and then create another one for production use.

What is the Google Cloud Platform's "Managed Infrastructure Mixer Client"?

Can someone tell me what the purpose of the “Managed Infrastructure Mixer Client”? I have it showing up on my GCE logs and I can’t find any information on it. It is adding and removing GCE instances.
I believe it is related to GCP's recommended settings:
Automatic restart - On (recommended)
On host maintenance - Migrate VM instance (recommended)
This is the User Agent used by Managed Instance Groups when performing operations on instances. These operations can result from both user operating on the MIG (e.g. resizing, recreating instances), as well as operations performed by Autoscaler, Autohealer, Updater, etc.
Note that this string may change in the future.

Persistent storage for Apache Mesos

Recently I've discovered such a thing as a Apache Mesos.
It all looks amazingly in all that demos and examples. I could easily imagine how one would run for stateless jobs - that fits to the whole idea naturally.
Bot how to deal with long running jobs that are stateful?
Say, I have a cluster that consists of N machines (and that is scheduled via Marathon). And I want to run a postgresql server there.
That's it - at first I don't even want it to be highly available, but just simply a single job (actually Dockerized) that hosts a postgresql server.
1- How would one organize it? Constraint a server to a particular cluster node? Use some distributed FS?
2- DRBD, MooseFS, GlusterFS, NFS, CephFS, which one of those play well with Mesos and services like postgres? (I'm thinking here on the possibility that Mesos/marathon could relocate the service if goes down)
3- Please tell if my approach is wrong in terms of philosophy (DFS for data servers and some kind of switchover for servers like postgres on the top of Mesos)
Question largely copied from Persistent storage for Apache Mesos, asked by zerkms on Programmers Stack Exchange.
Excellent question. Here are a few upcoming features in Mesos to improve support for stateful services, and corresponding current workarounds.
Persistent volumes (0.23): When launching a task, you can create a volume that exists outside of the task's sandbox and will persist on the node even after the task dies/completes. When the task exits, its resources -- including the persistent volume -- can be offered back to the framework, so that the framework can launch the same task again, launch a recovery task, or launch a new task that consumes the previous task's output as its input.
Current workaround: Persist your state in some known location outside the sandbox, and have your tasks try to recover it manually. Maybe persist it in a distributed filesystem/database, so that it can be accessed from any node.
Disk Isolation (0.22): Enforce disk quota limits on sandboxes as well as persistent volumes. This ensures that your storage-heavy framework won't be able to clog up the disk and prevent other tasks from running.
Current workaround: Monitor disk usage out of band, and run periodic cleanup jobs.
Dynamic Reservations (0.23): Upon launching a task, you can reserve the resources your task uses (including persistent volumes) to guarantee that they are offered back to you upon task exit, instead of going to whichever framework is furthest below its fair share.
Current workaround: Use the slave's --resources flag to statically reserve resources for your framework upon slave startup.
As for your specific use case and questions:
1a) How would one organize it? You could do this with Marathon, perhaps creating a separate Marathon instance for your stateful services, so that you can create static reservations for the 'stateful' role, such that only the stateful Marathon will be guaranteed those resources.
1b) Constraint a server to a particular cluster node? You can do this easily in Marathon, constraining an application to a specific hostname, or any node with a specific attribute value (e.g. NFS_Access=true). See Marathon Constraints. If you only wanted to run your tasks on a specific set of nodes, you would only need to create the static reservations on those nodes. And if you need discoverability of those nodes, you should check out Mesos-DNS and/or Marathon's HAProxy integration.
1c) Use some distributed FS? The data replication provided by many distributed filesystems would guarantee that your data can survive the failure of any single node. Persisting to a DFS would also provide more flexibility in where you can schedule your tasks, although at the cost of the difference in latency between network and local disk. Mesos has built-in support for fetching binaries from HDFS uris, and many customers use HDFS for passing executor binaries, config files, and input data to the slaves where their tasks will run.
2) DRBD, MooseFS, GlusterFS, NFS, CephFS? I've heard of customers using CephFS, HDFS, and MapRFS with Mesos. NFS would seem an easy fit too. It really doesn't matter to Mesos what you use as long as your task knows how to access it from whatever node where it's placed.
Hope that helps!

How can i do zero down time deployment on cluster environment?

I need to deploy a major deployment on my system (more that 15 ear file ) , my system is high available system , So how can I do this deployment with zero downtime ?
my application server is IBM-WAS
After updating the applications, you can utilize the "Rollout Update" feature. Rather than saving and synchronizing the nodes after updating, you can use this feature which automatically performs the following tasks to enable the changes to propagate to all deployment targets while maintaining high availability (assuming you have a horizontal cluster, such that cluster members exist on multiple nodes):
Save session changes to the master configuration
For each node in the cluster (one at a time, to enable continuous availability):
Stop the cluster members on the node
Synchronize the node
Start the application servers (which automatically starts the application)