Kubernetes Cluster Autoscaler: How to verify autoscaling is working

I am working on our EKS platform, where I have installed Cluster Autoscaler. I can see it running in the Kubernetes Dashboard. Yesterday, for load testing, I scaled a heavy app of ours to 20 replicas. CPU usage per node climbed to 100%, but Cluster Autoscaler did not add any nodes. I was watching its logs, and they kept cycling through the main loop without reporting any changes.
Here are the tags I have added to ASG, worker nodes :
k8s.io/cluster-autoscaler/enabled : true
kubernetes.io/cluster/CLUSTER_NAME : owned
I can see the pod running in Dashboard :
./cluster-autoscaler
--v=4
--stderrthreshold=info
--cloud-provider=aws
--skip-nodes-with-local-storage=false
--expander=least-waste
--node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/cluster_name
Also, there are no scaling policies added to the ASG. Are they required for Cluster Autoscaler? How can I verify that Cluster Autoscaler is working properly? What am I missing?

Cluster Autoscaler checks for unschedulable pods every 10 seconds. If it finds any, it checks the min and max size of the Auto Scaling group and, if the group has not reached its max, asks the AWS Auto Scaling group to add capacity. The how-does-scale-up-work section of the autoscaler FAQ explains this in detail.
So, to answer your question: you can verify autoscaling by checking whether you have any unschedulable (Pending) pods in your cluster. If there are any, the autoscaler will try to add a node, provided the group is below its max limit, and this will show up in the autoscaler log.
For more details, check the FAQ. If you also want vertical scaling of pods, have a look at the Vertical Pod Autoscaler.
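As a rough sketch of that check: the helpers below list the pods the scheduler could not place and show the events for one of them. The function names are made up for illustration, and the pod name and namespace are whatever applies in your cluster. Note that the autoscaler reacts to Pending pods, not to CPU usage on the nodes, so a node at 100% CPU with no Pending pods will not trigger a scale-up.

```shell
# List pods the scheduler could not place. Cluster Autoscaler reacts to these,
# not to CPU usage on the nodes.
pending_pods() {
  kubectl get pods --all-namespaces --field-selector=status.phase=Pending
}

# Show the events of one pod. Look for a TriggeredScaleUp or
# NotTriggerScaleUp event from cluster-autoscaler.
# $1 = pod name, $2 = namespace
pod_events() {
  kubectl describe pod "$1" --namespace "$2"
}
```

Run `pending_pods` while the load test is active, then `pod_events POD_NAME NAMESPACE` on one of the listed pods to see whether the autoscaler considered it.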

You can tail the logs and see the events.
kubectl logs -f deployment/cluster-autoscaler -n kube-system --tail=10
It will show the scaling events.

Related

How long does it take for Kubernetes to detect and delete excess nodes

I am running a Kubernetes cluster in AWS EKS and I set up the autoscaler. I tested the autoscaler and it worked as when the number of pods in a node exceeded 110 then new nodes were automatically added to the cluster and the pending pods entered running state.
After that, I deleted the deployment. It's been about 10 minutes and I see that all the new nodes created by the autoscaler are still there and in Ready state!
How long does it take for Kubernetes to delete them automatically? Does it down-scale the cluster automatically at all?
Scaling down is a deliberately slow process. The autoscaler's default scan interval is 10 seconds, but a node has to stay unneeded for a while (10 minutes by default) before it is removed.
You can check the autoscaler's status and its latest decisions in the cluster-autoscaler-status ConfigMap, usually in the kube-system namespace.
It is also possible that something on the new nodes is blocking scale-down: a system pod running there, a PDB (PodDisruptionBudget) set for your deployments, or a pod carrying the annotation "cluster-autoscaler.kubernetes.io/safe-to-evict": "false".
Read more about EKS scaling : https://docs.aws.amazon.com/eks/latest/userguide/autoscaling.html
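A minimal sketch for reading the status ConfigMap mentioned above, assuming the autoscaler runs in kube-system and uses its default ConfigMap name, cluster-autoscaler-status (adjust both if your deployment overrides them). The function name is only for illustration.

```shell
# Print the autoscaler's own view of the cluster: node group health,
# last scale-up activity, and scale-down candidates.
ca_status() {
  kubectl -n kube-system describe configmap cluster-autoscaler-status
}
```

The ScaleDown section of the output should tell you whether the autoscaler currently sees any candidate nodes for removal.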

How to implement horizontal auto scaling in GKE autopilot based on a custom metric

I'm running a Kubernetes cluster on GKE autopilot
I have pods that do the following: wait for a job, run the job (this can take minutes or hours), then go to the Pod Succeeded state, which causes Kubernetes to restart the pod.
The number of pods I need is variable depending on how many users are on the platform. Each user can request a job that needs a pod to run.
I don't want users to have to wait for pods to scale up so I want to keep a number of extra pods ready and waiting to execute.
The application my pods are running can be in 3 states - { waiting for job, running job, completed job}
Scaling up is fine as I can just use the scale API and always request to have a certain percentage of pods in waiting for job state
When scaling down I want to ensure that Kubernetes doesn't kill any pods that are in the running job state.
Should I implement a Custom Horizontal Pod Autoscaler?
Can I configure custom probes for my pod's application state?
I could also use pod priority or a preStop hook.
You can configure horizontal Pod autoscaling to ensure that Kubernetes doesn't kill any pods.
Steps for configuring horizontal pod scaling:
Create the Deployment by applying the nginx.yaml manifest. Run the following command:
kubectl apply -f nginx.yaml
Autoscaling based on resource utilization:
1. Go to the Workloads page in Cloud Console.
2. Click the name of the nginx Deployment.
3. Click Actions > Autoscale.
4. Specify the following values:
- Minimum number of replicas: 1
- Maximum number of replicas: 10
- Auto Scaling metric: CPU
- Target: 50
- Unit: %
5. Click Done.
6. Click Autoscale.
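The console steps above should have a command-line equivalent in `kubectl autoscale`. A small sketch, assuming the Deployment is named nginx as in the manifest above (the wrapper function name is just for illustration):

```shell
# Create an HPA for the nginx Deployment: 1 to 10 replicas,
# targeting 50% average CPU utilization.
create_nginx_hpa() {
  kubectl autoscale deployment nginx --cpu-percent=50 --min=1 --max=10
}
```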
To get a list of Horizontal Pod Autoscalers in the cluster, use the following command:
kubectl get hpa
See the guide on how to configure horizontal Pod autoscaling.
You can also refer to the documentation on setting auto-scaling rules for a GKE Autopilot cluster using a custom metric in the Cloud Console.

GKE node pool with Autoscaling does not scale down

I have a GKE cluster with two nodepools. I turned on autoscaling on one of my nodepools but it does not seem to automatically scale down.
I have enabled HPA and that works fine. It scales the pods down to 1 when I don't see traffic.
The API is currently not getting any traffic so I would expect the nodes to scale down as well.
But it still runs the maximum 5 nodes despite some nodes using less than 50% of allocatable memory/CPU.
What did I miss here? I am planning to move these pods to bigger machines but to do that I need the node autoscaling to work to control the monthly cost.
There are many reasons that can keep CA from scaling down successfully. To summarize how this normally works:
Cluster Autoscaler periodically checks (every 10 seconds) the utilization of the nodes.
If the utilization factor is less than 0.5, the node is considered underutilized.
The node is then marked for removal and monitored for the next 10 minutes to make sure the utilization factor stays below 0.5.
If it is still underutilized after 10 minutes, Cluster Autoscaler removes the node.
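To make the 0.5 threshold concrete, here is a small sketch of the comparison. It assumes the utilization factor is the sum of the pods' resource requests divided by the node's allocatable resources, which is how I read the autoscaler FAQ (as I understand it, the larger of the CPU and memory ratios is used). The function name and the optional third argument are made up for this example.

```shell
# Succeeds (exit status 0) when requested / allocatable is below the threshold.
# $1 = sum of pod requests, $2 = node allocatable (same unit, e.g. millicores)
# $3 = threshold, defaults to 0.5
is_underutilized() {
  awk -v req="$1" -v alloc="$2" -v thr="${3:-0.5}" \
    'BEGIN { exit !((req / alloc) < thr) }'
}
```

For example, `is_underutilized 900 2000` succeeds because 900m requested on a node with 2000m allocatable gives 0.45, while `is_underutilized 1000 2000` fails because 0.5 is not below the threshold. This is based on requests, not on the actual usage you see in monitoring.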
If that is not happening, something else is preventing your nodes from being scaled down. In my experience, missing PDBs on kube-system pods are a likely cause, but there are several reasons that can cause downscaling issues:
1. PDB is not applied to your kube-system pods. Kube-system pods prevent Cluster Autoscaler from removing the nodes they run on. You can manually add a Pod Disruption Budget (PDB) for the kube-system pods that can be safely rescheduled elsewhere, using the following command:
`kubectl create poddisruptionbudget PDB-NAME --namespace=kube-system --selector app=APP-NAME --max-unavailable 1`
2. Containers using local storage (volumes), even empty volumes. Cluster Autoscaler does not scale down nodes that run pods using local storage. Look for this kind of configuration, since it prevents Cluster Autoscaler from scaling down nodes.
3. Pods annotated with cluster-autoscaler.kubernetes.io/safe-to-evict: false. Look for pods with this annotation, since it prevents the nodes they run on from being scaled down.
4. Nodes annotated with cluster-autoscaler.kubernetes.io/scale-down-disabled: true. Look for nodes with this annotation, since it prevents Cluster Autoscaler from removing them.
These are the configurations I suggest you check in order to get your cluster to scale down underutilized nodes.
You can also see this page, which explains the configurations that prevent scale-down and may be what is happening to you.
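A sketch of how you might look for these blockers with kubectl. The function names are hypothetical. The jsonpath expressions escape the dots in the annotation keys with a backslash, and the second column is empty for objects that do not carry the annotation.

```shell
# Each node with the value of its scale-down-disabled annotation (if any).
nodes_scale_down_annotation() {
  kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.cluster-autoscaler\.kubernetes\.io/scale-down-disabled}{"\n"}{end}'
}

# Each pod with the value of its safe-to-evict annotation (if any).
pods_safe_to_evict_annotation() {
  kubectl get pods --all-namespaces -o jsonpath='{range .items[*]}{.metadata.namespace}{"/"}{.metadata.name}{"\t"}{.metadata.annotations.cluster-autoscaler\.kubernetes\.io/safe-to-evict}{"\n"}{end}'
}

# All PodDisruptionBudgets, to see which kube-system pods are covered.
list_pdbs() {
  kubectl get poddisruptionbudgets --all-namespaces
}
```

Lines with a non-empty second column are the ones to inspect first. Pods using local storage are not covered here; you would have to check their volume definitions separately.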

Azure Kubernetes Service - can the Cluster Autoscaler get triggered even if I don't set autoscaling explicitly?

I am deploying a service to Azure Kubernetes Service.
The Horizontal Pod Autoscaler scales the number of pods, whereas the Cluster Autoscaler scales the number of nodes based on the number of pending pods. If my understanding is correct, if I don't set up autoscaling in my deployment file, the HPA won't get triggered, and only one pod will run; therefore, the CA won't get triggered either.
My question is - is there a scenario in AKS where the CA would get triggered, even without setting autoscaling in my deployment file?
Cluster autoscaler is typically used together with the horizontal pod autoscaler. The Horizontal Pod Autoscaler increases or decreases the number of pods based on application demand, and the cluster autoscaler adjusts the number of nodes as needed to run those additional pods accordingly.
If your deployment cannot scale up or down automatically via the HPA, and you do not manually increase the number of pods to the point where additional pods cannot be scheduled because of insufficient resources on your nodes, then the CA will not be triggered. So the answer is no.
You might also find this document from the official Azure docs helpful.
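To illustrate the manual case mentioned above: scaling a Deployment by hand can leave pods Pending if the nodes run out of room, and that is what would make the CA act even without an HPA. A minimal sketch, where the function name, the Deployment name and the replica count are placeholders:

```shell
# Manually set the replica count of a Deployment.
# $1 = Deployment name, $2 = desired number of replicas
scale_deployment() {
  kubectl scale deployment "$1" --replicas="$2"
}
```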

Resizing a google cloud Kubernetes cluster to zero not working

I am trying to resize a Kubernetes cluster to zero nodes using
gcloud container clusters resize $CLUSTER_NAME --size=0 --zone $ZONE
I get a success message but the size of the node-pool remains the same (I use only one node pool)
Is it possible to resize the cluster to zero?
Sometimes you just need to wait 10-20 minutes before the autoscale operation takes effect.
In other cases, you may need to check if some conditions are met for downscaling the node.
According to autoscaler documentation:
Cluster autoscaler also measures the usage of each node against the node pool's total demand for capacity. If a node has had no new Pods scheduled on it for a set period of time, and all Pods running on that node can be scheduled onto other nodes in the pool, the autoscaler moves the Pods and deletes the node.
Note that cluster autoscaler works based on Pod resource requests, that is, how many resources your Pods have requested. Cluster autoscaler does not take into account the resources your Pods are actively using. Essentially, cluster autoscaler trusts that the Pod resource requests you've provided are accurate and schedules Pods on nodes based on that assumption.
Note: Beginning with Kubernetes version 1.7, you can specify a minimum size of zero for your node pool. This allows your node pool to scale down completely if the instances within aren't required to run your workloads. However, while a node pool can scale to a zero size, the overall cluster size does not scale down to zero nodes (as at least one node is always required to run system Pods)
Cluster autoscaler has following limitations:
- When scaling down, cluster autoscaler supports a graceful termination period for a Pod of up to 10 minutes. A Pod is always killed after a maximum of 10 minutes, even if the Pod is configured with a higher grace period.
Note: Every change you make to the cluster autoscaler causes the Kubernetes master to restart, which takes several minutes to complete.
However, there are cases mentioned in the FAQ that can prevent CA from removing a node:
What types of pods can prevent CA from removing a node?
- Pods with restrictive PodDisruptionBudget.
- Kube-system pods that:
  - are not run on the node by default, *
  - don't have PDB or their PDB is too restrictive (since CA 0.6).
- Pods that are not backed by a controller object (so not created by deployment, replica set, job, stateful set etc). *
- Pods with local storage. *
- Pods that cannot be moved elsewhere due to various constraints (lack of resources, non-matching node selectors or affinity, matching anti-affinity, etc)
*Unless the pod has the following annotation (supported in CA 1.0.3 or later):
"cluster-autoscaler.kubernetes.io/safe-to-evict": "true"
How can I scale my cluster to just 1 node?
Prior to version 0.6, Cluster Autoscaler was not touching nodes that were running important kube-system pods like DNS, Heapster, Dashboard etc. If these pods landed on different nodes, CA could not scale the cluster down and the user could end up with a completely empty 3 node cluster. In 0.6, we added an option to tell CA that some system pods can be moved around. If the user configures a PodDisruptionBudget for the kube-system pod, then the default strategy of not touching the node running this pod is overridden with PDB settings. So, to enable kube-system pods migration, one should set minAvailable to 0 (or <= N if there are N+1 pod replicas.) See also "I have a couple of nodes with low utilization, but they are not scaled down. Why?"
How can I scale a node group to 0?
From CA 0.6 for GCE/GKE and CA 0.6.1 for AWS, it is possible to scale a node group to 0 (and obviously from 0), assuming that all scale-down conditions are met.
For AWS, if you are using nodeSelector, you need to tag the ASG with a node-template key "k8s.io/cluster-autoscaler/node-template/label/".
For example, for a node label of foo=bar, you would tag the ASG with:
{
"ResourceType": "auto-scaling-group",
"ResourceId": "foo.example.com",
"PropagateAtLaunch": true,
"Value": "bar",
"Key": "k8s.io/cluster-autoscaler/node-template/label/foo"
}
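One way to apply a tag like this is with the AWS CLI. The sketch below assumes the CLI is installed and configured with permission to modify the ASG. The function name is made up, and the ASG name, label key and label value are passed in as arguments.

```shell
# Tag an Auto Scaling group with a node-template label so Cluster Autoscaler
# knows which labels nodes from this group will have when scaling from 0.
# $1 = ASG name, $2 = node label key, $3 = node label value
tag_asg_node_label() {
  aws autoscaling create-or-update-tags --tags \
    "ResourceId=$1,ResourceType=auto-scaling-group,Key=k8s.io/cluster-autoscaler/node-template/label/$2,Value=$3,PropagateAtLaunch=true"
}
```

For the foo=bar example above, that would be `tag_asg_node_label foo.example.com foo bar`.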