When I change the replicas: x value in my .yaml file, I can see that GKE Autopilot starts or stops pods to match it. But what happens if the load on my deployment gets too big? Will it autoscale the number of pods and nodes to handle the traffic, and then scale back down to the value specified in replicas once the request load drops again?
Basically, I'm asking how horizontal autoscaling works in Autopilot.
And how do I get a minimum of 2 pod replicas that can horizontally autoscale in Autopilot?
By default, GKE Autopilot will not scale the replica count beyond what you specified. This is the default behavior of Kubernetes in general.
If you want automatic scaling of pods, you have to use the Horizontal Pod Autoscaler (HPA), which is supported in Autopilot.
If you deploy an HPA to scale your workload up and down, Autopilot will scale the nodes up and down automatically. This is transparent to you, as the nodes are managed by Google.
In other words, Autopilot autoscales only the nodes by default; scaling the Deployment itself with an HPA is up to you.
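To get a minimum of 2 replicas that can scale out under load, the HPA mentioned above can be written as a manifest. The following is a minimal sketch in Python that builds an autoscaling/v2 HorizontalPodAutoscaler object and prints it as JSON, which kubectl apply -f accepts as well as YAML. The Deployment name my-deployment, the maximum of 10 replicas and the 70% CPU target are assumptions chosen for illustration, not values from the question.

```python
import json


def build_hpa_manifest(deployment, min_replicas, max_replicas, cpu_percent):
    """Return an autoscaling/v2 HorizontalPodAutoscaler manifest as a dict."""
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": f"{deployment}-hpa"},
        "spec": {
            # The workload whose replica count the HPA will manage.
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            # Scale on average CPU utilization, measured against CPU requests.
            "metrics": [
                {
                    "type": "Resource",
                    "resource": {
                        "name": "cpu",
                        "target": {
                            "type": "Utilization",
                            "averageUtilization": cpu_percent,
                        },
                    },
                }
            ],
        },
    }


# Hypothetical values: Deployment "my-deployment", 2 to 10 replicas, 70% CPU.
manifest = build_hpa_manifest("my-deployment", 2, 10, 70)
print(json.dumps(manifest, indent=2))
```

Saving the output to a file and passing it to kubectl apply -f should create the autoscaler. Because the utilization target is measured against the pods' CPU requests, the containers in the Deployment need CPU requests set for this metric to work.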
Autopilot: GKE provisions and manages the cluster's underlying
infrastructure, including nodes and node pools, giving you an
optimized cluster with a hands-off experience.
Pod-level scaling for the Deployment (HPA and VPA) is something you have to configure yourself.
Pre-configured: Autopilot handles all the scaling and configuring of
your nodes.
Default: You configure Horizontal Pod autoscaling (HPA). You configure
Vertical Pod autoscaling (VPA).
GKE manages the scaling up and down of the nodes in your node pools, so you don't have to worry about the infrastructure. You just deploy the application with HPA and VPA autoscaling configured.
You can read more about the options here: https://cloud.google.com/kubernetes-engine/docs/concepts/autopilot-overview#comparison
Related
I wanted to know whether Knative on EKS supports node autoscaling. If yes, do we need to set up the Cluster Autoscaler, or does Knative scale the nodes itself?
I tried to trigger node autoscaling by increasing the number of pods, but it did not work. My question is: does Knative scale nodes automatically, or do we have to set up an external plugin?
Knative supports only two types of autoscaler, and both scale pods, not nodes:
KPA - Knative Pod Autoscaler
HPA - Horizontal Pod Autoscaler
So for node autoscaling, the Cluster Autoscaler has to be installed on the EKS cluster.
Ref : https://knative.dev/docs/serving/autoscaling/autoscaler-types/
I have to turn off my service in production and turn it on again after a small period (doing a DB migration).
I know I can use kubectl scale deployment mydeployment --replicas=0. This service uses a HorizontalPodAutoscaler (HPA), so how would I go about resetting it to scale according to the HPA again?
Thanks in advance :)
As Gari Singh suggested, the HPA will not scale up from 0, so once you are ready to reactivate your deployment, just run kubectl scale deployment mydeployment --replicas=1 and the HPA will then take over again.
In Kubernetes, a HorizontalPodAutoscaler automatically updates a workload resource (such as a Deployment or StatefulSet), with the aim of automatically scaling the workload to match demand.
Horizontal scaling means that the response to increased load is to deploy more Pods. This is different from vertical scaling, which for Kubernetes would mean assigning more resources (for example: memory or CPU) to the Pods that are already running for the workload.
If the load decreases, and the number of Pods is above the configured minimum, the HorizontalPodAutoscaler instructs the workload resource (the Deployment, StatefulSet, or other similar resource) to scale back down.
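The scaling decision described above can be sketched in a few lines of Python. This is a simplified model of the replica calculation given in the Kubernetes documentation (desired = ceil(current * currentMetric / targetMetric)), not the controller's actual code: it leaves out stabilization windows, pod readiness and missing metrics. The function name desired_replicas and the sample numbers are hypothetical; the 0.1 tolerance mirrors the documented default.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas, max_replicas, tolerance=0.1):
    """Simplified model of how an HPA picks a replica count."""
    # A workload scaled to 0 is left alone: the HPA does not scale up from 0.
    if current_replicas == 0:
        return 0

    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)

    # Never go below the configured minimum or above the maximum.
    return max(min_replicas, min(max_replicas, desired))


# Hypothetical HPA with minReplicas=2, maxReplicas=10 and a 60% CPU target.
print(desired_replicas(4, 90, 60, 2, 10))  # load is high  -> 6
print(desired_replicas(4, 10, 60, 2, 10))  # load is low   -> 2 (the minimum)
print(desired_replicas(0, 0, 60, 2, 10))   # scaled to 0   -> 0 (HPA inactive)
```

The last call shows why a deployment scaled to 0 stays at 0 until it is manually scaled back to at least 1 replica.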
Refer to this link on Horizontal Pod Autoscaling for more detailed information.
I am deploying a service to Azure Kubernetes Service.
The Horizontal Pod Autoscaler scales the number of pods, whereas the Cluster Autoscaler scales the number of nodes based on the number of pending pods. If my understanding is correct, if I don't set up autoscaling in my deployment file, the HPA won't get triggered, and only one pod will run; therefore, the CA won't get triggered either.
My question is - is there a scenario in AKS where the CA would get triggered, even without setting autoscaling in my deployment file?
Cluster autoscaler is typically used together with the horizontal pod autoscaler. The Horizontal Pod Autoscaler increases or decreases the number of pods based on application demand, and the cluster autoscaler adjusts the number of nodes as needed to run those additional pods accordingly.
If your deployment cannot scale automatically via the HPA, and you also don't manually increase the number of pods to the point where additional pods can no longer be scheduled because the nodes lack resources, then the CA will not be triggered. So the answer is no.
You might also find this document from the official Azure docs helpful.
In my 1-node AKS cluster, I deploy multiple Job resources (kind: Job) that terminate once their task is completed. I have enabled the Cluster Autoscaler to add a second node when too many jobs are consuming the first node's memory. However, it only scales out after a job/pod cannot be created due to lack of memory.
In my job yaml I also defined the resource memory limit and request.
Is there a possibility to configure the Cluster Autoscaler to scale out proactively when it reaches a certain memory threshold (e.g., 70% of the node memory) not just when it cannot deploy a job/pod?
Kubernetes has 3 autoscaling mechanisms: the Horizontal Pod Autoscaler and the Vertical Pod Autoscaler, which can both be driven by usage metrics, and the Cluster Autoscaler.
As per Cluster Autoscaler Documentation:
Cluster Autoscaler is a tool that automatically adjusts the size of the Kubernetes cluster when one of the following conditions is true:
there are pods that failed to run in the cluster due to insufficient resources.
there are nodes in the cluster that have been underutilized for an extended period of time and their pods can be placed on other existing nodes.
The AKS Cluster Autoscaler documentation notes that the CA is a Kubernetes component, not something AKS-specific:
The cluster autoscaler is a Kubernetes component. Although the AKS cluster uses a virtual machine scale set for the nodes, don't manually enable or edit settings for scale set autoscale in the Azure portal or using the Azure CLI. Let the Kubernetes cluster autoscaler manage the required scale settings.
The Azure documentation, in About the cluster autoscaler, explains that AKS clusters can scale in one of two ways:
The cluster autoscaler watches for pods that can't be scheduled on nodes because of resource constraints. The cluster then automatically increases the number of nodes.
The horizontal pod autoscaler uses the Metrics Server in a Kubernetes cluster to monitor the resource demand of pods. If an application needs more resources, the number of pods is automatically increased to meet the demand.
On AKS you can tune the autoscaler profile to change some of the default values. More detail can be found in Using the autoscaler profile.
I would suggest reading the Understanding Kubernetes Cluster Autoscaling article, which explains how the CA works. Its Limitations section says:
The cluster autoscaler doesn’t take into account actual CPU/GPU/Memory usage, just resource requests and limits. Most teams overprovision at the pod level, so in practice we see aggressive upscaling and conservative downscaling.
Conclusion
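The Cluster Autoscaler doesn't consider actual resource usage, only resource requests, so it cannot be configured to scale out at a usage threshold such as 70% of node memory. It adds a node only once a pod cannot be scheduled. A CA scale-down or scale-up might take a few minutes, depending on the cloud provider.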
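To make the trigger condition concrete, here is a minimal Python sketch of the check that matters for scale-out: whether a new pod's memory request still fits into the unrequested memory of some node. It is a toy model under stated assumptions, not the Cluster Autoscaler's or the scheduler's real logic (which also considers CPU, affinity, taints and more). The Node class, the helper names and all numbers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    allocatable_mib: int
    # Memory requests (MiB) of pods already placed on this node.
    requested_mib: list = field(default_factory=list)

    def free_mib(self):
        return self.allocatable_mib - sum(self.requested_mib)


def requested_fraction(node):
    """Share of the node's allocatable memory that is already requested."""
    return sum(node.requested_mib) / node.allocatable_mib


def is_unschedulable(pod_request_mib, nodes):
    """True if no node has enough unrequested memory for the pod."""
    return all(node.free_mib() < pod_request_mib for node in nodes)


# One node with 4096 MiB allocatable, running three jobs of 1024 MiB each.
node = Node(allocatable_mib=4096, requested_mib=[1024, 1024, 1024])

# 75% of memory is requested, which is above a hypothetical 70% threshold,
# yet a fourth job still fits, so nothing would trigger a scale-out.
print(requested_fraction(node))          # 0.75
print(is_unschedulable(1024, [node]))    # False

# Once the fourth job is placed, a fifth one no longer fits anywhere.
# Only now is there a pending pod for the autoscaler to react to.
node.requested_mib.append(1024)
print(is_unschedulable(1024, [node]))    # True
```

Note that actual memory consumption never appears in the check: only the declared requests do, which is why setting accurate requests on the jobs matters.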
I need to create a K8s autoscaling setup for a Spark application that will run as Docker images both on-premise and on AWS. By scaling, I mean scaling nodes up and down from on-premise to the AWS cloud, using the Cluster Autoscaler or some other means.
I have browsed many articles on how to set up a K8s cluster on AWS and on HPA and CA scaling, but could not find concrete directions to follow.
I am looking for any direction that can help me understand where I should start and which steps to follow to set up such a K8s cluster.
Regarding Cluster Autoscaler:
Cluster Autoscaler is a tool that automatically adjusts the size of the Kubernetes cluster when one of the following conditions is true:
- there are pods that failed to run in the cluster due to insufficient resources,
- there are nodes in the cluster that have been underutilized for an extended period of time and their pods can be placed on other existing nodes.
The cluster autoscaler on Azure dynamically scales Kubernetes worker nodes. It runs as a deployment in your cluster.
This README will help you get cluster autoscaler running on your Azure Kubernetes cluster.
Regarding HPA:
The Horizontal Pod Autoscaler automatically scales the number of pods in a replication controller, deployment or replica set based on observed CPU utilization or other custom metrics. HPA normally fetches metrics from a series of aggregated APIs:
- metrics.k8s.io
- custom.metrics.k8s.io
- external.metrics.k8s.io
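The metrics.k8s.io API is usually provided by metrics-server, which needs to be launched separately. If you wish to scale on something more than CPU or memory utilization, the custom and external metrics APIs additionally need a metrics adapter. More info can be found here and here.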
How to make it work?
HPA is supported by kubectl out of the box:
kubectl create - creates a new autoscaler
kubectl get hpa - lists your autoscalers
kubectl describe hpa - gets a detailed description of autoscalers
kubectl delete - deletes an autoscaler
Example:
kubectl autoscale rs foo --min=2 --max=5 --cpu-percent=80 creates an autoscaler for the ReplicaSet foo, with target CPU utilization set to 80% and the number of replicas between 2 and 5.
Here is detailed documentation on how to use the kubectl autoscale command.