I'm defining this autoscaler with kubernetes and GCE and I'm wondering what exactly should I specify for targetCPUUtilizationPercentage. That target points to what exactly? Is it the total CPU in my cluster? When the pods referenced in this autoscaler consume more than targetCPUUtilizationPercentage what happens?
The CPU utilization is the average CPU usage of all pods in a deployment over the last minute divided by the requested CPU of this deployment. If the mean of the pods' CPU utilization is higher than the target you defined, then your replicas will be adjusted.
You can read more about this topic here.
This is the average CPU utilization of all the pods. So if you have set the CPU resource request to 200 and targetCPUUtilizationPercentage to 80%, then once average usage passes the threshold of 160, the HPA scales out and creates a new replica.
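The arithmetic in the two answers above can be sketched in a few lines of Python. This is only an illustration, not the controller's code: the function names and the per-pod usage numbers are made up, and the replica rule is the one given in the Kubernetes HPA documentation, desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue).

```python
import math

def cpu_utilization_percent(usage_millicores, request_millicores):
    """Total usage across pods divided by total requests, as a percentage."""
    return sum(usage_millicores) * 100 / sum(request_millicores)

def desired_replicas(current_replicas, current_utilization, target_utilization):
    """Replica rule from the HPA docs: ceil(current * currentMetric / targetMetric)."""
    return math.ceil(current_replicas * current_utilization / target_utilization)

# Hypothetical deployment: 2 pods, each requesting 200m CPU, target 80%.
requests = [200, 200]
usage = [190, 170]  # illustrative per-pod usage in millicores

utilization = cpu_utilization_percent(usage, requests)
print(utilization)                           # 90.0
print(desired_replicas(2, utilization, 80))  # 3
```

With 90% measured against an 80% target, the sketch asks for a third replica. At exactly 80% it would keep the replica count at 2.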
I have an EKS cluster running with cluster-autoscaler version 1.21.2 deployed. When I ran kubectl top nodes, I found a node at 5% CPU and 21% memory utilization. But in the cluster-autoscaler pod log, I see the message below for the same node:
Node XXXX is not suitable for removal - cpu utilization too big (0.663130)
I'm now confused about how the cluster autoscaler calculates this value and why the node is not scaled down. BTW, I used the default config of --scale-down-utilization-threshold=0.5
We stumbled upon the same issue and realized that the CPU utilization value (in your case 66.31%) roughly matches the amount of CPU requested by the pods/containers running on the node.
Remember: the CPU (and other resources) requested by a pod/container is guaranteed to it.
This is why it sounds logical to us: judging by its actual CPU usage the node might be idle, but from a Kubernetes autoscaling perspective the node is using 66% of its CPU.
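A rough sketch of that requests-based view, in Python. It assumes the autoscaler divides the sum of pod requests by the node's allocatable resources and takes the larger of the CPU and memory ratios. The node size and pod requests are invented for illustration, and node_utilization is a hypothetical helper, not a cluster-autoscaler function.

```python
def node_utilization(pod_requests, allocatable):
    """Requests-based utilization: the larger of the CPU and memory request ratios."""
    cpu = sum(p["cpu_m"] for p in pod_requests) / allocatable["cpu_m"]
    mem = sum(p["mem_mi"] for p in pod_requests) / allocatable["mem_mi"]
    return max(cpu, mem)

# Hypothetical node with 4 CPU / 16000Mi allocatable and two pods scheduled on it.
allocatable = {"cpu_m": 4000, "mem_mi": 16000}
pods = [
    {"cpu_m": 2000, "mem_mi": 2000},
    {"cpu_m": 650, "mem_mi": 1000},
]

threshold = 0.5  # same value as --scale-down-utilization-threshold=0.5
util = node_utilization(pods, allocatable)
print(util)              # 0.6625
print(util < threshold)  # False: not a scale-down candidate
```

Actual usage as reported by kubectl top nodes does not appear anywhere in this calculation, which is why a nearly idle node can still be reported as too busy to remove.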
I set an HPA for my deployment/app with, for example, a CPU target of 80%.
My app deployment has two containers: one is the app serving traffic, the other is the automatically injected istio-proxy.
When I checked the HPA while running traffic, I found something unexpected in the result.
The CPU request of istio-proxy is 2G.
The CPU request of app is 4G.
The CPU consumed by istio-proxy is 1G.
The CPU consumed by app is 2G.
So I expected the HPA utilization for this pod (including both containers) to be (1+2)/(2+4) = 50%.
But the actual result is close to (1+2)/4 = 75%.
It seems the istio-proxy CPU request is excluded when the HPA calculates CPU utilization.
As far as I know, k8s gets the CPU requests from the deployment, but in this sidecar auto-injection case the deployment YAML doesn't contain any istio-proxy container information.
I guess that's why the istio-proxy CPU request is excluded.
But is that the expected behavior or a bug?
I think as of 1.19, the HPA works on an average value across all containers in the pods. The exact logic is here: https://github.com/kubernetes/kubernetes/blob/v1.9.0/pkg/controller/podautoscaler/metrics/utilization.go#L49
currentUtilization = int32((metricsTotal * 100) / requestsTotal)
As per the above logic, the HPA calculates pod CPU utilization as the total CPU usage of all containers in the pod divided by the total request.
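To see how the quoted line behaves with the numbers used in the question's formulas (1 and 2 consumed, 2 and 4 requested), here is a small Python sketch. The container names come from the question; treating the values as millicores is an assumption. The second calculation only reproduces the observed 75% by leaving the sidecar request out of the denominator. It does not confirm that this is what the controller actually did.

```python
def current_utilization(metrics_total, requests_total):
    """Integer percentage, mirroring int32((metricsTotal * 100) / requestsTotal)."""
    return (metrics_total * 100) // requests_total

# Per-container values for one pod (assumed to be millicores).
usage = {"app": 2000, "istio-proxy": 1000}
requests = {"app": 4000, "istio-proxy": 2000}

# All containers counted on both sides of the ratio.
all_containers = current_utilization(sum(usage.values()), sum(requests.values()))

# Sidecar usage counted, but its request missing from the denominator.
missing_sidecar_request = current_utilization(sum(usage.values()), requests["app"])

print(all_containers)           # 50
print(missing_sidecar_request)  # 75
```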
I am a newbie to the Kubernetes world. Please excuse me if I am getting anything wrong.
I understand that pod replication is handled by k8s itself. We can also set CPU and memory usage for pods. But is it possible to change the replication criteria based on memory and CPU usage? For example, if I want a pod to replicate when its memory/CPU usage reaches 70%.
Can we do it using metrics collected by Prometheus etc ?
You can use the Horizontal Pod Autoscaler. From the docs:
The Horizontal Pod Autoscaler automatically scales the number of Pods
in a replication controller, deployment, replica set or stateful set
based on observed CPU utilization (or, with custom metrics support, on
some other application-provided metrics). Note that Horizontal Pod
Autoscaling does not apply to objects that can't be scaled, for
example, DaemonSets.
The Horizontal Pod Autoscaler is implemented as a Kubernetes API
resource and a controller. The resource determines the behavior of the
controller. The controller periodically adjusts the number of replicas
in a replication controller or deployment to match the observed
average CPU utilization to the target specified by user
An example from the docs:
The following command will create a Horizontal Pod Autoscaler that maintains between 1 and 10 replicas of the Pods. HPA will increase and decrease the number of replicas to maintain an average CPU utilization across all Pods of 50%.
kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
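For comparison, a declarative equivalent of that command can be generated as a manifest. The Python sketch below builds one, assuming the cluster serves the autoscaling/v2 API and that the target is an apps/v1 Deployment. The helper name cpu_hpa_manifest is hypothetical.

```python
import json

def cpu_hpa_manifest(deployment, cpu_percent, min_replicas, max_replicas):
    """Build an HPA object that targets average CPU utilization for a Deployment."""
    return {
        "apiVersion": "autoscaling/v2",  # assumption: cluster serves this API version
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": deployment},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [
                {
                    "type": "Resource",
                    "resource": {
                        "name": "cpu",
                        "target": {
                            "type": "Utilization",
                            "averageUtilization": cpu_percent,
                        },
                    },
                }
            ],
        },
    }

manifest = cpu_hpa_manifest("php-apache", 50, 1, 10)
print(json.dumps(manifest, indent=2))
```

Since kubectl accepts JSON as well as YAML, the printed output could be piped to kubectl apply -f - to create the same autoscaler.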
I have a node pool for one deployment with 200-1000 pods. They're set with a CPU based HPA.
When the HPA scales down the deployment, it removes pods randomly, and eventually, I have an under-utilized node pool. The nodes aren't scaled down correctly because they still have at least one pod running.
I tried to find a solution and failed. Possible solutions, in my opinion:
Make the HPA aware of node utilization.
A PodDisruptionBudget for nodes?
Drain node if its CPU utilization is under a threshold.
Any help will be much appreciated.
I have a pod running with two containers. The actual application is running in one of the containers (container-app) and the other one is the proxy container (container-proxy). I enabled the Horizontal Pod Autoscaler (HPA) for CPU usage percentage but, as the HPA documentation states, the metrics of both containers are included in the calculation.
I want to exclude the CPU metrics of container-proxy from HPA calculation because I want only application container to be the scaling element for the pod.
Is there any way to exclude some containers metrics from HPA calculation for multi-container pods?
The cluster autoscaler works on a per-node pool basis. Horizontal Pod Autoscaler monitors CPU utilization of the pods and scales the number of replicas automatically. It provides immediate efficiency and capacity when needed, operates within user-defined minimum/maximum bounds, and allows users to set it and forget it. The design of the horizontal autoscaler is for pods not for the individual container.
The HPA calculates pod CPU utilization as the total CPU usage of all containers in the pod divided by the total request. It does not exclude any container's metrics from the calculation when a pod has multiple containers.
Kubernetes 1.20+ supports container resource metrics, which let the HPA target the utilization of an individual container and so exclude a specific container of a pod from consideration.
https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#container-resource-metrics
type: ContainerResource
containerResource:
  name: cpu
  container: application
  target:
    type: Utilization
    averageUtilization: 60
It's an alpha feature in 1.20 though (behind the HPAContainerMetrics feature gate), so it is not available without turning on alpha features in Kubernetes.
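To illustrate why targeting a single container can change the scaling decision, here is a Python sketch comparing the whole-pod ratio with the ratio for the application container only. All usage and request numbers, the proxy container name, and both helper functions are invented for the example.

```python
def pod_utilization(pods):
    """Whole-pod view: usage of all containers divided by all requests, in percent."""
    usage = sum(c["usage_m"] for p in pods for c in p.values())
    request = sum(c["request_m"] for p in pods for c in p.values())
    return usage * 100 / request

def container_utilization(pods, container):
    """Per-container view: only the named container counts, in percent."""
    usage = sum(p[container]["usage_m"] for p in pods)
    request = sum(p[container]["request_m"] for p in pods)
    return usage * 100 / request

# Two hypothetical pods: a busy application container and a mostly idle proxy.
pods = [
    {"application": {"usage_m": 700, "request_m": 1000},
     "proxy": {"usage_m": 50, "request_m": 500}},
    {"application": {"usage_m": 900, "request_m": 1000},
     "proxy": {"usage_m": 50, "request_m": 500}},
]

target = 60  # averageUtilization from the snippet above
print(round(pod_utilization(pods), 1))             # 56.7
print(container_utilization(pods, "application"))  # 80.0
```

Against a target of 60, the whole-pod figure stays below the target because the idle proxy's request dilutes it, while the application container alone is well above it.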