How to exclude some containers' metrics in Kubernetes Horizontal Pod Autoscaling

I have a pod running with two containers. The actual application runs in one container (container-app) and the other is a proxy container (container-proxy). I enabled the Horizontal Pod Autoscaler (HPA) on CPU usage percentage, but as the HPA documentation states, the metrics of both containers are included in the calculation.
I want to exclude the CPU metrics of container-proxy from the HPA calculation, because only the application container should drive the scaling of the pod.
Is there any way to exclude some containers' metrics from the HPA calculation for multi-container pods?

The cluster autoscaler works on a per-node-pool basis, while the Horizontal Pod Autoscaler monitors the CPU utilization of the pods and scales the number of replicas automatically. It provides immediate efficiency and capacity when needed, operates within user-defined minimum/maximum bounds, and lets users set it and forget it. The horizontal autoscaler is designed for pods, not for individual containers.
HPA calculates pod CPU utilization as the total CPU usage of all containers in the pod divided by the total request. It does not exclude individual containers' metrics from the calculation when a pod has multiple containers.
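To see concretely why this matters, here is a worked example with made-up numbers: suppose container-app requests 200m CPU and is currently using 180m, while container-proxy requests 100m and is using 20m. The pod-level utilization the HPA computes is then
(180m + 20m) / (200m + 100m) = 200m / 300m ≈ 67%
even though the application container alone is at 180m / 200m = 90%, so the proxy's low usage dilutes the signal you actually care about.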

Kubernetes 1.20+ supports container resource metrics, so the utilization target can be set per container, which allows a specific container of a pod to be excluded from consideration.
https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#container-resource-metrics
type: ContainerResource
containerResource:
  name: cpu
  container: application
  target:
    type: Utilization
    averageUtilization: 60
It's an alpha feature though, so it is not available without turning on alpha feature gates in Kubernetes.
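For illustration, here is a minimal sketch of a complete HPA manifest using this metric type, assuming a cluster where the stable autoscaling/v2 API is available (Kubernetes 1.23+; on 1.20–1.22 the equivalent lives in autoscaling/v2beta2). The HPA name, the Deployment name my-app, and the replica bounds are placeholders; the relevant part is the ContainerResource entry with container: container-app, which makes the HPA ignore container-proxy's CPU usage.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa                    # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app                   # placeholder: your Deployment
  minReplicas: 2                   # example bounds, tune for your workload
  maxReplicas: 10
  metrics:
  - type: ContainerResource
    containerResource:
      name: cpu
      container: container-app    # only this container's CPU is considered
      target:
        type: Utilization
        averageUtilization: 60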

Related

how to configure k8s autoscaling?

How should I configure it to scale automatically when the pod's average CPU usage exceeds 20%? If the number of pods the nodes can hold is exceeded while autoscaling, how do I expand the nodes horizontally? And is there a way to scale pods automatically without specifying max and min?
You need to deploy a cluster autoscaler and configure a horizontal pod autoscaler for your workload.
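As a minimal sketch of the HPA side, assuming a Deployment named web (a placeholder) and the autoscaling/v2 API, a 20% average-CPU target would look like this; note that maxReplicas is a required field, so the HPA cannot be created without an upper bound, while minReplicas defaults to 1 if omitted.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa                   # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web                     # placeholder: your Deployment
  minReplicas: 1
  maxReplicas: 10                 # required: the HPA needs an upper bound
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 20    # scale out when average pod CPU exceeds 20% of the request
Expanding the nodes is then the cluster autoscaler's job: when newly created pods no longer fit, it adds nodes within the node pool's own min/max limits.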

does GKE autopilot auto scale both pods and nodes?

When I change replicas: x in my .yaml file I can see GKE Autopilot bring pods up or down depending on the value, but what will happen if the load on my deployment gets too big? Will it then autoscale the number of pods and nodes to handle the traffic, and then scale back down to the value specified in replicas when the request load is reduced again?
I'm basically asking how Autopilot horizontal autoscaling works,
and how do I get a minimum of 2 pod replicas that can horizontally autoscale in Autopilot?
GKE Autopilot by default will not scale the replica count beyond what you specified. This is the default behavior of Kubernetes in general.
If you want automatic scaling you have to use a Horizontal Pod Autoscaler (HPA), which is supported in Autopilot.
If you deploy an HPA to scale your workload up and down, Autopilot will scale the nodes up and down automatically, and that is transparent to you since the nodes are managed by Google.
GKE autoscales only the nodes by default; you have to take care of scaling your deployment yourself, e.g. with an HPA.
Autopilot: GKE provisions and manages the cluster's underlying infrastructure, including nodes and node pools, giving you an optimized cluster with a hands-off experience.
We need to configure both scaling options for the deployment: VPA and HPA.
Pre-configured: Autopilot handles all the scaling and configuring of your nodes.
Default: You configure Horizontal Pod autoscaling (HPA) and Vertical Pod autoscaling (VPA).
GKE will manage the scaling up/down of your nodes in the node pools; you don't have to worry about the infrastructure, you just have to deploy the application with HPA and VPA autoscaling configured.
You can read more about the options here: https://cloud.google.com/kubernetes-engine/docs/concepts/autopilot-overview#comparison
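For the "minimum of 2 replicas" part of the question, one option is to create the HPA imperatively; the deployment name my-app, the 60% target, and the maximum of 10 below are placeholders:
kubectl autoscale deployment my-app --cpu-percent=60 --min=2 --max=10
Autopilot will then add or remove nodes as needed to fit whatever replica count the HPA chooses.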

Can Horizontal Pod Scaling work with one node only?

I'm new to Kubernetes, and I have a doubt about horizontal pod autoscaling. Can I apply HPA with just one node? If so, what are the benefits of HPA with only one node?
If I use the metrics below, the target says averageUtilization: 50% of CPU. Does that imply that I need a new node once that value is reached?
metrics:
- type: Resource
  resource:
    name: cpu
    target:
      type: Utilization
      averageUtilization: 50
Any advice?
Here are some notes that might help you to sort things out:
Yes, you can use horizontal pod autoscaling on one node only.
The benefit of running multiple pods is parallelism: More instances of your app can handle more load - in that regard it doesn't matter if you run the pods on one or several nodes.
But if you have more pods of your application, you might end up in a situation where you need additional nodes to handle the load.
To determine how many pods can run on one node, Kubernetes uses the concept of resource requests and limits.
HPA will spawn new pods if the actual utilization of your pods hits the target utilization, but it does not ensure that your node can handle more pods; you control that with resource requests and limits.
Scaling up the nodes of your cluster is not handled by HPA; you need the Kubernetes cluster autoscaler for that.
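As a sketch of the requests/limits part, the container spec inside your Deployment might declare something like the following (the names and numbers are hypothetical). The scheduler uses the requests to decide how many pods fit on a node, and the HPA measures averageUtilization against the CPU request:
containers:
- name: app                 # hypothetical container name
  image: my-app:1.0         # placeholder image
  resources:
    requests:
      cpu: 250m             # HPA utilization is usage divided by this request
      memory: 256Mi
    limits:
      cpu: 500m
      memory: 512Mi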

Kubernetes - Set Pod replication criteria based on memory and cpu usage

I am a newbie to the Kubernetes world, so please excuse me if I get anything wrong.
I understand that pod replication is handled by k8s itself. We can also set CPU and memory requests/limits for pods. But is it possible to base the replication criteria on memory and CPU usage? For example, I want a pod to replicate when its memory/CPU usage reaches 70%.
Can we do it using metrics collected by Prometheus etc.?
You can use the Horizontal Pod Autoscaler. From the docs:
The Horizontal Pod Autoscaler automatically scales the number of Pods in a replication controller, deployment, replica set or stateful set based on observed CPU utilization (or, with custom metrics support, on some other application-provided metrics). Note that Horizontal Pod Autoscaling does not apply to objects that can't be scaled, for example, DaemonSets.
The Horizontal Pod Autoscaler is implemented as a Kubernetes API resource and a controller. The resource determines the behavior of the controller. The controller periodically adjusts the number of replicas in a replication controller or deployment to match the observed average CPU utilization to the target specified by the user.
An example from the docs:
The following command will create a Horizontal Pod Autoscaler that maintains between 1 and 10 replicas of the Pods. HPA will increase and decrease the number of replicas to maintain an average CPU utilization across all Pods of 50%.
kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
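For the 70% memory/CPU example in the question, a declarative equivalent could list both resources as metrics; when multiple metrics are specified, the HPA picks the largest replica count that any of them proposes. The names and bounds below are placeholders, and the autoscaling/v2 API is assumed:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app                  # placeholder: your Deployment
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70    # scale out when average CPU exceeds 70% of the request
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 70    # likewise for memory
Prometheus metrics can also drive the HPA, but that requires a custom metrics adapter (e.g. prometheus-adapter) rather than metrics-server alone.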

What metrics can be fetched from metrics-server for Horizontal Pod Autoscaling

I am working on a use case related to Horizontal Pod Autoscaling. I am able to fetch memory and CPU usage from the metrics server in order to decide on scale-out (I found this after reading multiple blogs).
I wish to know whether any other standard metrics, such as throughput, disk usage, resource consumption, etc., can be fetched from the metrics server. I have not been able to find anything on this.
You can find all the metrics available for each resource type in the documentation of kube-state-metrics.
Also, as mentioned in the Horizontal Pod Autoscaler documentation, you can use custom metrics:
The Horizontal Pod Autoscaler automatically scales the number of pods in a replication controller, deployment or replica set based on observed CPU utilization (or, with custom metrics support, on some other application-provided metrics). Note that Horizontal Pod Autoscaling does not apply to objects that can't be scaled, for example, DaemonSets.
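To check what the Metrics API actually serves on your cluster, you can query it directly (assuming metrics-server is installed); it reports only CPU and memory for nodes and pods, so metrics like throughput or disk usage have to come from a custom or external metrics adapter instead:
# Human-readable summary of the resource metrics metrics-server reports
kubectl top nodes
kubectl top pods --all-namespaces
# Raw Metrics API output (CPU and memory only)
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/pods"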