Kubernetes - Set Pod replication criteria based on memory and CPU usage

I am a newbie to the Kubernetes world. Please excuse me if I am getting anything wrong.
I understand that pod replication is handled by k8s itself. We can also set CPU and memory usage for pods. But is it possible to change the replication criteria based on memory and CPU usage? For example, I want a pod to replicate when its memory/CPU usage reaches 70%.
Can we do it using metrics collected by Prometheus etc.?

You can use the Horizontal Pod Autoscaler. From the docs:
The Horizontal Pod Autoscaler automatically scales the number of Pods
in a replication controller, deployment, replica set or stateful set
based on observed CPU utilization (or, with custom metrics support, on
some other application-provided metrics). Note that Horizontal Pod
Autoscaling does not apply to objects that can't be scaled, for
example, DaemonSets.
The Horizontal Pod Autoscaler is implemented as a Kubernetes API
resource and a controller. The resource determines the behavior of the
controller. The controller periodically adjusts the number of replicas
in a replication controller or deployment to match the observed
average CPU utilization to the target specified by user
An example from the docs:
The following command will create a Horizontal Pod Autoscaler that maintains between 1 and 10 replicas of the Pods. HPA will increase and decrease the number of replicas to maintain an average CPU utilization across all Pods of 50%.
kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
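The command above only covers CPU. For the memory case in the question, a manifest along these lines might work. It is a minimal sketch, assuming the autoscaling/v2 API is available in the cluster, metrics-server is installed, and the containers of the target Deployment declare CPU and memory requests (utilization is computed relative to requests). The php-apache name is reused from the example above; the 70% targets come from the question.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:            # the workload whose replica count the HPA adjusts
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # percent of the pods' CPU requests
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 70   # percent of the pods' memory requests
```

With several metrics listed, the HPA computes a desired replica count for each metric and uses the largest one.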

Related

What is the relationship between the HPA and ReplicaSet in Kubernetes?

I can't seem to find an answer to this, but what is the relationship between an HPA and a ReplicaSet? From what I know, we define a Deployment object that specifies the number of replicas; the Deployment creates the RS, and the RS is responsible for supervising our pods and scaling them up and down.
Where does the HPA fit into this picture? Does it wrap the Deployment object? I'm a bit confused, as you define the number of replicas in the manifest for the Deployment object.
Thank you!
When we create a Deployment, it creates a ReplicaSet and the number of pods that we specified in replicas. The Deployment controls the RS, and the RS controls the pods. The HPA is another abstraction on top: it gives instructions to the Deployment, which, through the RS, makes sure the pods fulfill the required scaling.
As per the k8s docs: The Horizontal Pod Autoscaler automatically scales the number of Pods in a replication controller, deployment, replica set or stateful set based on observed CPU utilization (or, with custom metrics support, on some other application-provided metrics). Note that Horizontal Pod Autoscaling does not apply to objects that can't be scaled, for example, DaemonSets.
A brief high-level overview: it's all about controllers. Every k8s object has a controller. When a Deployment object is created, its controller creates the RS and the associated pods; the RS controls the pods, and the Deployment controls the RS. The HPA controller, on the other hand, watches the observed metrics, and when they call for more or fewer pods than are currently running, it tells the Deployment to change its replica count.
Read more in the k8s docs.
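As a rough illustration of what the HPA controller works out before it changes the Deployment's replica count, the Kubernetes documentation describes the core calculation approximately as follows. The numbers in the worked line are made up for illustration.

```latex
\[
\text{desiredReplicas} = \left\lceil \text{currentReplicas} \times
  \frac{\text{currentMetricValue}}{\text{desiredMetricValue}} \right\rceil
\]
% Illustrative numbers (assumed): 4 replicas averaging 90% CPU against a 50% target
\[
\left\lceil 4 \times \frac{90}{50} \right\rceil = \lceil 7.2 \rceil = 8
\]
```

The result is then kept within the configured minimum and maximum replica counts.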

Can Horizontal Pod Scaling work with one node only?

I'm new to Kubernetes, and I have a question about horizontal pod autoscaling. Can I apply HPA with just one node? If so, what are the benefits of HPA using one node only?
If I use the metrics below, the target says averageUtilization 50% of CPU. Does that imply that I need a new node after the value is reached?
metrics:
- type: Resource
  resource:
    name: cpu
    target:
      type: Utilization
      averageUtilization: 50
Any advice?
Here are some notes that might help you to sort things out:
Yes, you can use horizontal pod autoscaling on one node only.
The benefit of running multiple pods is parallelism: More instances of your app can handle more load - in that regard it doesn't matter if you run the pods on one or several nodes.
But if you have more pods of your application, you might end up in a situation where you need additional nodes to handle the load.
To determine how many pods can run on one node, Kubernetes uses the concept of resource requests and limits: the scheduler places a pod only on a node that still has enough unreserved capacity for the pod's requests.
HPA will spawn new pods when the actual utilization of your pods exceeds the target utilization, but it does not check whether your node can handle more pods. You need to configure this using resource requests and limits.
Scaling up the nodes of your cluster is not handled by HPA; you need to use the Kubernetes Cluster Autoscaler for that.
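To make the requests and limits mentioned above concrete, here is a minimal sketch of a Deployment whose container declares both. The my-app name, the image reference, and all of the numbers are placeholders chosen for illustration, not values from the question.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                 # hypothetical name
spec:
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: registry.example.com/my-app:1.0   # placeholder image
          resources:
            requests:          # used by the scheduler and as the HPA utilization baseline
              cpu: 250m
              memory: 128Mi
            limits:            # hard caps enforced at runtime
              cpu: 500m
              memory: 256Mi
```

With a 250m CPU request, a single node with 2 allocatable CPU cores could hold at most 8 such pods by CPU alone (2000m / 250m), ignoring anything else running on the node. Replicas beyond that stay Pending until capacity is freed or a node is added.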

Kubernetes nodes with low CPU utilisation

I have a node pool for one deployment with 200-1000 pods. They're set with a CPU based HPA.
When the HPA scales down the deployment, it removes pods randomly, and eventually, I have an under-utilized node pool. The nodes aren't scaled down correctly because they still have at least one pod running.
I tried to find a solution and failed. Possible solutions, in my opinion:
Make the HPA aware of node utilization.
A PodDisruptionBudget for nodes?
Drain a node if its CPU utilization is under a threshold.
Any help will be much appreciated.

How to exclude some containers' metrics in Kubernetes Horizontal Pod Autoscaling

I have a pod running with two containers. The actual application is running in one of the containers (container-app) and the other one is the proxy container (container-proxy). I enabled the Horizontal Pod Autoscaler (HPA) for CPU usage percentage but, as stated in the HPA documentation, the metrics of both containers are included in the calculation.
I want to exclude the CPU metrics of container-proxy from the HPA calculation because I want only the application container to be the scaling element for the pod.
Is there any way to exclude some containers' metrics from the HPA calculation for multi-container pods?
The cluster autoscaler works on a per-node-pool basis. The Horizontal Pod Autoscaler monitors the CPU utilization of the pods and scales the number of replicas automatically. It provides immediate efficiency and capacity when needed, operates within user-defined minimum/maximum bounds, and allows users to set it and forget it. The horizontal autoscaler is designed for pods, not for individual containers.
HPA calculates pod CPU utilization as the total CPU usage of all containers in the pod divided by their total request. It does not exclude any container's metrics from the calculation when a pod has multiple containers.
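A small worked example may help show why the proxy container skews the result. The usage and request figures below are assumed purely for illustration; only the container names come from the question.

```latex
% Assumed figures: container-app uses 400m of a 500m request,
% container-proxy uses 50m of a 500m request.
\[
\text{pod utilization}
  = \frac{\sum \text{container usage}}{\sum \text{container requests}}
  = \frac{400\,\text{m} + 50\,\text{m}}{500\,\text{m} + 500\,\text{m}}
  = 45\%
\]
\[
\text{container-app alone}: \quad \frac{400\,\text{m}}{500\,\text{m}} = 80\%
\]
```

With these numbers the application container is well above a 60% target, yet the pod-level figure stays below it, so no scale-up would be triggered.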
Kubernetes 1.20+ supports container resource metrics, which let the HPA target the utilisation of a single container and so leave the other containers of a pod out of the calculation.
https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#container-resource-metrics
type: ContainerResource
containerResource:
  name: cpu
  container: application
  target:
    type: Utilization
    averageUtilization: 60
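It was introduced as an alpha feature, though, so on those versions it is not available without enabling the corresponding feature gate (HPAContainerMetrics). Later releases promoted it out of alpha.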

What metrics can be fetched from metrics-server for Horizontal Pod Autoscaling

I am working on a use case related to Horizontal Pod Autoscaling. I am able to fetch memory and CPU usage from the metrics server in order to decide on scale out (found this after reading multiple blogs).
I wish to know if any of the other standard metrics such as throughput, disk usage, resource consumption etc. can be fetched from the metrics server. I have not been able to find anything on this.
The metrics server itself only exposes CPU and memory usage. For metrics about the state of cluster objects, see the kube-state-metrics documentation, which lists the metrics available for each resource type.
Also, as mentioned in the Horizontal Pod Autoscaler documentation, you can use custom metrics:
The Horizontal Pod Autoscaler automatically scales the number of pods
in a replication controller, deployment or replica set based on
observed CPU utilization (or, with custom metrics support, on some
other application-provided metrics). Note that Horizontal Pod
Autoscaling does not apply to objects that can’t be scaled, for
example, DaemonSets.
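As a hedged sketch of what scaling on a custom metric could look like, the manifest below targets a per-pod metric. It assumes that a custom metrics adapter (for example, one backed by Prometheus) is installed and serves the metric through the custom metrics API; the my-app name, the http_requests_per_second metric name, and the target value are hypothetical.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app                     # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Pods                   # a metric reported per pod, averaged across pods
      pods:
        metric:
          name: http_requests_per_second   # hypothetical; must be exposed by the adapter
        target:
          type: AverageValue
          averageValue: "100"      # scale so each pod averages about 100 requests/s
```

Without such an adapter, the HPA can only see what the metrics server provides, which is CPU and memory.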