My AKS cluster always runs at low utilisation. I can see it's using 20-30% of CPU/memory, yet it keeps extra nodes around and never scales the node count down. It's running an Airflow instance, which can spike utilisation to 60% in short bursts when running DAGs.
How do I make sure it doesn't use extra nodes so we can manage costs effectively?
The following link should be helpful: https://learn.microsoft.com/en-us/azure/aks/cluster-autoscaler
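Worth checking alongside that doc: the cluster autoscaler decides to remove a node based on the sum of the resource requests of the pods running on it, not on live CPU/memory usage, so if your Airflow pods request far more than the 20-30% they actually use, the nodes will never look empty enough to be removed. A minimal sketch of right-sized requests on a worker Deployment (all names, image tags and numbers here are illustrative assumptions, not taken from your cluster):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: airflow-worker              # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: airflow-worker
  template:
    metadata:
      labels:
        app: airflow-worker
    spec:
      containers:
      - name: worker
        image: apache/airflow:2.7.1          # example image tag
        resources:
          requests:                 # keep close to typical usage so nodes can be packed and freed
            cpu: 500m
            memory: 1Gi
          limits:                   # leave headroom for the short DAG bursts
            cpu: "2"
            memory: 2Gi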
Some context: I have multiple cron jobs running daily, weekly and hourly, some of which require significant processing power.
I would like to add requests and limits to these cron job pods to enable vertical scaling and to ensure that the assigned node has enough capacity when the pod is initialized (roughly like the sketch below). This would save me from having to keep multiple large nodes available at all times, and would also let me easily adjust how many crons I can run in parallel.
I would like to avoid timed scaling, since the cron jobs' processing time can increase as the application grows.
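Something like this is what I have in mind for each cron job (simplified; the names, schedule and resource numbers are just placeholders):

apiVersion: batch/v1
kind: CronJob
metadata:
  name: heavy-report                # placeholder name
spec:
  schedule: "0 2 * * *"             # daily at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: report
            image: registry.example.com/report-job:latest   # placeholder image
            resources:
              requests:             # what the scheduler uses to pick (or wait for) a node
                cpu: "2"
                memory: 4Gi
              limits:
                cpu: "2"
                memory: 4Gi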
Edit - additional information:
Currently I am using DigitalOcean and utilizing its UI for cluster autoscaling. I have it working with HPAs on deployments, but not on cron jobs. Adding limits to cron jobs does not trigger cluster autoscaling, to my knowledge.
I have tried to enable HPA scaling with the cron job, but with no success. Basically the pod just sits in Pending status, signalling that there is insufficient CPU available, and no new node is created.
Does HPA scaling work with cron job pods and is there a way to achieve the same type of scaling?
HPA is used to scale more pods when pod loads are high, but this won't increase the resources on your cluster.
I think you're looking for the cluster autoscaler (it works on AWS, GKE and Azure), which will increase cluster capacity when pods can't be scheduled.
This is a Community Wiki answer so feel free to edit it and add any additional details you consider important.
As Dom already mentioned, "this won't increase the resources on your cluster." More specifically, it won't create an additional node, as the Horizontal Pod Autoscaler doesn't have such a capability and in fact has nothing to do with cluster scaling. Its name is pretty self-explanatory: HPA is only able to scale Pods, and it scales them horizontally; in other words, it can automatically increase or decrease the number of replicas of your "replication controller, deployment, replica set or stateful set based on observed CPU utilization (or, with custom metrics support, on some other application-provided metrics)", as per the docs.
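For reference, a typical HPA definition targeting a Deployment looks more or less like this (a minimal sketch using the autoscaling/v2 API; the names and thresholds are only examples):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa                  # example name
spec:
  scaleTargetRef:                   # HPA only scales a workload that exposes a scale subresource
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70      # add replicas when average CPU crosses 70% of the requests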
As to cluster autoscaling, as already said by Dom, such solutions are implemented in the so-called managed Kubernetes offerings such as GKE on GCP, EKS on AWS or AKS on Azure, and many more. They are available as a built-in option, although you usually still have to turn them on (per node pool on GKE, per cluster on AKS), and on EKS you typically deploy the cluster-autoscaler yourself.
You may wonder how HPA and CA fit together. It's really well explained in the FAQ section of the Cluster Autoscaler project:
How does Horizontal Pod Autoscaler work with Cluster Autoscaler?
Horizontal Pod Autoscaler changes the deployment's or replicaset's number of replicas based on the current CPU load. If the load increases, HPA will create new replicas, for which there may or may not be enough space in the cluster. If there are not enough resources, CA will try to bring up some nodes, so that the HPA-created pods have a place to run. If the load decreases, HPA will stop some of the replicas. As a result, some nodes may become underutilized or completely empty, and then CA will terminate such unneeded nodes.
I'm trying (and learning) to figure out the best way to utilize CPU (and RAM) on k8s nodes.
My final goal is to make sure CPU utilization on each node in the cluster is above X%.
So far I've read about the cluster-autoscaler and HPA, but I'm not sure they'd help me with this use case.
From what I've read:
cluster-autoscaler autoscales nodes based on a comparison between the replicas' resources.requests and the CPU available on the target EC2 instance - it is NOT based on traffic/actual CPU usage
HPA is based on actual CPU usage, but for individual pods
I essentially want to get to a point where kubectl top nodes shows all nodes using > X% (let's say 60%), and ideally trigger autoscaling if we reach X2% (let's say 80%).
Any suggestion/pointer on how to go about this use case? (Or should I somehow use a combination of these two autoscaling mechanisms?)
You can use a combination of the HPA and/or the cluster autoscaler and/or the cloud provider's autoscaling group.
HPA: based on the CPU/memory of your pods, it scales your K8s Deployments up and down, for example.
Cloud provider ASG (autoscaling group): works at the VM/instance level and scales up and down based on the instances' own CPU and memory metrics.
Cluster autoscaler: it works when pods are pending and have nowhere to run; if you are already handling the cases above, this is more of a fail-safe mechanism, or a fit for workloads that don't need to come up very quickly.
In summary, you can use all three of the above (or fewer), but you have to see what works for you so that they don't conflict with each other. One potential problem: when your cloud ASG starts scaling down while you also have pods in a pending state, your cluster autoscaler (if you have it enabled) will kick in, and the two may end up doing opposite things, leaving your cluster unable to schedule any pods.
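If you do run the cluster autoscaler and want to push node utilization toward targets like the 60%/80% you mention, the scale-down threshold is the flag people usually tune; note it compares requested resources against node capacity, not live usage. A sketch of the relevant container args from a self-managed cluster-autoscaler Deployment (values are illustrative; on a managed offering the equivalent is set through the provider's own autoscaler settings):

# excerpt from the cluster-autoscaler container spec (not a complete manifest)
containers:
- name: cluster-autoscaler
  image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.27.2   # example tag
  command:
  - ./cluster-autoscaler
  - --cloud-provider=aws
  - --scale-down-utilization-threshold=0.6   # remove a node only when its requested resources fall below 60%
  - --scale-down-unneeded-time=10m           # how long a node must stay underutilized before removal
  - --balance-similar-node-groups=true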
✌️☮️
I'm trying to figure out why a GKE "Workload" CPU usage is not equivalent to the sum of cpu usage of its pods.
The following image shows the Workload CPU usage.
Service Workload CPU Usage
The following images show the pods' CPU usage for the above Workload.
Pod #1 CPU Usage
Pod #2 CPU Usage
For example, at 9:45 the Workload CPU usage was around 3.7 cores, but at the same time Pod #1's CPU usage was around 0.9 cores and Pod #2's CPU usage was around 0.9 cores too. That means the Workload CPU usage should have been around 1.8 cores, but it wasn't.
Does anyone have an idea of this behavior?
Thanks.
On your VM (the node managed by Kubernetes) you have the pods you deploy and manage, but also several services that run on it for supervision, management, log ingestion, and so on. A basic description here
You can see all these basic services by running this command: kubectl get all --namespace kube-system.
If you have installed additional components, like Istio or Knative, you have additional services and namespaces. All of these take a share of the node's resources.
Danny,
The CPU chart on the Workloads page is an aggregate of CPU usage for the managed pods. The values are taken from the Stackdriver Monitoring metric container/cpu/usage_time, check this link. That metric represents "Cumulative CPU usage on all cores in seconds. This number divided by the elapsed time represents usage as a number of cores, regardless of any core limit that might be set." For example, an increase of 222 core-seconds over a 60-second window would be charted as 222 / 60 ≈ 3.7 cores.
Please let me know if you have further questions in regard to this.
I suspect this is a bug in the UI. There is no actual metric for deployment CPU usage. Stackdriver Monitoring only collects data at the container, pod and node level, thus the only really reliable metrics in this case are the ones for pod CPU usage.
The graph for the total deployment CPU usage is likely meant to be a sum of all the pods' metrics, calculated and then presented to you. It is not as reliable as the pod or container metrics since it is not a direct metric.
If you are seeing this discrepancy consistently, I recommend opening a UI bug report through the Google Public Issue Tracker to report this to the GCP Engineers.
I have a Kubernetes cluster hosted on GCP (Master version: 1.12.7-gke.7, Node version: 1.12.7-gke.7).
Recently I noticed that too many nodes are being created, without any stress on the system. My expected average number of nodes is 30, but after an unwanted scale-up it goes to around 60.
I tried to investigate this issue with
kubectl get hpa
and saw that the average CPU is near 0% - no scaling should occur here.
Also checked
kubectl get deployments
and saw that the DESIRED number of pods is equal to the AVAILABLE - so the system didn't ask for more resources.
After inspecting the node utilization I saw that around 25 nodes use only 200 mCPU, which is very low consumption (5% of the node's capacity).
After a while, the cluster goes back to normal (around 30 nodes) without any significant event.
What's going on here? What should I check next?
The Horizontal Pod Autoscaler automatically scales the number of pods, so on its own it can't be responsible for scaling the nodes. However, if you have enabled the cluster autoscaler, this could be possible. To debug what is going on you would need logs from your master node, which you have no access to in GKE because it is maintained by Google.
In this case my advice is to contact Google Cloud Support.
I'm very new to Kubernetes. We are using a Kubernetes cluster on Google Cloud Platform.
I have created a cluster, Services, Pods and replication controllers.
I have created a Horizontal Pod Autoscaler based on CPU parameters.
Cluster details
Default running node count is set to 3
3GB allocatable memory per node
After running for 1 hour, the Services and Nodes show NodeUnderMemoryPressure issues.
How do I resolve this?
If you need any more details, please ask.
Thanks
I don't know how much traffic is hitting your cluster, but I would highly recommend running Prometheus in your cluster.
Prometheus is an open-source monitoring and alerting tool, and integrates very well with Kubernetes.
This tool should give you a much better view of memory consumption and CPU usage, among many other monitoring capabilities, and will allow you to troubleshoot these types of issues effectively.
There are several ways to address this issue, depending on the type of your workloads.
The easiest is simply to scale your nodes, but that can be useless if there is a memory leak. Even if you are not affected by one now, you should always consider the possibility of a memory leak, so the best practice is to always set memory limits for pods and namespaces.
Scale the cluster
If you have many pods running and none of them is much bigger than the others, it can be useful to scale your cluster horizontally; that way the number of pods per node goes down and the NodeUnderMemoryPressure warning should disappear.
If you are running few pods, or some of them can make the cluster suffer on their own, then the only option is to scale the nodes vertically: add a new node pool with Compute Engine instances that have more memory, and possibly delete the old one.
If your workload is correct and memory only suffers because at certain moments of the day you receive 100 times the usual traffic and create more pods to support it, you should consider making use of the autoscaler.
Check for memory leaks
On the other hand, if it is not a "healthy" situation and you have pods consuming way more RAM than expected, then you should follow the advice of grizzthedj, understand why your pods are consuming so much, and verify whether some of your containers are affected by a memory leak; in that case scaling the amount of RAM is useless, since at some point you will run out of it anyway.
Therefore start by understanding which pods are consuming too much, then troubleshoot why they behave this way; if you do not want to use Prometheus, simply exec into the container and check with the classic Linux commands.
Limit the RAM consumed by pods
To prevent this from happening in the future, I advise you, when writing your YAML files, to always limit the amount of RAM the pods can use; this way you keep them under control and there is no risk that they cause the Kubernetes node agent (the kubelet) to fail because it runs out of memory.
Also consider limiting CPU, and introduce minimum requirements (requests) for both RAM and CPU so the scheduler can place the pods properly and you avoid hitting NodeUnderMemoryPressure under high load.
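A minimal sketch of both ideas, a per-container requests/limits block and a namespace-level LimitRange that supplies defaults (all names and values below are examples to adapt to your workloads):

# per-container requests and limits inside a pod template (excerpt, not a full manifest)
containers:
- name: app
  image: registry.example.com/app:1.0      # placeholder image
  resources:
    requests:              # what the scheduler reserves on the node
      cpu: 250m
      memory: 256Mi
    limits:                # hard caps; the container is OOM-killed above the memory limit
      cpu: 500m
      memory: 512Mi
---
# namespace-level defaults and caps
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: my-namespace                  # placeholder namespace
spec:
  limits:
  - type: Container
    default:               # applied as limits when a container defines none
      cpu: 500m
      memory: 512Mi
    defaultRequest:        # applied as requests when a container defines none
      cpu: 250m
      memory: 256Mi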