GCP Kubernetes scale too high - kubernetes

I have Kubernetes cluster hosted on GCP (Master version: 1.12.7-gke.7, Node version: 1.12.7-gke.7).
Recently I noticed that too many nodes are created without any load on the system. My expected average number of nodes is 30, but after an unwanted scale-up it goes to around 60.
I tried to investigate this issue with
kubectl get hpa
and saw that the average CPU is near 0%, so no scaling should occur here.
Also checked
kubectl get deployments
and saw that the DESIRED number of pods equals the AVAILABLE number, so the system did not ask for more resources.
After inspecting the node utilization I saw that around 25 nodes use only about 200 mCPU each, which is very low consumption (5% of a node's capacity).
After a while, the cluster returns to normal (around 30 nodes) without any significant event.
What's going on here? What should I check next?

The Horizontal Pod Autoscaler only scales the number of pods, so on its own it cannot be responsible for scaling the nodes. However, if you have enabled the cluster autoscaler, this could be possible. To debug what is going on you would need logs from your master node, which you cannot access in GKE because it is maintained by Google.
In this case my advice is to contact Google Cloud Support.
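To see why an HPA reporting near 0% CPU should not be asking for more pods, here is a minimal sketch of the replica calculation described in the Kubernetes HPA documentation (desired = ceil(current replicas * current metric / target metric)). The function name, the replica counts, and the 70% target are hypothetical, and the 0.1 tolerance is assumed to be the default; the real controller also handles missing metrics and stabilization windows, which are left out here.

```python
import math

def hpa_desired_replicas(current_replicas, current_utilization, target_utilization,
                         min_replicas, max_replicas, tolerance=0.1):
    """Simplified HPA rule: ceil(current_replicas * current / target), clamped."""
    ratio = current_utilization / target_utilization
    # Inside the tolerance band around 1.0 the replica count is left alone.
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # The result is always clamped to the configured min/max bounds.
    return max(min_replicas, min(max_replicas, desired))

# Hypothetical deployment: 10 replicas at 1% average CPU against a 70% target.
print(hpa_desired_replicas(10, 1, 70, min_replicas=3, max_replicas=50))  # 3
```

With utilization that low the result falls to the configured minimum, so extra nodes are more likely to come from something other than the HPA.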

Related

GKE Cluster with 0 Node and Autopilot Enabled

I have used GKE for years and wanted to experiment with Autopilot mode. My initial expectation was that it starts with 0 worker nodes and, whenever I deploy a workload, automatically scales the nodes based on requested memory and CPU. However, after I created a GKE cluster, there is nothing related to nodes in the UI, but the kubectl get nodes output shows 2 nodes. Do you have any idea how to start that cluster with no nodes initially?
The principle of GKE Autopilot is NOT to worry about the nodes; they are managed for you. Whether there are 1, 2 or 10 nodes in your cluster, you don't pay for them. You pay only when a Pod runs in your cluster (CPU and memory time usage).
So you can't control the number of nodes, the number of pools, or other low-level management like that. It is similar to a serverless product (Google prefers to call it a "nodeless" cluster).
On the other hand, it's great to already have resources provisioned on your cluster that you don't pay for: you will deploy and scale quicker!
EDIT 1
You can have a look at the pricing. There is a flat fee of $74.40 per month ($0.10/hour) for the control plane, and then you pay for your pods (CPU + memory).
You have 1 free cluster per Billing account.
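As a quick check of the numbers quoted above, the $74.40 figure matches $0.10/hour over a 31-day month. The sketch below only redoes that arithmetic in integer cents (to avoid floating-point rounding); the function names are made up for illustration, and current pricing may differ from what the answer quotes.

```python
HOURLY_FEE_CENTS = 10  # $0.10 per hour, as quoted in the answer

def monthly_fee_cents(days_in_month):
    """Control-plane fee for a month of the given length, in cents."""
    return HOURLY_FEE_CENTS * 24 * days_in_month

def format_usd(cents):
    return f"${cents // 100}.{cents % 100:02d}"

print(format_usd(monthly_fee_cents(31)))  # $74.40
print(format_usd(monthly_fee_cents(30)))  # $72.00
```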

Is it possible to use Kubernetes autoscale on cron job pods

Some context: I have multiple cron jobs running daily, weekly, hourly and some of which require significant processing power.
I would like to add requests and limits to these cron job containers to enable vertical scaling and ensure that the assigned node has enough capacity when a pod is initialized. This would save me from keeping multiple large nodes available at all times and would also let me easily change how many cron jobs I can run in parallel.
I would like to try and avoid timed scaling since the cron jobs processing time can increase as the application grows.
Edit - Additional Information :
Currently I am using DigitalOcean and its UI for cluster autoscaling. I have it working with HPAs on deployments but not on cron jobs. Adding limits to cron jobs does not trigger cluster autoscaling, to my knowledge.
I have tried to enable HPA scaling with the cron job, but with no success. The pod just sits in Pending status, signalling that there is insufficient CPU available, and no new node is created.
Does HPA scaling work with cron job pods and is there a way to achieve the same type of scaling?
HPA is used to add more pods when pod load is high, but this won't increase the resources on your cluster.
I think you're looking for the cluster autoscaler (works on AWS, GKE and Azure), which increases cluster capacity when pods can't be scheduled.
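Since the scheduler places pods by their resource requests, a cron job needs explicit requests for a pending pod to signal how much capacity is missing. The following is a rough sketch, not a tested configuration: it builds a CronJob manifest as JSON (which kubectl also accepts). The job name, image, schedule and sizes are hypothetical, and `batch/v1` assumes a reasonably recent cluster (older ones use `batch/v1beta1`).

```python
import json

def cronjob_manifest(name, schedule, image, cpu_request, memory_request):
    """Build a CronJob manifest whose container declares resource requests."""
    return {
        "apiVersion": "batch/v1",  # assumption: older clusters need batch/v1beta1
        "kind": "CronJob",
        "metadata": {"name": name},
        "spec": {
            "schedule": schedule,
            "jobTemplate": {"spec": {"template": {"spec": {
                "restartPolicy": "Never",
                "containers": [{
                    "name": name,
                    "image": image,
                    # Requests are what the scheduler uses to decide whether a node has room.
                    "resources": {"requests": {"cpu": cpu_request,
                                               "memory": memory_request}},
                }],
            }}}},
        },
    }

# Hypothetical nightly job that needs 2 CPUs and 4 GiB of memory.
manifest = cronjob_manifest("nightly-report", "0 2 * * *",
                            "example.com/report:latest", "2", "4Gi")
print(json.dumps(manifest, indent=2))
```

The output could be piped to `kubectl apply -f -`. Whether a new node is then actually created still depends on the cluster autoscaler being enabled for a node pool large enough to fit the request.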
This is a Community Wiki answer so feel free to edit it and add any additional details you consider important.
As Dom already mentioned, "this won't increase the resources on your cluster." More specifically, it won't create an additional node, as the Horizontal Pod Autoscaler doesn't have such a capability and in fact has nothing to do with cluster scaling. Its name is pretty self-explanatory. HPA is only able to scale Pods, and it scales them horizontally; in other words, it can automatically increase or decrease the number of replicas of your "replication controller, deployment, replica set or stateful set based on observed CPU utilization (or, with custom metrics support, on some other application-provided metrics)" as per the docs.
As to cluster autoscaling, as Dom already said, such solutions are implemented in so-called managed Kubernetes offerings such as GKE on GCP, EKS on AWS, AKS on Azure and many more. You typically don't need to do anything to enable them as they are available out of the box.
You may wonder how HPA and CA fit together. It's really well explained in the FAQ section of the Cluster Autoscaler project:
How does Horizontal Pod Autoscaler work with Cluster Autoscaler?
Horizontal Pod Autoscaler changes the deployment's or replicaset's
number of replicas based on the current CPU load. If the load
increases, HPA will create new replicas, for which there may or may
not be enough space in the cluster. If there are not enough resources,
CA will try to bring up some nodes, so that the HPA-created pods have
a place to run. If the load decreases, HPA will stop some of the
replicas. As a result, some nodes may become underutilized or
completely empty, and then CA will terminate such unneeded nodes.
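The node side of that interplay can be approximated with simple arithmetic. This is a deliberately simplified sketch, not how the Cluster Autoscaler is implemented: it assumes identical pods that request only CPU, identical nodes, and no system pods, and all numbers and names are hypothetical.

```python
import math

def nodes_needed(replicas, pod_cpu_millicores, node_allocatable_millicores):
    """Nodes required to fit `replicas` identical pods, counting CPU requests only."""
    pods_per_node = node_allocatable_millicores // pod_cpu_millicores
    if pods_per_node == 0:
        raise ValueError("a single pod does not fit on a node")
    return math.ceil(replicas / pods_per_node)

# Hypothetical: each pod requests 500m and each node has 3500m allocatable (7 pods per node).
# The replica count rises under load (HPA scale-up) and later drops again.
for replicas in (10, 40, 10):
    print(replicas, "replicas ->", nodes_needed(replicas, 500, 3500), "nodes")
```

When the HPA raises the replica count from 10 to 40, the pods that no longer fit stay Pending until CA adds nodes; once the count drops back, the extra nodes become candidates for removal.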

google kubernetes engine node idle timeout

We use GKE for one of our services, which is autoscaled. The workload is variable, and depending on it the cluster scales up to hundreds of nodes. However, I see that when the workload goes down, many idle nodes stay alive for a very long time and increase our bill. Is there a setting where we can specify a time after which an idle node will be terminated and removed from the cluster?
The Kubernetes scale-down process typically includes a delay as protection against traffic spikes that may occur while the resize is in progress.
There are also several aspects of the autoscaler to consider. Please check the following docs for details:
https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#how-does-scale-down-work
https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-types-of-pods-can-prevent-ca-from-removing-a-node
Furthermore, when using the GKE autoscaler, there are some constraints to take into account:
When scaling down, cluster autoscaler honors a graceful termination
period of 10 minutes for rescheduling the node's Pods onto a different
node before forcibly terminating the node.
Occasionally, cluster autoscaler cannot scale down completely and an extra node exists after scaling down. This can occur when required system Pods are scheduled
onto different nodes, because there is no trigger for any of those
Pods to be moved to a different node. See I have a couple of nodes
with low utilization, but they are not scaled down. Why?. To work
around this limitation, you can configure a Pod disruption budget.
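The scale-down rules from the linked FAQ can be summarized roughly as follows. This is a simplified sketch under stated assumptions: the 50% utilization threshold and the 10-minute unneeded time are the upstream Cluster Autoscaler defaults (flags `--scale-down-utilization-threshold` and `--scale-down-unneeded-time`), GKE may manage these values for you, and the `Node` fields below are made up for illustration. Note that utilization here is based on pod requests, not on actual usage.

```python
from dataclasses import dataclass

UTILIZATION_THRESHOLD = 0.5  # assumed upstream default
UNNEEDED_MINUTES = 10        # assumed upstream default

@dataclass
class Node:
    cpu_requested_m: int
    cpu_allocatable_m: int
    mem_requested_mib: int
    mem_allocatable_mib: int
    minutes_unneeded: int
    has_unmovable_pod: bool  # e.g. a system pod with no PodDisruptionBudget

def is_scale_down_candidate(node):
    """True if the node is lightly requested, evictable, and unneeded long enough."""
    cpu_util = node.cpu_requested_m / node.cpu_allocatable_m
    mem_util = node.mem_requested_mib / node.mem_allocatable_mib
    if max(cpu_util, mem_util) >= UTILIZATION_THRESHOLD:
        return False
    if node.has_unmovable_pod:
        return False
    return node.minutes_unneeded >= UNNEEDED_MINUTES

idle = Node(200, 3920, 512, 12000, minutes_unneeded=15, has_unmovable_pod=False)
blocked = Node(200, 3920, 512, 12000, minutes_unneeded=15, has_unmovable_pod=True)
print(is_scale_down_candidate(idle), is_scale_down_candidate(blocked))  # True False
```

A node that looks idle by CPU usage can therefore still be kept, either because its pods request a lot or because one of them cannot be moved.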

Kubernetes autoscaler: change min replica on specific day

I'm running a Kubernetes cluster (GCP) with 10 deployments. Each deployment is configured to auto scale on stress.
From my website statistics, I found that Monday is the day with the most load. I want to configure the Kubernetes deployments to have a higher minimum number of replicas on this day.
Is this possible?
I read somewhere that I can run a cronjob script before and after this day to change the minimum number of machines. Is that the current way to do it? Is this safe? What if the cronjob doesn't fire? If this is the current way, please point me to some instructions on how to do it.
Thanks!
You seem to be talking about two things here.
Pod autoscaling (Add more pods when load on existing pods increases) : HPA will help with this. If your workloads show a spike in CPU or memory and can handle horizontal scaling, then HPA would work fine.
Example : https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/
Now HPA can increase pods only if the cluster has enough nodes to schedule them.
If it is desired to have more nodes with more traffic and reduce them when traffic is low, a cluster autoscaler could be a good option.
https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler
Of course, the scaling of nodes is not instantaneous, as the autoscaler watches for pods that are in the Pending state due to resource constraints. It then requests additional nodes from the cloud provider, and once these nodes join the cluster, the workloads get scheduled.
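For the scheduled approach mentioned in the question, one possible sketch is a small script that picks the HPA minimum from the current weekday and builds a `kubectl patch` command for it. The HPA name `web` and the replica counts are hypothetical, and the script only constructs the command; running it (for example with `subprocess.run`) from a scheduled job is left to the reader.

```python
import datetime
import json

def min_replicas_for(day, normal=2, monday=6):
    """Return the HPA minimum for a date; date.weekday() is 0 on Mondays."""
    return monday if day.weekday() == 0 else normal

def patch_command(hpa_name, min_replicas):
    """Build the argv for patching spec.minReplicas on an existing HPA."""
    patch = json.dumps({"spec": {"minReplicas": min_replicas}})
    return ["kubectl", "patch", "hpa", hpa_name, "--patch", patch]

day = datetime.date(2024, 1, 1)  # a Monday
print(patch_command("web", min_replicas_for(day)))
```

Because the value is derived from the date on every run rather than toggled, running the script daily means a missed run gets corrected by the next one, which partly addresses the concern about the cronjob not firing.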

GKE Cluster Autoscaler pre creates nodes on time base

I'm running a Kubernetes cluster on Google Cloud. Master version is 1.8.9-gke.1, nodes versions are the same, cluster autoscaler is enabled.
I started to notice that nodes are being created without any pending pods. Those nodes can then sit without running pods for 20-30 minutes before workload is allocated to them. This usually happens before peak hours, and it looks like the cluster is trying to predict load based on past load.
Is this something that the Google Cloud Cluster Autoscaler manages, or is it a Kubernetes feature? Is it configurable? I tried to find a clue in the documentation, but without luck.
It sounds like something the Cluster Autoscaler would do.
Go to Stackdriver Logging and query with advanced filter:
resource.type="k8s_cluster"
resource.labels.cluster_name="<your_cluster_name>"
resource.labels.location="<your_cluster_zone>"
protoPayload.methodName="io.k8s.core.v1.nodes.update"
(The last line alone might suffice). I think this should be the way to get the autoscaler logs. If this does not work, let me know.