GKE Cluster Autoscaler pre-creates nodes on a time basis

I'm running a Kubernetes cluster on Google Cloud. The master version is 1.8.9-gke.1, the node versions are the same, and the cluster autoscaler is enabled.
I started to notice that nodes are being created without any pending pods. Those nodes can then sit without running pods for 20-30 minutes before workload is allocated to them. This usually happens before peak hours, and it looks like the cluster is trying to predict load based on past load.
Is this something the Google Cloud cluster autoscaler manages, or is it a Kubernetes feature? Is it configurable? I tried to find any clue in the documentation, but had no luck.

It sounds like something the Cluster Autoscaler would do.
Go to Stackdriver Logging and query with advanced filter:
resource.type="k8s_cluster"
resource.labels.cluster_name="<your_cluster_name>"
resource.labels.location="<your_cluster_zone>"
protoPayload.methodName="io.k8s.core.v1.nodes.update"
(The last line alone might suffice.) I think this should be the way to get the autoscaler logs. If this does not work, let me know.
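If you prefer the CLI, the same filter can be passed to gcloud; a sketch, with placeholder cluster name and zone:

```shell
# Read recent node-update log entries for the cluster
# ("my-cluster" and "us-central1-a" are hypothetical -- use your own).
gcloud logging read '
  resource.type="k8s_cluster"
  resource.labels.cluster_name="my-cluster"
  resource.labels.location="us-central1-a"
  protoPayload.methodName="io.k8s.core.v1.nodes.update"
' --limit 20 --format json
```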

Related

Reduce costs in EKS cluster outside working hours

I have an EKS cluster with two worker nodes. I would like to "switch off" the nodes, or do something else to reduce the cost of my cluster outside working hours. Is there any way to turn the nodes off at night and back on in the morning?
Thanks a lot.
This is a very common concern for anyone using a managed K8s cluster. People take different approaches to it; what works best for us is a combination of kube-downscaler and cluster-autoscaler.
kube-downscaler helps you scale down / "pause" Kubernetes workloads (Deployments, StatefulSets, HorizontalPodAutoscalers, and even CronJobs) during non-work hours.
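For example, kube-downscaler can be driven by annotations on the workload itself. A minimal sketch; the deployment name and schedule string are illustrative, so check the kube-downscaler README for the exact annotations your version supports:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app            # hypothetical name
  annotations:
    # Keep the workload running only during working hours;
    # kube-downscaler scales it to 0 replicas outside this window.
    downscaler/uptime: "Mon-Fri 08:00-19:00 Europe/Berlin"
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: nginx:1.25
```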
cluster-autoscaler is a tool that automatically:
scales down the Kubernetes cluster when there are nodes that have been underutilized for an extended period of time and their pods can be placed on other existing nodes;
scales up the Kubernetes cluster when there are pods that failed to run due to insufficient resources.
So, essentially, at night when kube-downscaler scales down the pods and other objects, cluster-autoscaler notices the underutilized nodes and removes them after their pods have been placed on other nodes, and it does the opposite in the morning.
Of course, some fine-tuning of the configuration of the two might be needed to make it work best for you.
Unrelated to your specific question, but if you are in "savings" mode you may want to have a look at EC2 Spot Instances for EKS, assuming you can operate within their boundaries. See here for the details.

Is it possible to schedule a pod to run for, say, 24 hours and then remove the deployment/statefulset? Or do I need to use jobs?

We have a bunch of pods running in a dev environment. The pods are auto-provisioned by an application on every business action. The problem is that, across various namespaces, they accumulate and eat up the available resources in EKS.
Is there a way, without Jenkins or Kubernetes jobs, to simply put some parameter in the pod manifest that tells it to self-destruct in, say, 24 hours?
Add to your pod.spec:
activeDeadlineSeconds: 86400
After the deadline your Pod will be stopped for good, with the status DeadlineExceeded.
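A minimal Pod manifest using this field might look like the following; the pod name, image, and command are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: short-lived-pod   # hypothetical name
spec:
  # Terminate the pod 24 hours (86400 s) after it becomes active.
  activeDeadlineSeconds: 86400
  restartPolicy: Never
  containers:
    - name: worker
      image: busybox:1.36
      command: ["sh", "-c", "while true; do date; sleep 60; done"]
```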
If I understood your situation properly, you would like to scale your cluster down in order to save resources.
Kubernetes has the ability to autoscale your application in a cluster: it can start additional pods when the load is increasing and terminate excess pods when the load is decreasing.
It is possible to downscale the application to zero pods, but in this case there will be a delay serving the first request while a pod is starting.
This functionality relies on performance metrics. In practice this means that autoscaling doesn't happen instantly, because it takes some time for the performance metrics to reach the configured threshold.
The Kubernetes feature mentioned, the HPA (Horizontal Pod Autoscaler), is described in this document.
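As a sketch, a CPU-based HPA for a deployment could look like this (autoscaling/v1 was current for the cluster versions discussed here; the names are illustrative):

```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa        # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app          # hypothetical deployment
  minReplicas: 1          # the HPA itself cannot scale to 0
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
```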
If you are running your cluster on GCP or GKE, you can go further and automatically start additional nodes for your cluster when you need more computing capacity, and shut nodes down when they are no longer running application pods.
More information about this functionality can be found at the link.
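On GKE this is the cluster autoscaler, which is enabled per node pool. A sketch with placeholder names:

```shell
# Enable the cluster autoscaler on an existing node pool
# (cluster, zone, and pool names here are hypothetical).
gcloud container clusters update my-cluster \
  --zone us-central1-a \
  --node-pool default-pool \
  --enable-autoscaling --min-nodes 1 --max-nodes 10
```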
Last but not least, you can use a tool like Ansible to manage all your Kubernetes assets (it can create/manage deployments via playbooks).
If you decide to give it a try, you might find this information useful:
Creating a Container cluster in GKE
70% cheaper Kubernetes cluster on AWS
How to build a Kubernetes Horizontal Pod Autoscaler using custom metrics

Can you force GKE node autoscaling and how long should autoscaling take?

When running an autoscaling cluster on Google Cloud GKE, it takes 15 minutes, or sometimes half an hour, with unschedulable pods before the autoscaler kicks in and provisions another node.
This is especially the case when I have manually deleted a node. The list of nodes shows correctly, but the number of nodes in the Node Cluster console shows as if the node had not been deleted, for at least 30 minutes.
Is there a way to force the autoscaler to take stock and scale up immediately?
I have also tried turning off autoscaling and just setting a static number of nodes, but when deleting one of those nodes it had not come back after waiting 45 minutes.
Is this expected behavior, is something up with GKE, or do I potentially have something configured incorrectly?
I have checked and confirmed that I am not hitting any quotas. I have both node auto-repair and autoscaling activated.
There is no way to force the autoscaler to scale up (aside from manually changing the settings in the node pool). The autoscaler should scale up as soon as there are unschedulable pods, as long as adding a new node would allow those pods to be scheduled.
Currently, only autoscaler actions are sent to Stackdriver, so it is much harder to diagnose why an action was not taken. Your best bet would be to open a case with Google Support, if you have support, or open an Issue Tracker report to have a Googler check the logs for you.
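One thing you can try from your side: the open-source cluster autoscaler writes a status ConfigMap into kube-system, and on some GKE versions it is readable from kubectl. This is a hint, not guaranteed to exist on every version:

```shell
# The cluster autoscaler records its recent scale-up/scale-down
# decisions and node-group health here, if the ConfigMap is exposed.
kubectl -n kube-system describe configmap cluster-autoscaler-status
```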

GCP Kubernetes scale too high

I have a Kubernetes cluster hosted on GCP (master version 1.12.7-gke.7, node version 1.12.7-gke.7).
Recently I noticed that too many nodes are created without any stress on the system. My expected average number of nodes is 30, but after an unwanted scale-up it goes to around 60.
I tried to investigate this issue with
kubectl get hpa
and saw that the average CPU is near 0%, so no scaling should occur there.
I also checked
kubectl get deployments
and saw that the DESIRED number of pods equals the AVAILABLE number, so the system is not asking for more resources.
After inspecting node utilization, I saw that around 25 nodes use only 200 mCPU, which is very low consumption (5% of the node's capacity).
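(For reference, node utilization like this can be listed with the commands below, assuming metrics-server or Heapster is running; note that the cluster autoscaler acts on resource requests, not on live usage, so the second view is the relevant one:)

```shell
# Live CPU/memory usage per node (requires metrics-server or Heapster).
kubectl top nodes

# Requested vs. allocatable resources per node -- this is what the
# cluster autoscaler bases its scale-up/scale-down decisions on.
kubectl describe nodes | grep -A 6 "Allocated resources"
```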
After a while the cluster goes back to normal (around 30 nodes) without any significant event.
What's going on here? What should I check next?
The Horizontal Pod Autoscaler automatically scales the number of pods, so on its own it can't be responsible for scaling the nodes. However, if you have enabled the cluster autoscaler, this could be possible. To debug what is going on you would need logs from your master node, which you have no access to in GKE because it is maintained by Google.
In this case my advice is to contact Google Cloud Support.

Kubernetes automatic shutdown after some idle time

Does Kubernetes or Helm support shutting pods down when they have been idle for more than a given threshold time?
This would be very useful in a development environment, freeing room for other processes and saving cost.
Kubernetes has the ability to autoscale your application in a cluster: it can start additional pods when the load is increasing and terminate excess pods when the load is decreasing.
It is possible to downscale the application to zero pods, but in this case there will be a delay serving the first request while a pod is starting.
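If you just want to pause a dev workload by hand rather than automatically, scaling it to zero replicas achieves this (the deployment name is a placeholder):

```shell
# Scale a deployment down to zero pods for the night...
kubectl scale deployment my-app --replicas=0

# ...and back up in the morning.
kubectl scale deployment my-app --replicas=2
```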
This functionality relies on performance metrics provided by the Heapster application, which must run in the cluster. In practice this means that autoscaling doesn't happen instantly, because it takes some time for the performance metrics to reach the configured threshold.
The Kubernetes feature mentioned, the HPA (Horizontal Pod Autoscaler), is described in this document.
If you are running your cluster on GCP or GKE, you can go further and automatically start additional nodes for your cluster when you need more computing capacity, and shut nodes down when they are no longer running application pods.
More information about this functionality can be found at the link.
If you decide to give it a try, you might find this information useful:
Creating a Container cluster in GKE
70% cheaper Kubernetes cluster on AWS
How to build a Kubernetes Horizontal Pod Autoscaler using custom metrics