I am using Google Container Engine. Now I want autoscaling functionality in my cluster. As per the documentation, the GKE autoscaler is in beta release. I can also enable autoscaling on the instance group that manages the cluster nodes.
The cluster autoscaler adds/removes nodes so that all scheduled pods have a place to run, whereas the instance group adds/removes nodes based on policies such as average CPU utilization.
I think that by adjusting the pods' CPU limits and the target CPU utilization for pods in the Kubernetes autoscaler, Managed Instance Group autoscaling could also be used to resize a GKE cluster.
So my question is: which one should I use?
Short answer - don't use the GCE MIG autoscaling feature. It simply won't work properly with your cluster.
See details in this FAQ:
https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#should-i-use-a-cpu-usage-based-node-autoscaler-with-kubernetes
(read the question linked above and the next two)
As per the GCP docs:
"Caution: Do not enable Compute Engine autoscaling for managed instance groups for your cluster nodes. GKE's cluster autoscaler is separate from Compute Engine autoscaling. This can lead to node pools failing to scale up or scale down as the Compute Engine autoscaler will be in conflict with GKE's cluster autoscaler"
More details:
https://cloud.google.com/kubernetes-engine/docs/concepts/cluster-autoscaler
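If you go this route, a minimal sketch of enabling GKE's own cluster autoscaler on a node pool looks like this (the cluster name, node pool, zone, and node counts are placeholders):
gcloud container clusters update CLUSTER-NAME \
    --enable-autoscaling \
    --node-pool default-pool \
    --min-nodes 1 --max-nodes 5 \
    --zone us-central1-b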
I have a project on Kubernetes where only one pod runs on each node, and it has to stay that way. How can I autoscale the nodes? That is, how can I create a new node when the load on an existing node increases? I am using AWS and Azure.
I am not sure why you want to run one pod per node. We have separate groups of applications, and each group runs on certain node groups. We use the cluster autoscaler to scale nodes up/down based on usage, and we use taints and tolerations on pods/nodes for each group of applications. The cluster autoscaler will scale up nodes for a specific application group if its pods are in Pending status.
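As a rough sketch of the taints/tolerations part (the node name, key/value, and deployment name are hypothetical):
# Taint the nodes of one application group so only matching pods schedule there
kubectl taint nodes gke-mycluster-group-a-node-1 workload-group=group-a:NoSchedule
# Give the matching Deployment a toleration for that taint
kubectl patch deployment group-a-app --type=json \
    -p='[{"op": "add", "path": "/spec/template/spec/tolerations", "value": [{"key": "workload-group", "operator": "Equal", "value": "group-a", "effect": "NoSchedule"}]}]'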
Edit - added the portion below:
One useful article about scaling in kubernetes is here. You can read about Cluster Autoscaler (CA) as well.
I think you can try this with the Cluster Autoscaler:
https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler
I have a GKE cluster which doesn't scale up when a particular deployment needs more resources.
I've checked the cluster autoscaler logs and it has entries with this error:
no.scale.up.nap.pod.zonal.resources.exceeded. The documentation for this error says:
Node auto-provisioning did not provision any node group for the Pod in
this zone because doing so would violate resource limits.
I don't quite understand which resource limits are mentioned in the documentation and why they prevent the node pool from scaling up.
If I scale the cluster up manually, the deployment's pods are scheduled and everything works as expected, so it doesn't seem to be a problem with project quotas.
The cluster-wide limits that you define are enforced based on the total CPU and memory resources used across your whole cluster, not just the auto-provisioned node pools.
If you are not using node auto-provisioning (NAP), disable the node auto-provisioning feature for the cluster.
If you are using NAP, update the cluster-wide resource limits defined in NAP for the cluster.
As a workaround, you can try specifying the machine type explicitly in the workload spec. Make sure to use a machine family supported by GKE node auto-provisioning.
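A minimal sketch of both options with gcloud (the cluster name and limit values are placeholders):
# Option 1: disable node auto-provisioning if you are not relying on it
gcloud container clusters update CLUSTER-NAME --no-enable-autoprovisioning
# Option 2: raise the cluster-wide NAP resource limits
gcloud container clusters update CLUSTER-NAME \
    --enable-autoprovisioning \
    --min-cpu 1 --max-cpu 200 \
    --min-memory 1 --max-memory 800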
Now in GKE there is a new tab while creating a new K8s cluster:
Automation - Set cluster-level criteria for automatic maintenance, autoscaling, and auto-provisioning. Edit the node pool for automation like auto-scaling, auto-upgrades, and repair.
It has two options - Balanced (default) & Optimize utilization (beta).
Can't we set this for an older cluster? Is there any workaround?
We are running the old GKE version 1.14 and we want to auto-scale the cluster when existing nodes reach 70% resource utilization.
Currently, we have 2 different pools - only one has node auto-provisioning enabled. During peak hours, if the HPA scales up pods, a new node takes some time to join the cluster, and sometimes existing nodes start crashing due to resource pressure.
You can set the autoscaling profile by going into:
GCP Cloud Console (Web UI) -> Kubernetes Engine -> CLUSTER-NAME -> Edit -> Autoscaling profile
This was checked on GKE version 1.14.10-gke.50.
You can also run:
gcloud beta container clusters update CLUSTER-NAME --autoscaling-profile optimize-utilization
The official documentation states:
You can specify which autoscaling profile to use when making such decisions. The currently available profiles are:
balanced: The default profile.
optimize-utilization: Prioritize optimizing utilization over keeping spare resources in the cluster. When selected, the cluster autoscaler scales down the cluster more aggressively: it can remove more nodes, and remove nodes faster. This profile has been optimized for use with batch workloads that are not sensitive to start-up latency. We do not currently recommend using this profile with serving workloads.
-- Cloud.google.com: Kubernetes Engine: Cluster autoscaler: Autoscaling profiles
This setting (optimize-utilization) may not be the best option for serving workloads. It tries to scale down (remove nodes) more aggressively, which reduces the amount of spare resources available in your cluster and can make it more vulnerable to workload spikes.
Answering the part of the question:
we are running old GKE version 1.14 we want to auto-scale cluster when 70% of resource utilization of existing nodes.
As stated in the documentation:
Cluster autoscaler increases or decreases the size of the node pool automatically, based on the resource requests (rather than actual resource utilization) of Pods running on that node pool's nodes. It periodically checks the status of Pods and nodes, and takes action:
If Pods are unschedulable because there are not enough nodes in the node pool, cluster autoscaler adds nodes, up to the maximum size of the node pool.
-- Cloud.google.com: Kubernetes Engine: Cluster autoscaler: How cluster autoscaler works
You can't directly scale the cluster based on the percentage of resource utilization (70%).
The autoscaler acts on the cluster's inability to schedule pods on the currently existing nodes.
You can scale the number of replicas of your Deployment by CPU usage with the Horizontal Pod Autoscaler. These Pods could have a buffer to handle an increased amount of traffic; after a specific threshold they would spawn new Pods, and the CA (Cluster Autoscaler) would request a new node if those new Pods are unschedulable. This buffer would be the mechanism that prevents sudden spikes the application couldn't manage.
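A minimal sketch of such an HPA targeting 70% CPU (the deployment name and replica bounds are just examples):
# Scale the Deployment when average CPU utilization of its Pods exceeds 70%
kubectl autoscale deployment web-app --cpu-percent=70 --min=3 --max=10
# Inspect the resulting HorizontalPodAutoscaler
kubectl get hpa web-app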
The buffer part and over-provisioning are explained in detail in:
Cloud.google.com: Solutions: Best practices for running cost effective kubernetes applications on gke: Autoscaler and over-provisioning
There is extensive documentation about running cost-effective apps on GKE:
Cloud.google.com: Solutions: Best practices for running cost effective kubernetes applications on gke
I encourage you to check the above link, as it has a lot of tips and insights on scaling, over-provisioning, workload spikes, HPA, VPA, etc.
Additional resources:
Cloud.google.com: Kubernetes Engine: Node auto provisioning
I am using Kubernetes to manage a Docker cluster. Right now, I can set up pod autoscaling using the Horizontal Pod Autoscaler, and that is fine.
Now I think the next step is to autoscale nodes. With HPA, the auto-created pods are only started on already existing nodes; but if all the available nodes are fully utilized and there are no resources available for any more pods, I think the next step is to automatically create a node and have it join the k8s master.
I googled a lot, but there are very limited resources introducing this topic.
Can anyone please point me to a resource on how to implement this requirement?
Thanks
One way to do this on AWS, setting up your own Kubernetes cluster, is by following these steps:
1. Create an instance larger than t2.micro (this will be the master node).
2. Initialize the Kubernetes cluster using a tool like kubeadm. After the initialization completes, you get a join command, which needs to be run on all the nodes that want to join the cluster. (Here is the link)
3. Now create an Auto Scaling group on AWS with a start/boot script containing that join command (a sketch is shown after this list).
4. Now, whenever the utilization threshold you specified in the Auto Scaling group is breached, scaling happens and the node(s) automatically join the Kubernetes cluster. This allows Kubernetes to schedule pods on the newly joined nodes based on the HPA.
(I would suggest using Flannel as the pod network, as it automatically removes a node from the Kubernetes cluster when it is not available.)
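A rough sketch of such a boot (user-data) script for the Auto Scaling group - the API server address, token, and CA cert hash below are placeholders; the real values come from the kubeadm init output on the master:
#!/bin/bash
# Join this instance to the existing cluster as a worker node
kubeadm join 10.0.0.10:6443 \
    --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:<hash-from-kubeadm-init>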
Kubernetes Operations (kops) helps you create, destroy, upgrade and maintain production-grade, highly available Kubernetes clusters from the command line.
Features:
Automates the provisioning of Kubernetes clusters in AWS and GCE
Deploys Highly Available (HA) Kubernetes Masters
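An illustrative sketch of creating such a cluster with kops (the state-store bucket, cluster name, and zones are placeholders):
# kops keeps cluster state in an S3 bucket
export KOPS_STATE_STORE=s3://my-kops-state-store
# Create an HA cluster with one master per zone and three worker nodes
kops create cluster \
    --name=k8s.example.com \
    --zones=us-east-1a,us-east-1b,us-east-1c \
    --master-zones=us-east-1a,us-east-1b,us-east-1c \
    --node-count=3 \
    --yes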
Most managed Kubernetes service providers offer autoscaling of the nodes:
Elastic Kubernetes Service (EKS) - configure the cluster autoscaler
Google Kubernetes Engine
GKE Autoscaler
The autoscaling feature needs to be supported by the underlying cloud provider. Google Cloud supports autoscaling during cluster creation or update by passing the flags --enable-autoscaling, --min-nodes, and --max-nodes to the corresponding gcloud commands.
Examples:
gcloud container clusters create mytestcluster --zone=us-central1-b --enable-autoscaling --min-nodes=3 --max-nodes=10 --num-nodes=5
gcloud container clusters update mytestcluster --enable-autoscaling --min-nodes=1 --max-nodes=15
The link below may be helpful:
https://medium.com/kubecost/understanding-kubernetes-cluster-autoscaling-675099a1db92
I have deployed an app using Kubernetes to a Google Cloud Container Engine Cluster.
I got into autoscaling, and I found the following options:
Kubernetes Horizontal Pod Autoscaling (HPA)
As explained here, Kubernetes offers the HPA on deployments. As per the docs:
Horizontal Pod Autoscaling automatically scales the number of pods in a replication controller, deployment or replica set based on observed CPU utilization
Google Cloud Container Cluster
Now I have a Google Cloud Container Cluster using 3 instances, with autoscaling enabled. As per the docs:
Cluster Autoscaler enables users to automatically resize clusters so that all scheduled pods have a place to run.
This means I have two places to define my autoscaling. Hence my questions:
Is a Pod the same as a VM instance inside my cluster, or can multiple Pods run inside a single VM instance?
Do these two settings do the same thing (i.e. create/remove VM instances inside my cluster)? If not, how do they behave compared to one another?
What happens if, e.g., I have a number of pods between 3 and 10 and a cluster with a number of instances between 1 and 3, and autoscaling kicks in? When and how would both scale?
Many thanks!
Is a Pod the same as a VM instance inside my cluster, or can multiple Pods run inside a single VM instance?
Multiple Pods can run on the same instance (called a node in Kubernetes). You can define the maximum resources a Pod may consume in the deployment YAML. See the docs. This is an important prerequisite for autoscaling.
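As a minimal sketch, requests and limits can also be set from the command line (the deployment name, container name, and values are illustrative):
# Set per-container resource requests and limits on an existing Deployment
kubectl set resources deployment web-app --containers=web \
    --requests=cpu=250m,memory=256Mi \
    --limits=cpu=500m,memory=512Mi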
Do these two settings do the same thing (i.e. create/remove VM instances inside my cluster)? If not, how do they behave compared to one another?
The Kubernetes autoscaler (HPA) will create additional Pods, which are scheduled onto your existing nodes. The Google autoscaler will add worker nodes (new instances) to your cluster. The Google autoscaler looks at queued-up Pods that cannot be scheduled because there is no space in your cluster, and when it finds them it adds nodes.
What happens if, e.g., I have a number of pods between 3 and 10 and a cluster with a number of instances between 1 and 3, and autoscaling kicks in? When and how would both scale?
Based on the resource requests you define for your pods, the Google autoscaler estimates how many new nodes are required to run all of the queued-up, unschedulable pods.
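A quick way to observe this (the pod name below is a placeholder):
# Pods stuck in Pending are what trigger a cluster autoscaler scale-up
kubectl get pods --all-namespaces --field-selector=status.phase=Pending
# Events on a pending Pod show whether a scale-up was triggered
kubectl describe pod my-pending-pod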
Also read this article.