GKE autoscaling doesn't scale - kubernetes

I am setting up a Kubernetes cluster on Google using the Google Kubernetes Engine. I have created the cluster with auto-scaling enabled on my nodepool.
As far as I understand this should be enough for the cluster to spin up extra nodes if needed.
But when I put some load on my cluster, the HPA is activated and wants to spin up some extra pods, but they can't be scheduled due to 'insufficient cpu'. At this point I expected the cluster autoscaling to kick into action, but it doesn't seem to scale up. I did, however, see this:
So the node that should be created (by the autoscaler, I guess?) can't be created, with the following message: Quota 'IN_USE_ADDRESSES' exceeded. Limit: 8.0 in region europe-west1.
I also haven't touched the autoscaling on the instance group, so when running gcloud compute instance-groups managed list, it shows as 'autoscaled: no'.
So any help getting this autoscaling to work would be appreciated.
TL;DR I guess the reason it isn't working is: Quota 'IN_USE_ADDRESSES' exceeded. Limit: 8.0 in region europe-west1, but I don't know how I can fix it.

You have really debugged it yourself already. You need to edit the quotas in the GCP Console. Make sure you select the correct project, then increase everything that is low: probably in-use addresses and CPUs in that region. This process is only semi-automated, so you might need to wait a bit and possibly pay a deposit.
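If you prefer the command line, you can inspect the current regional quota usage with gcloud before filing the increase request (a sketch; europe-west1 is just the region from your error message):
gcloud compute regions describe europe-west1 --flatten="quotas[]" --format="table(quotas.metric,quotas.usage,quotas.limit)"
The IN_USE_ADDRESSES and CPUS rows are the ones that typically block the cluster autoscaler from adding nodes.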

Related

Is it possible to schedule a pod to run for, say, 24 hours and then remove the deployment/statefulset, or do I need to use jobs?

We have a bunch of pods running in a dev environment. The pods are auto-provisioned by an application on every business action. The problem is that, across various namespaces, they accumulate and eat up the available resources in EKS.
Is there a way, without Jenkins or Kubernetes jobs, to simply put some parameter in the pod manifest to tell it to self-destruct after, say, 24 hours?
Add this to your pod.spec:
activeDeadlineSeconds: 86400
After the deadline, your Pod will be stopped for good with the status DeadlineExceeded.
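For context, here is a minimal sketch of where the field sits in a bare Pod manifest (the pod name and image are placeholders):
apiVersion: v1
kind: Pod
metadata:
  name: short-lived-pod
spec:
  activeDeadlineSeconds: 86400   # the pod is terminated roughly 24 hours after it starts
  restartPolicy: Never
  containers:
  - name: app
    image: nginx                 # placeholder image
Note that the Pod object itself is not deleted, only terminated; if you also want it gone from the API, you still need to clean it up (e.g. kubectl delete pod short-lived-pod).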
If I understood your situation properly, you would like to scale your cluster down in order to save resources.
Kubernetes has the ability to autoscale your application within a cluster: it can start additional pods when the load increases and terminate excess pods when the load decreases.
It is possible to scale the application down to zero pods, but in that case the first request will be delayed while a pod starts up.
This functionality relies on performance metrics. In practice this means that autoscaling does not happen instantly, because it takes some time for the metrics to reach the configured threshold.
The Kubernetes feature in question, the Horizontal Pod Autoscaler (HPA), is described in this document.
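As a rough illustration, an HPA for a hypothetical Deployment named my-app, targeting 70% average CPU, could look like this (all names and thresholds are placeholders):
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app               # hypothetical Deployment to scale
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
The same object can be created imperatively with: kubectl autoscale deployment my-app --cpu-percent=70 --min=1 --max=10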
If you are running your cluster on GCP/GKE, you can go further and automatically add nodes to your cluster when you need more computing capacity, and shut nodes down when they no longer run any application pods.
More information about this functionality (the cluster autoscaler) can be found by following the link.
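On GKE, the node-level part is enabled per node pool. A minimal sketch, assuming a cluster called my-cluster with a node pool called default-pool in europe-west1-b (substitute your own names and zone):
gcloud container clusters update my-cluster --zone europe-west1-b --node-pool default-pool --enable-autoscaling --min-nodes 1 --max-nodes 5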
Last, but not least, you can use a tool like Ansible to manage all your Kubernetes assets (it can create/manage deployments via playbooks).
If you decide to give it a try, you might find this information useful:
Creating a Container cluster in GKE
70% cheaper Kubernetes cluster on AWS
How to build a Kubernetes Horizontal Pod Autoscaler using custom metrics

GCP Kubernetes scales too high

I have Kubernetes cluster hosted on GCP (Master version: 1.12.7-gke.7, Node version: 1.12.7-gke.7).
Recently I noticed that too many nodes are being created without any stress on the system. My expected average number of nodes is 30, but after an unwanted scale-up it actually goes to around 60.
I tried to investigate this issue with
kubectl get hpa
and saw that the average CPU is near 0% - no scaling should occur here.
Also checked
kubectl get deployments
and saw that the DESIRED number of pods is equal to AVAILABLE - so the system didn't ask for more resources.
After inspecting the node utilization, I saw that around 25 nodes use only about 200 mCPU, which is very low consumption (about 5% of a node's capacity).
After a while, the cluster is back to the normal (around 30 nodes) without any significant event.
What's going on here? What should I check next?
The Horizontal Pod Autoscaler automatically scales the number of pods, so on its own it cannot be responsible for scaling the nodes. However, if you have enabled the cluster autoscaler, that could explain it. To debug what is going on you would need the logs from your master node, which you have no access to in GKE because it is managed by Google.
In this case my advice is to contact Google Cloud Support.
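Before opening a ticket, one thing you can usually check from inside the cluster is the status ConfigMap the cluster autoscaler publishes (assuming your GKE version exposes it), plus recent autoscaler events in kube-system:
kubectl -n kube-system get configmap cluster-autoscaler-status -o yaml
kubectl -n kube-system get events | grep -i cluster-autoscaler
The ScaleUp/ScaleDown sections of that ConfigMap, if present, show why nodes were recently added or removed.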

GKE | Cluster won't provision in any region

I have a GKE cluster running in us-central1 with a preemptible node pool. I have nodes in each zone (us-central1-b, us-central1-c, us-central1-f). For the last 10 hours, I have been getting the following error for the underlying node VM:
Instance '[instance-name]' creation failed: The zone '[instance-zone]' does not have enough resources available to fulfill the request. Try a different zone, or try again later.
I tried creating new clusters in different regions with different machine types and with HA (multi-zone) settings, and I get the same error for every cluster.
I saw an issue on the Google Cloud Status Dashboard and tried with the console, as recommended, but it errors out with a timeout error.
Is anyone else having this problem? Any idea what I may be doing wrong?
UPDATES
Nov 11
I stood up a cluster in us-west2; this was the only region that would work. I used the gcloud command line; the UI did not seem to work. There was a note about a similar situation (use gcloud, not the UI) on the Google Cloud Status Dashboard.
I tried creating node pools in us-central1 with both the gcloud command line and the UI, to no avail.
I'm now federating deployments across regions and standing up multi-region ingress.
Nov. 12
Cannot create HA clusters in us-central1; same message as listed above.
Reached out via Twitter and received a response.
Working with the K8s guide to federation to see if I can get multi-cluster running. Most likely going to use Kelsey Hightower's approach.
The only problem: I can't spin up clusters to federate.
Findings
Talked with Google support; you need a $150/mo. support package to get a technical person to answer your questions.
Preemptible instances are not a good option for a primary node pool. I did this because I'm cheap, and it bit me hard.
The new architecture is a primary node pool with committed-use VMs that do not autoscale, and a secondary node pool with preemptible instances for autoscaling needs. The secondary pool has minimum nodes = 0 and maximum nodes = 5 (for right now); the cluster is regional, so instances are spread across all zones. A sketch of the gcloud commands is at the end of this answer.
Cost for an n1-standard-1 with sustained use (assuming 24/7) is about a 30% discount off list.
Cost for a 1-year n1-standard-1 committed use is about a 37% discount off list.
Preemptible instances are re-provisioned every 24 hours, if they are not taken from you sooner when resource needs spike in the region.
I believe I fell prey to a resource spike in us-central1.
A must-watch for people looking to federate K8s: Kelsey Hightower - CNCF Keynote | Kubernetes Federation
Issue appears to be resolved as of Nov 13th.
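For reference, here is a rough sketch of the two-pool layout described in the findings above, with placeholder cluster and pool names (my-cluster, primary-pool, burst-pool) and n1-standard-1 machines:
gcloud container node-pools create primary-pool --cluster=my-cluster --region=us-central1 --machine-type=n1-standard-1 --num-nodes=2
gcloud container node-pools create burst-pool --cluster=my-cluster --region=us-central1 --machine-type=n1-standard-1 --preemptible --num-nodes=1 --enable-autoscaling --min-nodes=0 --max-nodes=5
With a regional cluster, --num-nodes is per zone, so the totals above are multiplied by the number of zones. The committed-use discount itself is purchased separately in the console; it is not part of the node-pool command.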

GKE Cluster Autoscaler pre-creates nodes on a time basis

I'm running a Kubernetes cluster on Google Cloud. The master version is 1.8.9-gke.1, the node versions are the same, and the cluster autoscaler is enabled.
I started to notice that nodes are being created without any pending pods. Those nodes can then sit without running pods for 20-30 minutes before workload is allocated to them. This usually happens before peak hours, and it looks like the cluster is trying to predict load based on past load.
Is this something the Google Cloud cluster autoscaler manages, or is it a Kubernetes feature? Is it configurable? I tried to find a clue in the documentation, but without luck.
It sounds like something the Cluster Autoscaler would do.
Go to Stackdriver Logging and query with this advanced filter:
resource.type="k8s_cluster"
resource.labels.cluster_name="<your_cluster_name>"
resource.labels.location="<your_cluster_zone>"
protoPayload.methodName="io.k8s.core.v1.nodes.update"
(The last line alone might suffice.) I think this should be the way to get the autoscaler logs. If this does not work, let me know.
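If you prefer the command line, the same filter can be queried with gcloud (a sketch; the resource type and labels can differ slightly between GKE versions):
gcloud logging read 'resource.type="k8s_cluster" AND resource.labels.cluster_name="<your_cluster_name>" AND resource.labels.location="<your_cluster_zone>" AND protoPayload.methodName="io.k8s.core.v1.nodes.update"' --limit=20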

Google Container Engine: New cluster appears to have failed

I tried to create a new cluster in Container Engine in the Google Developers Console.
It finished pretty quickly with a yellow triangle with an exclamation point. I'm assuming that means it didn't work.
Any idea what I could be doing wrong?
There are a few things that could go wrong. The best way to figure out what's wrong in your situation is to try the gcloud command line tool, which gives better error information. Information about how to install and use it is in Container Engine's documentation.
Other than the default network being removed (as mentioned by Robert Bailey), you may be trying to create more VM instances than you have quota for. You can check your quota in the developer console under Compute > Compute Engine > Quota. You're most likely to go over quota on either CPUs or in-use IP addresses, since each VM created is given an ephemeral IP address.
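The same quotas can also be read from the command line; a sketch (substitute your own region):
gcloud compute project-info describe --format="yaml(quotas)"
gcloud compute regions describe us-central1 --format="yaml(quotas)"
The first command shows project-wide quotas; the second shows per-region quotas such as CPUS and IN_USE_ADDRESSES.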
Have you deleted your default network?
The alpha version of Container Engine relies on the default network when creating VMs and routes between the nodes, so you will see an error when creating a cluster if you have deleted the default network.
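If the default network is indeed gone, one possible fix (a sketch with current gcloud; older releases used slightly different flags) is to recreate a network named default and re-add the firewall rules you need, since the default allow rules are not recreated automatically:
gcloud compute networks create default --subnet-mode=auto
gcloud compute firewall-rules create default-allow-internal --network=default --allow=tcp,udp,icmp --source-ranges=10.128.0.0/9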