Change node machine type on GKE cluster - kubernetes

I have a GKE cluster on which I'm trying to change the default node machine type.
I have already tried:
Creating a new node pool with the machine type I want
Deleting the default-pool. GKE will process for a bit, then not remove the default-pool. I assume this is some undocumented behavior where you cannot delete the default-pool.
I'd prefer to not re-create the cluster and re-apply all of my deployments/secrets/configs/etc.
k8s version: 1.14.10-gke.24 (Stable channel)
Cluster Type: Regional

The best approach to change/increase/decrease your node pool specification would be with:
Migration
To migrate your workloads without incurring downtime, you need to:
Create a new node pool.
Mark the existing node pool as unschedulable.
Drain the workloads running on the existing node pool.
Check that the workload is running correctly on the new node pool.
Delete the existing node pool.
Your workload will be scheduled automatically onto the new node pool.
Kubernetes, which is the cluster orchestration system of GKE clusters, automatically reschedules the evicted Pods to the new node pool as it drains the existing node pool.
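The steps above can be sketched as a short script that only assembles and prints the commands, so nothing is changed until you run them yourself. This is a rough sketch rather than the official procedure: the cluster name, region, pool names, machine type and node names are placeholders, and in practice you would list the old pool's nodes with `kubectl get nodes -l cloud.google.com/gke-nodepool=default-pool`.

```python
import shlex


def migration_commands(cluster, region, old_pool, new_pool, machine_type, old_nodes):
    """Return the migration commands in the order the steps are listed."""
    location = ["--cluster", cluster, "--region", region]
    commands = [
        # 1. Create the new node pool with the machine type you want.
        ["gcloud", "container", "node-pools", "create", new_pool,
         "--machine-type", machine_type] + location,
    ]
    # 2. Cordon every node of the old pool so nothing new is scheduled there.
    commands += [["kubectl", "cordon", node] for node in old_nodes]
    # 3. Drain the old nodes; evicted Pods are rescheduled onto the new pool.
    #    Pods that use emptyDir volumes need an extra flag whose name depends
    #    on your kubectl version, so it is deliberately left out here.
    commands += [["kubectl", "drain", node, "--ignore-daemonsets"] for node in old_nodes]
    # 4. Check where the Pods ended up before removing anything.
    commands.append(["kubectl", "get", "pods", "--all-namespaces", "-o", "wide"])
    # 5. Delete the old pool once the workload looks healthy.
    commands.append(["gcloud", "container", "node-pools", "delete", old_pool] + location)
    return commands


if __name__ == "__main__":
    # All values below are placeholders, not taken from the question.
    for cmd in migration_commands(
        cluster="my-cluster",
        region="us-central1",
        old_pool="default-pool",
        new_pool="new-pool",
        machine_type="n1-standard-4",
        old_nodes=["gke-node-a", "gke-node-b"],
    ):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Cordoning all old nodes before draining any of them keeps evicted Pods from landing on another old node that is about to be drained as well.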
There is official documentation about migrating your workload:
This tutorial demonstrates how to migrate workloads running on a GKE cluster to a new set of nodes within the same cluster without incurring downtime for your application. Such a migration can be useful if you want to migrate your workloads to nodes with a different machine type.
-- GKE: Migrating workloads to different machine types
Please take a look at the guide above and let me know if you have any questions on that topic.

Disable the default-pool's autoscaler and set the pool size to 0 nodes.
Wish there was a way I could just switch the machine type on the default-pool...
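A minimal sketch of that workaround, assuming placeholder cluster and region names: it only builds the two `gcloud` commands (turn off autoscaling for the pool, then resize it to zero) and prints them for you to review and run.

```python
import shlex


def park_pool_commands(cluster, region, pool="default-pool"):
    """Commands to stop autoscaling on a pool and shrink it to zero nodes."""
    return [
        # Disable the autoscaler first, otherwise it may scale the pool back up.
        ["gcloud", "container", "clusters", "update", cluster,
         "--node-pool", pool, "--no-enable-autoscaling", "--region", region],
        # On a regional cluster --num-nodes is the node count per zone.
        ["gcloud", "container", "clusters", "resize", cluster,
         "--node-pool", pool, "--num-nodes", "0", "--region", region],
    ]


if __name__ == "__main__":
    # "my-cluster" and "us-central1" are placeholders.
    for cmd in park_pool_commands("my-cluster", "us-central1"):
        print(" ".join(shlex.quote(part) for part in cmd))
```

The empty default-pool stays in the cluster but no longer runs any nodes.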

Related

Prevent GCP maintenance from restarting GKE cluster

It seems like the GKE cluster gets restarted every week. Is there anything I can do to prevent that from happening? It does migrate pods to other nodes while it performs maintenance on one of the nodes, but I'm not sure whether there is downtime during the migration, and sometimes the pods get stuck in a CrashLoopBackOff or ErrImagePull state.
How does the migration happen during maintenance? When the total number of replicas is just one, does it create a new pod, route the traffic to it, and then delete the old pod? I just want to know if there is downtime. It's a new cluster and monitoring hasn't been set up yet, so I don't know if players are experiencing downtime during maintenance.
Is there a way to prevent GCP from doing maintenance? I used Terraform to create the cluster, so if I can prevent it, I need to do it via Terraform, since GKE nodes can't be edited using the GCP console.
You can configure your maintenance windows and enable/disable automatic node upgrades.
Here's an example of the configuration options in the GCP console:
You can also decide on which release channel you want to be (rapid, regular and stable).
Your Kubernetes control plane will have downtime if you have a zonal cluster. Only regional clusters replicate the control plane.
In terms of your own applications they should have zero downtime and GKE will automatically create new nodes and divert traffic when pods are ready to receive traffic.
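As a rough command-line illustration of those two settings (the asker would still need to express them in their Terraform configuration), the sketch below builds a `gcloud` command that sets a recurring maintenance window and one that disables node auto-upgrade for a pool. The cluster, zone, pool, timestamps and recurrence rule are placeholder assumptions; note also that GKE does not let you turn off node auto-upgrade on clusters enrolled in a release channel.

```python
import shlex


def maintenance_commands(cluster, zone, pool, start, end, recurrence):
    """Commands to set a maintenance window and disable node auto-upgrade."""
    return [
        # Restrict automatic maintenance to a recurring window (RFC 5545 rule).
        ["gcloud", "container", "clusters", "update", cluster, "--zone", zone,
         "--maintenance-window-start", start,
         "--maintenance-window-end", end,
         "--maintenance-window-recurrence", recurrence],
        # Stop GKE from upgrading this pool's nodes automatically.
        ["gcloud", "container", "node-pools", "update", pool,
         "--cluster", cluster, "--zone", zone, "--no-enable-autoupgrade"],
    ]


if __name__ == "__main__":
    # Every value here is a placeholder chosen for illustration.
    for cmd in maintenance_commands(
        cluster="my-cluster",
        zone="us-central1-a",
        pool="default-pool",
        start="2021-01-02T03:00:00Z",
        end="2021-01-02T07:00:00Z",
        recurrence="FREQ=WEEKLY;BYDAY=SA,SU",
    ):
        print(" ".join(shlex.quote(part) for part in cmd))
```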

How to autoscale node pools in Oracle's Container Engine for Kubernetes?

I'm struggling to find a way to automatically scale the number of nodes in a Kubernetes node pool in Oracle's offering. This is something I've successfully used on GKE and AKS to only have GPU-enabled nodes running when needed, which is a massive cost saving.
The documentation only reveals scaling in Instance Pools, which is an Oracle-specific tech not related to Kubernetes.
Is node pool scaling possible in Oracle's Container Engine for Kubernetes?
With regards to autoscaling worker nodes in Kubernetes, most clouds use the open source cluster-autoscaler project.
OCI is in the process of integrating with this project to allow node pool autoscaling, and this feature is expected to be generally available in February or March 2021.

GKE cluster upgrade by switching to a new pool: will inter-cluster service communication fail?

From this article (https://cloudplatform.googleblog.com/2018/06/Kubernetes-best-practices-upgrading-your-clusters-with-zero-downtime.html) I learnt that it is possible to create a new node pool, and cordon and drain old nodes one by one, so that workloads get re-scheduled to new nodes in the new pool.
To me, a new node pool seems to indicate a new cluster. The reason: we have two node pools in GKE, and they're listed as two separate clusters.
My question is: after the pods under a service get moved to a new node, if that service is being called from other pods in the old node, will this inter-cluster service call fail?
You don't create a new cluster per se. You upgrade the master(s) and then you create a new node pool with nodes that have a newer version. Make sure the new node pool shares the same network as the original node pool.
If you have a service with one replica (one pod) and that pod lives on one of the nodes you are upgrading, you need to allow time for Kubernetes to create a new replica on a different node that is not being upgraded. During that time, your service will be unavailable.
If you have a service with multiple replicas, chances are that you won't see any downtime unless, for some odd reason, all your replicas are scheduled on the same node.
Recommendation: scale the resources that serve your services (Deployments, StatefulSets, etc.) up by one or two replicas before doing node upgrades. DaemonSets run one pod per node and cannot be scaled this way.
StatefulSet tip: you will have some write downtime if you are running something like MySQL in a master-slave configuration when the MySQL master is rescheduled.
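The recommendation to add a replica or two before upgrading could look roughly like the sketch below. It takes a mapping of Deployment names to their current replica counts (hypothetical names and numbers) and prints the matching `kubectl scale` commands without running them.

```python
import shlex


def scale_up_commands(deployments, extra=1, namespace="default"):
    """Build kubectl commands that add `extra` replicas to each Deployment."""
    return [
        ["kubectl", "scale", "deployment", name,
         "--replicas=%d" % (replicas + extra), "--namespace", namespace]
        for name, replicas in sorted(deployments.items())
    ]


if __name__ == "__main__":
    # Hypothetical Deployments and their current replica counts.
    current = {"api": 1, "web": 2}
    for cmd in scale_up_commands(current, extra=1):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Scaling back down afterwards is the same command with the original counts.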
Note that creating a new node pool does not create a new cluster. You can have multiple node pools within the same cluster. Workloads in the different node pools will still interact with each other since they are in the same cluster.
gcloud container node-pools create (the command to create node pools) requires that you specify the --cluster flag so that the new node pool is created within an existing cluster.
So to answer the question directly, following the steps from that Google link will not cause any service interruption nor will there be any issues with pods from the same cluster communicating with each other during your migration.

Kubernetes Horizontal Pod Autoscaler not utilising node resources

I am currently running Kubernetes 1.9.7 and successfully using the Cluster Autoscaler and multiple Horizontal Pod Autoscalers.
However, I recently started noticing the HPA would favour newer pods when scaling down replicas.
For example, I have 1 replica of service A running on a node alongside several other services. This node has plenty of available resources. During load, the target CPU utilisation for service A rose above the configured threshold, therefore the HPA decided to scale it to 2 replicas. As there were no other nodes available, the CAS spun up a new node on which the new replica was successfully scheduled - so far so good!
The problem is, when the target CPU utilisation drops back below the configured threshold, the HPA decides to scale down to 1 replica. I would expect to see the new replica on the new node removed, therefore enabling the CAS to turn off that new node. However, the HPA removed the existing service A replica that was running on the node with plenty of available resources. This means I now have service A running on a new node, by itself, that can't be removed by the CAS even though there is plenty of room for service A to be scheduled on the existing node.
Is this a problem with the HPA or the Kubernetes scheduler? Service A has now been running on the new node for 48 hours and still hasn't been rescheduled despite there being more than enough resources on the existing node.
After scouring through my cluster configuration, I managed to come to a conclusion as to why this was happening.
Service A was configured to run on a public subnet and the new node created by the CA was public. The existing node running the original replica of Service A was private, therefore leading the HPA to remove this replica.
I'm not sure how Service A was scheduled onto this node in the first place, but that is a different issue.
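As background to the scale-up and scale-down described in the question, here is a small sketch of the replica calculation documented for the HPA: desired = ceil(current * currentMetric / targetMetric), skipped when the ratio is within a tolerance (0.1 by default). The 50% target and the utilisation figures are made-up numbers, not values from the question. The HPA only produces a replica count; which pod is removed on scale-down is chosen by the workload's controller, not by the HPA.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    """Replica count from the documented HPA formula, with the default tolerance."""
    ratio = current_metric / target_metric
    # Within the tolerance band the HPA leaves the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


if __name__ == "__main__":
    # Hypothetical numbers: target 50% CPU, one replica running at 90%.
    print(desired_replicas(1, 90, 50))
    # Load drops: two replicas averaging 20% against the same 50% target.
    print(desired_replicas(2, 20, 50))
```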

How to assign kubernetes master components to a specific node pool?

For my own deployments I can use node selectors to make them run on a specific node pool. I want to do the same thing for kubernetes master pods.
(Motivation: I upgraded the kubernetes cluster today causing the aforementioned pods to be moved to some random node pool which I want to prevent from happening again in the future.)