terraform: how to upgrade Azure AKS default nodepool VM size without replacement of the cluster - kubernetes

I'm trying to upgrade the VM size of my AKS cluster using this approach with Terraform. Basically I create a new node pool with the required number of nodes, then I cordon the old nodes to prevent new pods from being scheduled there. Next, I drain the old nodes so that all their pods are rescheduled onto the newly created nodes. Then I proceed to upgrade the VM size.
The problem I am facing is that the azurerm_kubernetes_cluster resource creates the default_node_pool, while a separate resource, azurerm_kubernetes_cluster_node_pool, lets me create additional node pools with extra nodes.
Everything works up to creating the new node pool and cordoning and draining the old one. However, when I change default_node_pool.vm_size and run terraform apply, it tells me that the whole cluster resource has to be recreated, including the new node pool I just created, because that pool is linked to the cluster ID, which will be replaced.
How am I supposed to manage this upgrade with Terraform, following the documented approach, if changing the default node pool always forces a replacement even when a new node pool is already in place?
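For reference, here is a minimal sketch of the layout I'm describing (resource names, VM sizes and node counts are placeholders, not my real values):

provider "azurerm" {
  features {}
}

resource "azurerm_resource_group" "example" {
  name     = "example-rg"
  location = "West Europe"
}

resource "azurerm_kubernetes_cluster" "example" {
  name                = "example-aks"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name
  dns_prefix          = "example-aks"

  # Changing vm_size here is what forces replacement of the whole cluster
  default_node_pool {
    name       = "default"
    node_count = 3
    vm_size    = "Standard_DS2_v2"
  }

  identity {
    type = "SystemAssigned"
  }
}

# New pool created to hold the workloads while the default pool is upgraded
resource "azurerm_kubernetes_cluster_node_pool" "extra" {
  name                  = "extrapool"
  kubernetes_cluster_id = azurerm_kubernetes_cluster.example.id # replaced together with the cluster
  vm_size               = "Standard_DS3_v2"
  node_count            = 3
}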
terraform version
Terraform v1.1.7
on linux_amd64
+ provider registry.terraform.io/hashicorp/azurerm v2.82.0
+ provider registry.terraform.io/hashicorp/local v2.2.2

Related

Assign Gitlab Runner daemon's pod and the jobs' pods to two separate node groups in Kubernetes when using the Kubernetes executor

We're using Gitlab Runner with the Kubernetes executor, and we want something that I think is currently not possible: assign the Gitlab Runner daemon's pod to a specific node group's workers (instance type X) and the jobs' pods to a different node group's workers (instance type Y), since the jobs usually require more compute resources than the runner's pod.
The motivation is cost savings: the node where the Gitlab Runner daemon is always running should be a cheap instance, while the jobs, which need more compute capacity, can run on instances of a different type that are started by the Cluster Autoscaler and destroyed again when no jobs are present.
I investigated this, and the available way to assign pods to specific nodes is to use a node selector or node affinity. However, the rules in these two configuration sections apply to all of the Gitlab Runner's pods, both the main pod and the jobs' pods. What I'm looking for is a way to apply two separate configurations, one for the Gitlab Runner's pod and one for the jobs' pods.
The existing config consists of the node selector and node/pod affinity, but as mentioned these apply globally to all pods rather than to specific ones, as we need in our case.
Gitlab Runner Kubernetes Executor Config: https://docs.gitlab.com/runner/executors/kubernetes.html
This problem is solved! After further investigation I found that the Gitlab Runner Helm chart provides two nodeSelector settings that do exactly what I was looking for: one for the main Gitlab Runner pod and one for the jobs' pods. Below is a sample of the Helm chart values showing where each nodeSelector lives and which pod it affects.
Note that the top-level nodeSelector affects the main Gitlab Runner pod, while runners.kubernetes.node_selector affects the jobs' pods.
gitlabUrl: https://gitlab.com/
...
# nodeSelector for the main Gitlab Runner pod
nodeSelector:
  gitlab-runner-label-example: label-values-example-0
...
runnerRegistrationToken: ****
...
runners:
  config: |
    [[runners]]
      name = "gitlabRunnerExample"
      executor = "kubernetes"
      environment = ["FF_USE_LEGACY_KUBERNETES_EXECUTION_STRATEGY=true"]
      [runners.kubernetes]
        ...
        # node_selector for the jobs' pods
        [runners.kubernetes.node_selector]
          "gitlab-runner-label-example" = "label-values-example-1"
      [runners.cache]
        ...
        [runners.cache.s3]
          ...
...
Using the Helm chart, there is an additional configuration section where you can specify, well, additional configuration.
Among these options are, in particular, a node selector for the jobs' pods and, separately, tolerations.
The combination of that and some namespace-level config should allow you to run the two kinds of pods on different node types.

Automate GKE Node migration using terraform

I am using Terraform to provision GKE clusters. Currently, if we have to modify node pool properties, we do it manually (create a new node pool, cordon the old node pool, drain its nodes).
Is there a way I can automate this using Terraform itself? Currently, if I change any of these properties, Terraform destroys and recreates the entire node pool.
You can use the
lifecycle {
  create_before_destroy = true
}
block in Terraform, so it will create the new node pool first and only then stop the old instances / node pool.
Read more about it: https://www.terraform.io/language/meta-arguments/lifecycle
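As a rough sketch of how that could look on a GKE node pool (the names and values are illustrative, and a google_container_cluster.primary resource is assumed to exist elsewhere in the configuration). Note that the new pool needs a name that doesn't clash with the one being destroyed, which name_prefix takes care of:

resource "google_container_node_pool" "primary_nodes" {
  name_prefix = "primary-pool-" # generated name, so the replacement can coexist with the old pool
  cluster     = google_container_cluster.primary.name
  location    = "europe-west2"
  node_count  = 2

  node_config {
    machine_type = "e2-standard-4"
  }

  lifecycle {
    create_before_destroy = true # build the replacement pool before tearing down the old one
  }
}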

Migrating from the Docker runtime to the Containerd runtime on a specific NODE kubernetes

I would like to know if it's possible to apply the --image-type COS_CONTAINERD parameter from the Google Cloud command line, but only on a specific node.
I found the command below:
gcloud container clusters upgrade CLUSTER_NAME --image-type COS_CONTAINERD [--node-pool POOL_NAME]
But it will update my entire node pool, which will cause a disturbance to my web services.
Every node in my pool will be updated at the same time (destroyed and re-created).
Is there another way?
I would like to know if you have run into the same issue; it would help me during this migration.
Thanks for sharing your experience.
The command you are mentioning will update all the nodes within the node pool.
I would suggest creating a new node pool on your cluster with the COS_CONTAINERD image type, and once the new cos-containerd node pool exists, migrating your workloads to its node(s) by following this process. This will also help you manage the downtime.
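If the cluster happens to be managed with Terraform (as in the related question above), the new pool could be declared roughly like this; this is only a sketch, with illustrative names and an assumed google_container_cluster.primary resource:

resource "google_container_node_pool" "containerd_pool" {
  name       = "containerd-pool"
  cluster    = google_container_cluster.primary.name
  location   = "europe-west1"
  node_count = 3

  node_config {
    machine_type = "e2-standard-4"
    image_type   = "COS_CONTAINERD" # containerd-based Container-Optimized OS image
  }
}

Once the new pool's nodes are Ready, cordon and drain the old pool's nodes so the workloads reschedule onto the new pool, then delete the old pool.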

Kubernetes V1.16.8 doesn't support 'node-role' label using "--node-labels=node-role.kubernetes.io/master="

Upgrading a Kube-aws v1.15.5 cluster to the next version, v1.16.8.
Use Case:
I want to keep the same node labels for the master and worker nodes as I'm using in v1.15.
When I tried to upgrade the cluster to v1.16, --node-labels was restricted from setting the 'node-role' label.
If I keep the label "node-role.kubernetes.io/master", the kubelet fails to start after the upgrade. If I remove the label, the kubectl get nodes output shows <none> as the role for the upgraded node.
How do I reproduce?
Before the upgrade I took a backup of the kubelet config ('cp /etc/sysconfig/kubelet /etc/sysconfig/kubelet-bkup') and removed "-role" from it. Once the upgrade completed, I moved the edited file back into place ('mv /etc/sysconfig/kubelet-bkup /etc/sysconfig/kubelet'). Now I am able to see the node role as Master/Worker even after a kubelet service restart.
The problem I'm facing now:
I performed the upgrade on the existing cluster successfully. The cluster is running in AWS using the Kube-aws model, so the ASG spins up a new node whenever the Cluster Autoscaler triggers it.
But the new node fails to join the cluster, since the node label "node-role.kubernetes.io/master" still exists in the code base.
How can I add the node-role label dynamically when the ASG scales out? Any solution would be appreciated.
Note:
(kubeadm, kubelet, kubectl) - v1.16.8
I have sorted out the issue. I created a Python script that watches node events: whenever the ASG spins up a new node, the node joins the cluster with an empty role (""), and the Python code then adds the appropriate label to the node dynamically.
I have also created a Docker image based on that Python script; it runs as a pod deployed into the cluster and does the job of labelling the new nodes.
See my solution in this GitHub issue:
https://github.com/kubernetes/kubernetes/issues/91664
The Docker image I created is publicly available:
https://hub.docker.com/r/shaikjaffer/node-watcher
Thanks,
Jaffer

RouteController failed to create a route on GKE

I have a cluster on GKE whose node pool I create when I want to use the cluster, and delete when I'm done with it.
It's a two-node cluster with the master in europe-west2-a and node zones europe-west2-a and europe-west2-b.
The most recent creation resulted in the node in zone B failing with NetworkUnavailable because RouteController failed to create a route. The reported reason was: Could not create route xxx 10.244.1.0/24 for node xxx after 342.263706ms: instance not found.
Why would this be happening all of a sudden, and what can I do to fix it?!
You didn't mention which version of GKE you are using, so just for clarification:
Changes in access scopes
Beginning with Kubernetes version 1.10, gcloud and the GCP Console no longer grant the compute-rw access scope on new clusters and new node pools by default. Furthermore, if --scopes is specified in gcloud container clusters create, gcloud no longer silently adds compute-rw or storage-ro.
In any case, you can still revert to the legacy access scopes, but this is not a recommended approach.
Hope this helps.
With GKE 1.13.6-gke.13, some of the default scopes were changed, including the removal of the compute-rw scope. I think that, due to the age of the cluster, this scope was necessary for a route to be created correctly between the nodes in a node pool.
In the end, my gcloud creation command had these scopes:
--scopes https://www.googleapis.com/auth/projecthosting,storage-rw,monitoring,trace,compute-rw
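For anyone creating the node pool with Terraform rather than gcloud, the same scopes can be set through node_config.oauth_scopes. A rough sketch (the pool name, zone, machine type and the google_container_cluster.primary reference are placeholders):

resource "google_container_node_pool" "nodes" {
  name       = "pool-1"
  cluster    = google_container_cluster.primary.name
  location   = "europe-west2-a"
  node_count = 2

  node_config {
    machine_type = "e2-standard-2"
    oauth_scopes = [
      "https://www.googleapis.com/auth/projecthosting",
      "https://www.googleapis.com/auth/devstorage.read_write", # storage-rw
      "https://www.googleapis.com/auth/monitoring",
      "https://www.googleapis.com/auth/trace.append",          # trace
      "https://www.googleapis.com/auth/compute",               # compute-rw
    ]
  }
}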