Change Container Runtime without destroying cluster - kubernetes

We are running multiple kubespray-deployed clusters with 10-100 nodes.
With 1.20, Kubernetes deprecates dockershim support -> https://github.com/kubernetes/kubernetes/blob/ab32085bf36fc7af1ded30456e2f09399dc1115f/CHANGELOG/CHANGELOG-1.20.md#deprecation
How can we change the container runtime to containerd - without removing nodes and without destroying the masters?

I am not panicking, just want to be prepared. We are at 1.19 already, so 1.22 is not so far away.
Anyway, I tested it with a smaller cluster, and it was way easier than expected:
Change container_manager to containerd.
Run the kubespray cluster.yml playbook over all nodes, and boom.
I only needed a simple Ansible playbook to uninstall Docker and friends afterwards, but it also works with Docker still installed.
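For reference, the whole change boils down to something like this (the inventory path is an example, not from my actual setup):

# in the kubespray inventory group vars, e.g. inventory/mycluster/group_vars/k8s_cluster/k8s-cluster.yml
container_manager: containerd

# then re-run the cluster playbook over all nodes
ansible-playbook -i inventory/mycluster/hosts.yaml -b cluster.yml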

Please treat this answer as friendly advice.
First of all, as suggested in yesterday's fresh article Don't Panic: Kubernetes and Docker:
You do not need to panic :)
Kubernetes is only deprecating Docker as a container runtime after v1.20. They are currently only planning to remove Docker runtime support in the 1.22 release in late 2021 (almost a year away!), so please don't break your 100-node clusters before a working solution appears :)

Related

Can I install Kubernetes on Amazon Linux 2?

I'm having trouble installing kubeadm on my Amazon Linux 2 instance, specifically when I try to create a cluster.
When I try installing a container runtime I get to choose which one to use:
containerd
CRI-O
Docker Engine
Mirantis Container Runtime
First of all, I'm wondering which one of them I should use that is compatible with Amazon Linux 2, and second of all, whenever I run yum install for any CRI I get this same error:
This is the output of the command yum install cri-o:
The doc that I followed is: https://kubernetes.io/docs/setup/production-environment/container-runtimes/
Hi, hope you are enjoying your Kubernetes journey!
First off, I want to tell you that you can use whichever container runtime you want.
You can use Docker if you are not familiar with the others, but containerd is in my opinion the best lightweight alternative. (containerd is used inside Docker, but for Kubernetes you don't need all the layers that Docker provides, only the container runtime itself, which here is containerd.) You can read this for more info, but there is plenty of documentation about this: https://www.tutorialworks.com/difference-docker-containerd-runc-crio-oci/
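To make the layering concrete: once containerd is running, you can talk to it directly with its own CLI, no Docker daemon involved. A minimal sketch (the k8s.io namespace is where a kubelet normally keeps its containers):

# list the containers containerd manages for Kubernetes - Docker is not involved
sudo ctr --namespace k8s.io containers list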
Second of all, I don't know how you are trying to install your Kubernetes cluster, but there are a few ways to do it:
The hardest but very instructive one is Kubernetes the Hard Way (https://github.com/kelseyhightower/kubernetes-the-hard-way).
Next, you can use kubeadm; again there is plenty of documentation on the internet, but you can follow one of the kubeadm tutorials (https://devopscube.com/setup-kubernetes-cluster-kubeadm/) - a quick sketch follows below this list.
Here is a list of tools that you can use to install your Kubernetes cluster; you can look for tutorials for each of them on the internet: https://dzone.com/articles/50-useful-kubernetes-tools
Last but not least, since you are on AWS, you can use the AWS EKS service to quickly set up a robust Kubernetes cluster (https://aws.amazon.com/fr/eks/).
This is for AWS. If you want a local k8s cluster, I strongly suggest you use kind (Kubernetes in Docker).
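Coming back to the kubeadm option above, a rough sketch of the happy path on an Amazon Linux 2 instance with containerd (the package sources and socket path here are assumptions on my side; follow the container-runtimes doc you linked for the exact repo setup):

# containerd ships with the docker extra on Amazon Linux 2
sudo amazon-linux-extras install -y docker
sudo systemctl enable --now containerd

# install kubeadm/kubelet/kubectl from the Kubernetes yum repo (repo must be configured first)
sudo yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
sudo systemctl enable --now kubelet

# initialize the control plane against the containerd socket
sudo kubeadm init --cri-socket unix:///run/containerd/containerd.sock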
Bguess

Specify containerd version on minikube or kind

I am trying to reproduce an issue that requires me to use containerd v1.4.4 as my container runtime and Kubernetes v1.19.8. When I try to use minikube to create a multi-node cluster locally, it allows me to specify the Kubernetes version, but I am unable to specify the containerd version (i.e. it always uses v1.4.9), and based on this GitHub discussion, it doesn't seem to support that. I then turned to kind but was unable to find a way to specify it in the documentation either. Is there a way, in either kind or minikube, to specify the containerd version?
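For context, a command along the lines of what I ran - minikube pins the Kubernetes version and selects the runtime type, but not the runtime's own version:

# pins the Kubernetes version and selects containerd, but containerd itself
# stays at whatever version the minikube node image ships
minikube start --nodes 2 --kubernetes-version=v1.19.8 --container-runtime=containerd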
I ended up using kubeadm and set up a master and a worker node using 2 VMs. This allowed me to specify the versions I wanted on the worker node. Building a custom base image for kind should also work, as user Mikolaj.S mentioned.
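For the kind route, the rough shape is the following (a sketch only - the build flags have changed between kind releases, so treat the flag names as assumptions and check the docs of the version you clone):

# clone kind and pin containerd v1.4.4 in its base-image Dockerfile
git clone https://github.com/kubernetes-sigs/kind
cd kind
# edit images/base/Dockerfile so it installs containerd v1.4.4, then build the base image
docker build -t local/kind-base:containerd-1.4.4 images/base

# build a node image for Kubernetes v1.19.8 on top of the custom base and use it
kind build node-image --base-image local/kind-base:containerd-1.4.4 --image local/kind-node:v1.19.8
kind create cluster --image local/kind-node:v1.19.8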

Kubectl connection refused existing cluster

Hope someone can help me.
To describe the situation briefly: I have a self-managed k8s cluster running on 3 machines (1 master, 2 worker nodes). In order to make it HA, I attempted to add a second master to the cluster.
After some failed attempts, I found out that I needed to add a controlPlaneEndpoint configuration to the kubeadm-config ConfigMap. So I did, with masternodeHostname:6443.
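For reference, the setting I added sits under the ClusterConfiguration section of that ConfigMap:

# kubectl -n kube-system edit configmap kubeadm-config
# then, under ClusterConfiguration:
controlPlaneEndpoint: "masternodeHostname:6443"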
I generated the certificate and join command for the second master, and after running it on the second master machine, it failed with
error execution phase control-plane-join/etcd: error creating local etcd static pod manifest file: timeout waiting for etcd cluster to be available
Checking the first master now, I get connection refused for its IP on port 6443, so I cannot run any kubectl commands.
I tried recreating the .kube folder, with all the config copied there - no luck.
I restarted kubelet and Docker.
The containers running on the cluster seem OK, but I am locked out of any cluster configuration (the dashboard is down, kubectl commands are not working).
Is there any way to make it work again, without losing any of the configuration or the deployments already present?
Thanks! Sorry if it’s a noob question.
Cluster information:
Kubernetes version: 1.15.3
Cloud being used: (put bare-metal if not on a public cloud) bare-metal
Installation method: kubeadm
Host OS: RHEL 7
CNI and version: weave 0.3.0
CRI and version: containerd 1.2.6
This is an old, known problem with Kubernetes 1.15 [1,2].
It is caused by a short etcd timeout period. As far as I'm aware it is a hard-coded value in the source and cannot be changed (a feature request to make it configurable is open for version 1.22).
Your best bet would be to upgrade to a newer version and recreate your cluster.
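Before rebuilding, it is worth confirming that this is really where things are stuck. Since your CRI is containerd, crictl is the tool to poke at the static pods (a sketch; container names can differ on your nodes):

# is the kube-apiserver static pod container up at all?
sudo crictl ps -a | grep kube-apiserver

# etcd logs usually show why joining the second member timed out
sudo crictl logs $(sudo crictl ps -a --name etcd -q | head -n 1)

# kubelet logs show why a static pod keeps crashing
sudo journalctl -u kubelet --no-pager | tail -n 50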

GPU worker node unable to join cluster

I have an EKS setup (v1.16) with 2 ASGs: one for compute ("c5.9xlarge") and the other for GPU ("p3.2xlarge").
Both are configured as Spot and set with desiredCapacity 0.
The K8s Cluster Autoscaler works as expected and scales out each ASG when necessary; the issue is that the newly created GPU instance is not recognized by the master, and running kubectl get nodes shows nothing.
I can see that the EC2 instance is in the Running state, and I can also SSH into the machine.
I double-checked the labels and tags and compared them to the "compute" ones.
Both are configured almost identically; the only difference is that the GPU nodegroup has a few additional tags.
Since I'm using the eksctl tool (v0.35.0) and the compute nodeGroup vs. GPU nodeGroup is basically copy&paste, I can't figure out what the problem could be.
UPDATE:
SSHing into the instance, I could see the following error (/var/log/messages):
failed to run Kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"
and the kubelet service crashed.
Could it be that my GPU nodegroup uses the wrong AMI (amazon-eks-gpu-node-1.18-v20201211)?
As a simple workaround, you can use preBootstrapCommands in the eksctl YAML config file:
- name: test-node-group
  preBootstrapCommands:
    - "sed -i 's/cgroupDriver:.*/cgroupDriver: cgroupfs/' /etc/eksctl/kubelet.yaml"
There is a known issue with EKS 1.16; even Graviton-processor machines won't join the cluster. To fix it, first try upgrading your CNI version. Please refer to the documentation here:
https://docs.aws.amazon.com/eks/latest/userguide/cni-upgrades.html
And if that doesn't work, upgrade your EKS cluster to the latest available version; then it should work.
I've found the issue. It seems to be a misalignment between eksctl (v0.35.0) and the AL2-GPU AMI.
The AWS team changed the cgroup driver in Docker to "systemd" instead of "cgroupfs" (github), while the eksctl version I used hadn't absorbed the change.
A temporary solution is to edit the /etc/eksctl/kubelet.yaml file using preBootstrapCommands.
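Putting the workaround in context, the GPU nodegroup in the eksctl config file would look roughly like this (the instance type and desiredCapacity come from my setup; the name, region, and sizes are illustrative):

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-cluster   # illustrative
  region: us-east-1  # illustrative
nodeGroups:
  - name: gpu-spot
    instanceType: p3.2xlarge
    desiredCapacity: 0
    minSize: 0
    maxSize: 4       # illustrative
    preBootstrapCommands:
      # force kubelet's cgroup driver to match the Docker daemon on the GPU AMI
      - "sed -i 's/cgroupDriver:.*/cgroupDriver: cgroupfs/' /etc/eksctl/kubelet.yaml"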

How to install Kubernetes v1.10.11 on a GCP cluster?

There was recently a Kubernetes security hole that was patched in v1.10.11 (among other versions), so I would like to upgrade to that version. I am currently on v1.10.9. However, when running the command gcloud container get-server-config to get the list of valid node versions, v1.10.11 doesn't show up. Instead, it jumps straight from v1.10.9 to v1.11.2.
Does anyone have any idea why I cannot seem to use the usual gcloud container clusters upgrade [CLUSTER_NAME] --cluster-version [CLUSTER_VERSION] to upgrade to this version?
Thanks in advance!
Based on https://cloud.google.com/kubernetes-engine/docs/security-bulletins#december-3-2018:
If you have Kubernetes v1.10.9, you should (to patch this security hole) update your GKE cluster to 1.10.9-gke.5.
The following Kubernetes versions are now available for new clusters and for opt-in master upgrades for existing clusters:
1.9.7-gke.11,
1.10.6-gke.11,
1.10.7-gke.11,
1.10.9-gke.5,
1.11.2-gke.18
Please also check the Scheduled master auto-upgrades option in GKE.
If it is enabled, your cluster masters were auto-upgraded by Google, and the next possible version to upgrade to is a later one, i.e. v1.11.2, which is what GKE is showing you.
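For example, checking the available versions and performing the upgrade would look like this (CLUSTER_NAME and the zone are placeholders):

# list the master/node versions GKE currently offers in your zone
gcloud container get-server-config --zone us-central1-a

# upgrade the master to the patched GKE release...
gcloud container clusters upgrade CLUSTER_NAME --zone us-central1-a --master --cluster-version 1.10.9-gke.5

# ...then upgrade the node pools to match
gcloud container clusters upgrade CLUSTER_NAME --zone us-central1-a --cluster-version 1.10.9-gke.5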