ERROR: failed to create cluster while running kind create cluster - kubernetes

When I run kind create cluster on Ubuntu 20.04, I get this error:
Creating cluster "kind" ...
✓ Ensuring node image (kindest/node:v1.21.1) đŸ–ŧ
✓ Preparing nodes đŸ“Ļ
✓ Writing configuration 📜
✗ Starting control-plane 🕹ī¸
ERROR: failed to create cluster: failed to init node with kubeadm: command "docker exec --privileged kind-control-plane kubeadm init --skip-phases=preflight --config=/kind/kubeadm.conf --skip-token-print --v=6" failed with error: exit status 1
Complete logs: https://paste.debian.net/1207493/
What could be the reason for this? I cannot find any relevant solution in the docs or existing GitHub issues.

I had the same problem. After adjusting the Docker memory allocation, the cluster was created successfully.
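If you are unsure how much memory Docker actually has available, a quick check is sketched below; how you raise the allocation depends on whether you run Docker Desktop or native Docker on Linux.
# Memory visible to the Docker daemon (bytes) and to the host
docker info --format '{{.MemTotal}}'
free -h
# With Docker Desktop the allocation is raised under Settings > Resources;
# with native Docker on Linux the daemon can use all host memory, so free up RAM instead.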

Related

How to connect to Dask Jupyter notebook running in minikube from internet?

I have a Dask cluster running in minikube on a remote VM (Oracle Linux, 64 GB RAM, 8 cores). The VM connects to external networks through a proxy.
I want to access the Jupyter notebook provided by Dask from the browser on my local Mac.
I would like to understand what options are available to set up this connection.
Here is what I tried:
minikube start --driver=docker --base-image="gcr.io/k8s-minikube/kicbase:v0.0.29" --memory 32768 --cpus 6
minikube tunnel
helm install mydask dask/dask --set scheduler.serviceType=LoadBalancer --set jupyter.serviceType=LoadBalancer
While this does assign an external IP to the 'mydask-jupyter' service, the IP is not in the same subnet as my VM, so it is not publicly accessible.
Next I tried starting minikube as below:
minikube start --driver=none
However, I am running into other errors:
đŸ’ĸ initialization failed, will try again: wait: /bin/bash -c "sudo env PATH="/var/lib/minikube/binaries/v1.23.1:$PATH" kubeadm init --config /var/tmp/minikube/kubeadm.yaml --ignore-preflight-errors=DirAvailable--etc-kubernetes-manifests,DirAvailable--var-lib-minikube,DirAvailable--var-lib-minikube-etcd,FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml,FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml,FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml,FileAvailable--etc-kubernetes-manifests-etcd.yaml,Port-10250,Swap,Mem": exit status 1
stdout:
[init] Using Kubernetes version: v1.23.1
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
stderr:
[WARNING Firewalld]: firewalld is active, please ensure ports [8443 10250] are open or your cluster may not function correctly
[WARNING Swap]: swap is enabled; production deployments should disable swap unless testing the NodeSwap feature gate of the kubelet
[WARNING Hostname]: hostname "my-dask" could not be reached
[WARNING Hostname]: hostname "my-dask": lookup my-dask on <<IP>> no such host
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-apiserver:v1.23.1: output: Trying to pull repository k8s.gcr.io/kube-apiserver ...
Get "https://k8s.gcr.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
, error: exit status 1
...
My Questions:
What configurations are required to fix this?
Is there a better alternative to minikube that will be convenient for this use-case?
Thank you.
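One option worth trying, which the original poster did not report testing, is to skip the LoadBalancer entirely and forward the Jupyter port through the VM. A minimal sketch, assuming the chart creates a service named mydask-jupyter serving on port 80 (check with kubectl get svc):
# On the VM: forward the in-cluster Jupyter service to localhost:8888 on the VM
kubectl port-forward svc/mydask-jupyter 8888:80
# On the local Mac: tunnel that port over SSH (user and host are placeholders)
ssh -N -L 8888:localhost:8888 <user>@<vm-host>
# Then browse to http://localhost:8888 on the Mac
This avoids depending on minikube tunnel assigning a routable external IP.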

Error: error inspecting object: no such container minikube

I am trying to run minikube on Ubuntu 18.04 and I am getting an error while starting it. Please help. I tried minikube delete and started again, but it doesn't work.
Aspire-E5-573G:~$ minikube start --driver=podman --container-runtime=cri-o
😄 minikube v1.13.0 on Ubuntu 18.04
❗ Using podman 2 is not supported yet. your version is "2.0.6". minikube might not work. use at your own risk.
✨ Using the podman (experimental) driver based on existing profile
👍 Starting control plane node minikube in cluster minikube
💾 Downloading Kubernetes v1.19.0 preload ...
> preloaded-images-k8s-v6-v1.19.0-cri-o-overlay-amd64.tar.lz4: 551.13 MiB /
🔄 Restarting existing podman container for "minikube" ...
đŸ¤Ļ StartHost failed, but will try again: podman inspect ip minikube: sudo -n podman container inspect -f {{.NetworkSettings.IPAddress}} minikube: exit status 125
stdout:
stderr:
Error: error inspecting object: no such container minikube
🔄 Restarting existing podman container for "minikube" ...
đŸ˜ŋ Failed to start podman container. Running "minikube delete" may fix it: podman inspect ip minikube: sudo -n podman container inspect -f {{.NetworkSettings.IPAddress}} minikube: exit status 125
stdout:
stderr:
Error: error inspecting object: no such container minikube
❌ Exiting due to GUEST_PROVISION: Failed to start host: podman inspect ip minikube: sudo -n podman container inspect -f {{.NetworkSettings.IPAddress}} minikube: exit status 125
stdout:
stderr:
Error: error inspecting object: no such container minikube
đŸ˜ŋ If the above advice does not help, please let us know:
👉 https://github.com/kubernetes/minikube/issues/new/choose
As the error already indicates, podman 2 is not yet supported:
Using podman 2 is not supported yet. your version is "2.0.6". minikube might not work. use at your own risk.
The workaround for this, as described here, is to use version 1.9.3.
Here's the merge that was done to warn about podman version 2.
Yes, you are right. I tried using docker as the driver instead and it worked; podman 2 is not yet supported.
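A minimal sketch of that workaround, assuming Docker is installed on the host:
# Remove the broken profile, then start minikube with the docker driver instead of podman
minikube delete
minikube start --driver=docker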

How can I rename master nodes in a HA kubernetes cluster?

I have a kubernetes cluster with 3 master nodes. They are named master-1, master-2 and master-3. I would like to rename them as control-plane-n.
I could not find a clear procedure to do this. The closest one is how to rename a node in a cluster. So I just tried that. Here is what I did (my hosts are running ubuntu 18.04, and kubernetes v1.16.2):
On master-1:
kubectl drain master-3 --ignore-daemonsets
kubectl delete node master-3
Run "kubeadm token create --print-join-command" and copy the output
On master-3:
sudo kubeadm reset
sudo hostnamectl set-hostname control-plane-3
Modify /etc/cloud/cloud.cfg to set preserve_hostname to true
Reboot the VM
Paste in the join command from master-1, with --control-plane option added
Here is the log I got:
sudo kubeadm join 172.22.19.188:6443 --control-plane --token nxxzby.zsfdx86e7cv1rq0e --discovery-token-ca-cert-hash sha256:553366c2f91fd3abffe3e3d1c39d9314e2d73e8a6181f4da9938a8e24fd77456
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks before initializing the new control plane instance
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/data/kubernetes/pki"
error execution phase control-plane-prepare/certs: error creating PKI assets: failed to write or validate certificate "apiserver": certificate apiserver is invalid: x509: certificate is valid for master-3, kubernetes, kubernetes.default, kubernetes.default.svc, kubernetes.default.svc.cluster.local, not control-plane-3
To see the stack trace of this error execute with --v=5 or higher
How can I proceed? Or is there a better approach?
Thanks in advance for any idea or suggestion you can offer.
Based on the comment from zerkms, you can create a 4th node with the desired name, join it, and then remove one of the old nodes from the cluster.
Doing this 3 times, you will end up with all nodes having the desired names.
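A hedged sketch of that rolling replacement with kubeadm; the API server address, token, discovery hash, and certificate key are placeholders that come from your own cluster:
# On an existing control-plane node: re-upload the control-plane certificates
# and note the certificate key it prints, then print a join command
sudo kubeadm init phase upload-certs --upload-certs
kubeadm token create --print-join-command

# On the new machine, with its hostname already set to control-plane-4:
sudo kubeadm join <api-server>:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash> \
    --control-plane --certificate-key <key-printed-above>

# Once the new node is Ready, drain and remove one of the old masters
kubectl drain master-1 --ignore-daemonsets
kubectl delete node master-1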

Minikube won't start on mac

I am trying to start minikube on a Mac. Virtualization is provided by VirtualBox.
$ minikube start
😄 minikube v1.1.0 on darwin (amd64)
đŸ”Ĩ Creating virtualbox VM (CPUs=2, Memory=2048MB, Disk=20000MB) ...
đŸŗ Configuring environment for Kubernetes v1.14.2 on Docker 18.09.6
❌ Unable to load cached images: loading cached images: loading image /Users/paul/.minikube/cache/images/k8s.gcr.io/kube-proxy_v1.14.2: Docker load /tmp/kube-proxy_v1.14.2: command failed: docker load -i /tmp/kube-proxy_v1.14.2
stdout:
stderr: open /var/lib/docker/image/overlay2/layerdb/tmp/write-set-542676317/diff: read-only file system
: Process exited with status 1
đŸ’Ŗ Failed to setup certs: pre-copy: command failed: sudo rm -f /var/lib/minikube/certs/ca.crt
stdout:
stderr: rm: cannot remove '/var/lib/minikube/certs/ca.crt': Input/output error
: Process exited with status 1
đŸ˜ŋ Sorry that minikube crashed. If this was unexpected, we would love to hear from you:
👉 https://github.com/kubernetes/minikube/issues/new
Trying minikube delete followed by minikube start produces the same issue.
Docker is running and is signed in.
I also deleted all machines in virtualbox after minikube delete and still get the same result.
According to "What if I answer a question in a comment?", I am adding an answer as well, since many people don't read comments.
You can try deleting the local config in MINIKUBE_HOME before starting minikube:
rm -rf ~/.minikube
Try
minikube delete
and
minikube start

Joining cluster takes forever

I have set up my master node and I am trying to join a worker node as follows:
kubeadm join 192.168.30.1:6443 --token 3czfua.os565d6l3ggpagw7 --discovery-token-ca-cert-hash sha256:3a94ce61080c71d319dbfe3ce69b555027bfe20f4dbe21a9779fd902421b1a63
However the command hangs forever in the following state:
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
Since this is just a warning, why does it actually fail?
Edit: I noticed the following in my /var/log/syslog:
Mar 29 15:03:15 ubuntu-xenial kubelet[9626]: F0329 15:03:15.353432 9626 server.go:193] failed to load Kubelet config file /var/lib/kubelet/config.yaml, error failed to read kubelet config file "/var/lib/kubelet/config.yaml", error: open /var/lib/kubelet/config.yaml: no such file or directory
Mar 29 15:03:15 ubuntu-xenial systemd[1]: kubelet.service: Main process exited, code=exited, status=255/n/a
Mar 29 15:03:15 ubuntu-xenial systemd[1]: kubelet.service: Unit entered failed state.
First, if you want to see more detail when your worker joins the master, use:
kubeadm join 192.168.1.100:6443 --token m3jfbb.wq5m3pt0qo5g3bt9 --discovery-token-ca-cert-hash sha256:d075e5cc111ffd1b97510df9c517c122f1c7edf86b62909446042cc348ef1e0b --v=2
Using the above command I could see that my worker could not establish a connection with the master, so I just stopped the firewall:
systemctl stop firewalld
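If stopping the firewall entirely is not an option, opening the ports kubeadm needs is usually enough. A sketch for firewalld; the port list here covers the common defaults, adjust for your setup:
sudo firewall-cmd --permanent --add-port=6443/tcp
sudo firewall-cmd --permanent --add-port=10250/tcp
sudo firewall-cmd --reload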
This can be solved by creating a new token using this command:
kubeadm token create --print-join-command
and using the generated join command to join other nodes to the cluster.
The problem had to do with kubeadm not installing a CNI-compatible networking solution out of the box.
Without this step, the Kubernetes nodes and master are unable to establish any form of communication.
The following task addressed the issue:
- name: kubernetes.yml --> Install Flannel
  shell: kubectl -n kube-system apply -f https://raw.githubusercontent.com/coreos/flannel/bc79dd1505b0c8681ece4de4c0d86c5cd2643275/Documentation/kube-flannel.yml
  become: yes
  environment:
    KUBECONFIG: "/etc/kubernetes/admin.conf"
  when: inventory_hostname in (groups['masters'] | last)
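Outside of Ansible, the same step is just the kubectl command from that task, run once against the cluster with the admin kubeconfig:
export KUBECONFIG=/etc/kubernetes/admin.conf
kubectl -n kube-system apply -f https://raw.githubusercontent.com/coreos/flannel/bc79dd1505b0c8681ece4de4c0d86c5cd2643275/Documentation/kube-flannel.yml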
I did get the same error on CentOS 7, but in my case the join command worked without problems, so it was indeed just a warning.
> [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
> [preflight] Reading configuration from the cluster...
> [preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
> [kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.14" ConfigMap in the kube-system namespace
As the official documentation mentions, there are two common issues that make the init hang (I guess it also applies to the join command):
> the default cgroup driver configuration for the kubelet differs from that used by Docker. Check the system log file (e.g. /var/log/message) or examine the output from journalctl -u kubelet. If you see something like the following: ...
First try the steps from the official documentation, and if that does not work, please provide more information so we can troubleshoot further if needed.
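If the cgroup driver mismatch is the problem, a commonly used fix is to switch Docker to the systemd driver and restart the services; a sketch, so check the Docker and kubeadm docs for your versions:
cat <<EOF | sudo tee /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
sudo systemctl restart docker
sudo systemctl restart kubelet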
I had a bunch of k8s deployment scripts that broke recently with this same error message... it looks like Docker changed its install. Try this:
Previous install:
apt-get install docker-ce
Updated install:
apt-get install docker-ce docker-ce-cli containerd.io
How is /var/lib/kubelet/config.yaml created?
Regarding the /var/lib/kubelet/config.yaml: no such file or directory error.
Below are the steps that should occur on the worker node in order for the mentioned file to be created.
1) The creation of the /var/lib/kubelet/ folder. It is created when the kubelet service is installed, as mentioned here:
sudo apt-get update && sudo apt-get install -y apt-transport-https curl
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
cat <<EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
2) The creation of config.yaml. The kubeadm join flow has to take place: when you run kubeadm join, kubeadm uses the Bootstrap Token credential to perform a TLS bootstrap, which fetches the credential needed to download the kubelet-config-1.X ConfigMap and writes it to /var/lib/kubelet/config.yaml.
After a successful execution you should see the logs below:
.
.
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
.
.
So, after these 2 steps you should have /var/lib/kubelet/config.yaml in place.
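A quick way to verify this on the worker node is to check for the file and look at the kubelet logs, for example:
ls -l /var/lib/kubelet/config.yaml
systemctl status kubelet
sudo journalctl -u kubelet --no-pager | tail -n 20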
Failure of the kubeadm join flow
In your case, it seems that the kubeadm join flow failed, which might happen for multiple reasons, such as a bad iptables configuration, ports that are already in use, a container runtime that is not installed properly, etc., as described here and here.
As far as I know, the fact that no CNI-compatible networking solution was in place should not affect the creation of /var/lib/kubelet/config.yaml:
A) We can see under the kubeadm preflight checks which issues will cause the join phase to fail.
B) I also tested this by removing the CNI solution I was using (Calico), then running kubeadm reset and kubeadm join again: no errors appeared in the kubeadm logs (I got the successful execution logs mentioned above) and /var/lib/kubelet/config.yaml was created properly.
(*) Of course, the cluster can't function in this state; I just wanted to emphasize that I think the problem was one of the issues mentioned in A.
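For reference, the reset-and-rejoin cycle mentioned in B looks like this on the worker; the API server address, token, and hash are placeholders from your own cluster:
sudo kubeadm reset -f
sudo kubeadm join <api-server>:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash> --v=2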