I'm trying to run k3s in rootless mode. So far, I've followed the common steps from https://rootlesscontaine.rs/getting-started and used the unit file from https://github.com/k3s-io/k3s/blob/master/k3s-rootless.service
The systemd service k3s-rootless.service is active and running, but the pods are constantly in Pending status.
I get these messages:
jun 21 20:43:58 k3s-tspd.local k3s[1065]: E0621 20:43:58.647601 33 controller.go:116] loading OpenAPI spec for "v1beta1.metrics.k8s.io" failed with: failed to retrieve openAPI spec, http error: ResponseCode: 503, Body: service unavailable
jun 21 20:43:58 k3s-tspd.local k3s[1065]: , Header: map[Content-Type:[text/plain; charset=utf-8] X-Content-Type-Options:[nosniff]]
jun 21 20:43:58 k3s-tspd.local k3s[1065]: I0621 20:43:58.647876 33 controller.go:129] OpenAPI AggregationController: action for item v1beta1.metrics.k8s.io: Rate Limited Requeue.
jun 21 20:43:59 k3s-tspd.local k3s[1065]: time="2022-06-21T20:43:59Z" level=info msg="Waiting for control-plane node k3s-tspd.local startup: nodes \"k3s-tspd.local\" not found"
jun 21 20:44:00 k3s-tspd.local k3s[1065]: time="2022-06-21T20:44:00Z" level=info msg="Waiting for control-plane node k3s-tspd.local startup: nodes \"k3s-tspd.local\" not found"
jun 21 20:44:00 k3s-tspd.local k3s[1065]: time="2022-06-21T20:44:00Z" level=info msg="certificate CN=k3s-tspd.local signed by CN=k3s-server-ca#1655821591: notBefore=2022-06-21 14:26:31 +0000 UTC notAfter=2023-06-21 20:44:00 +0000 UTC"
jun 21 20:44:00 k3s-tspd.local k3s[1065]: time="2022-06-21T20:44:00Z" level=info msg="certificate CN=system:node:k3s-tspd.local,O=system:nodes signed by CN=k3s-client-ca#1655821591: notBefore=2022-06-21 14:26:31 +0000 UTC notAfter=2023-06-21 20:44:00 +0000 UTC"
jun 21 20:44:00 k3s-tspd.local k3s[1065]: time="2022-06-21T20:44:00Z" level=info msg="Waiting to retrieve agent configuration; server is not ready: \"fuse-overlayfs\" snapshotter cannot be enabled for \"/home/scadauser/.rancher/k3s/agent/containerd\", try using \"native\": fuse-overlayfs not functional, make sure running with kernel >= 4.18: failed to mount fuse-overlayfs ({Type:fuse3.fuse-overlayfs Source:overlay Options:[lowerdir=/home/scadauser/.rancher/k3s/agent/containerd/fuseoverlayfs-check751772682/lower2:/home/scadauser/.rancher/k3s/agent/containerd/fuseoverlayfs-check751772682/lower1]}) on /home/scadauser/.rancher/k3s/agent/containerd/fuseoverlayfs-check751772682/merged: mount helper [mount.fuse3 [overlay /home/scadauser/.rancher/k3s/agent/containerd/fuseoverlayfs-check751772682/merged -o lowerdir=/home/scadauser/.rancher/k3s/agent/containerd/fuseoverlayfs-check751772682/lower2:/home/scadauser/.rancher/k3s/agent/containerd/fuseoverlayfs-check751772682/lower1 -t fuse-overlayfs]] failed: \"\": exec: \"mount.fuse3\": executable file not found in $PATH"
jun 21 20:44:01 k3s-tspd.local k3s[1065]: time="2022-06-21T20:44:01Z" level=info msg="Waiting for control-plane node k3s-tspd.local startup: nodes \"k3s-tspd.local\" not found"
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system helm-install-traefik-hn2nn 0/1 Pending 0 5h5m
kube-system helm-install-traefik-crd-djr4j 0/1 Pending 0 5h5m
kube-system local-path-provisioner-6c79684f77-w7fjb 0/1 Pending 0 5h5m
kube-system metrics-server-7cd5fcb6b7-rlctn 0/1 Pending 0 5h5m
kube-system coredns-d76bd69b-mjj4m 0/1 Pending 0 15m
What should I do next?
The solution was quite obvious.
In the unit file k3s-rootless.service I used the wrong snapshotter. For containerd in k3s rootless mode it has to be '--snapshotter=fuse-overlayfs'.
fuse-overlayfs also needs to be installed before running k3s in rootless mode.
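A quick preflight check along the following lines can confirm that the helper binaries are on PATH before the service is started. This is only a sketch: the binary names `fuse-overlayfs` and `mount.fuse3` are taken from the fix and the log message above, and the function name `missing_binaries` is made up for illustration.

```python
import shutil


def missing_binaries(names, which=shutil.which):
    """Return the entries of `names` that cannot be found on PATH.

    `which` is injectable so the check can be exercised without
    touching the real system.
    """
    return [name for name in names if which(name) is None]


if __name__ == "__main__":
    # Binary names taken from the k3s log above; adjust for your distro.
    required = ["fuse-overlayfs", "mount.fuse3"]
    missing = missing_binaries(required)
    if missing:
        print("not found on PATH:", ", ".join(missing))
    else:
        print("all rootless snapshotter helpers found")
```

Run it as the same user that owns the rootless service, since that user's PATH is what k3s will see.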
I've been trying to troubleshoot my Kubernetes cluster since my master node is NotReady. I've followed guides on StackOverflow and the Kubernetes troubleshooting guide, but I am not able to pinpoint the issue. I'm relatively new to Kubernetes.
Here's what I have tried:
#kubectl get nodes
NAME STATUS ROLES AGE VERSION
NodeName NotReady master 213d v1.16.2
#kubectl describe node NodeName
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
NetworkUnavailable False Tue, 12 Jan 2021 15:41:24 +0530 Tue, 12 Jan 2021 15:41:24 +0530 CalicoIsUp Calico is running on this node
MemoryPressure Unknown Fri, 15 Jan 2021 16:40:54 +0530 Fri, 15 Jan 2021 16:48:07 +0530 NodeStatusUnknown Kubelet stopped posting node status.
DiskPressure Unknown Fri, 15 Jan 2021 16:40:54 +0530 Fri, 15 Jan 2021 16:48:07 +0530 NodeStatusUnknown Kubelet stopped posting node status.
PIDPressure Unknown Fri, 15 Jan 2021 16:40:54 +0530 Fri, 15 Jan 2021 16:48:07 +0530 NodeStatusUnknown Kubelet stopped posting node status.
Ready Unknown Fri, 15 Jan 2021 16:40:54 +0530 Fri, 15 Jan 2021 16:48:07 +0530 NodeStatusUnknown Kubelet stopped posting node status.
# sudo journalctl -u kubelet -n 100 --no-pager
Feb 26 12:23:03 devportal-test kubelet[11311]: E0226 12:23:03.581359 11311 kubelet.go:2267] node "devportal-test" not found
Feb 26 12:23:03 devportal-test kubelet[11311]: E0226 12:23:03.681814 11311 kubelet.go:2267] node "devportal-test" not found
Feb 26 12:23:03 devportal-test kubelet[11311]: E0226 12:23:03.782649 11311 kubelet.go:2267] node "devportal-test" not found
Feb 26 12:23:03 devportal-test kubelet[11311]: E0226 12:23:03.883846 11311 kubelet.go:2267] node "devportal-test" not found
Feb 26 12:23:03 devportal-test kubelet[11311]: I0226 12:23:03.912585 11311 kubelet_node_status.go:286] Setting node annotation to enable volume controller attach/detach
Feb 26 12:23:03 devportal-test kubelet[11311]: I0226 12:23:03.918664 11311 kubelet_node_status.go:72] Attempting to register node devportal-test
Feb 26 12:23:03 devportal-test kubelet[11311]: E0226 12:23:03.926545 11311 kubelet_node_status.go:94] Unable to register node "devportal-test" with API server: nodes "devportal-test" is forbidden: node "NodeName" is not allowed to modify node "devportal-test"
Feb 26 12:23:05 devportal-test kubelet[11311]: E0226 12:23:05.893160 11311 kubelet.go:2267] node "devportal-test" not found
Feb 26 12:23:05 devportal-test kubelet[11311]: E0226 12:23:05.993770 11311 kubelet.go:2267] node "devportal-test" not found
Feb 26 12:23:06 devportal-test kubelet[11311]: E0226 12:23:06.095640 11311 kubelet.go:2267] node "devportal-test" not found
Feb 26 12:23:06 devportal-test kubelet[11311]: E0226 12:23:06.147085 11311 controller.go:135] failed to ensure node lease exists, will retry in 7s, error: leases.coordination.k8s.io "devportal-test" is forbidden: User "system:node:NodeName" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "kube-node-lease": can only access node lease with the same name as the requesting node
# kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-6d85fdfbd8-dkxjx 1/1 Terminating 8 213d
calico-kube-controllers-6d85fdfbd8-jsxjd 0/1 Pending 0 28d
calico-node-v5w2w 1/1 Running 8 213d
coredns-5644d7b6d9-g8rnl 1/1 Terminating 16 213d
coredns-5644d7b6d9-vgzg2 0/1 Pending 0 28d
coredns-5644d7b6d9-z8dzw 1/1 Terminating 16 213d
coredns-5644d7b6d9-zmcjr 0/1 Pending 0 28d
etcd-NodeName 1/1 Running 34 213d
kube-apiserver-NodeName 1/1 Running 85 213d
kube-controller-manager-NodeName 1/1 Running 790 213d
kube-proxy-gd5jx 1/1 Running 9 213d
kube-scheduler-NodeName 1/1 Running 800 213d
local-path-provisioner-56db8cbdb5-gqgqr 1/1 Running 3 44d
# kubectl version
Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.2", GitCommit:"c97fe5036ef3df2967d086711e6c0c405941e14b", GitTreeState:"clean", BuildDate:"2019-10-15T19:18:23Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.2", GitCommit:"c97fe5036ef3df2967d086711e6c0c405941e14b", GitTreeState:"clean", BuildDate:"2019-10-15T19:09:08Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}
The cluster was initialized with kubeadm, using the pod network CIDR for Calico:
sudo kubeadm init --pod-network-cidr=192.168.0.0/16 --image-repository=someserver
Then I downloaded calico.yaml v3.11 and applied it:
sudo kubectl --kubeconfig="/etc/kubernetes/admin.conf" apply -f calico.yaml
Right after, I checked the node status:
sudo kubectl --kubeconfig="/etc/kubernetes/admin.conf" get nodes
NAME STATUS ROLES AGE VERSION
master-1 NotReady master 7m21s v1.17.2
On describe, I get "cni config uninitialized", but I thought Calico should have taken care of that?
MemoryPressure False Fri, 21 Feb 2020 10:14:24 +0100 Fri, 21 Feb 2020 10:09:00 +0100 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Fri, 21 Feb 2020 10:14:24 +0100 Fri, 21 Feb 2020 10:09:00 +0100 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Fri, 21 Feb 2020 10:14:24 +0100 Fri, 21 Feb 2020 10:09:00 +0100 KubeletHasSufficientPID kubelet has sufficient PID available
Ready False Fri, 21 Feb 2020 10:14:24 +0100 Fri, 21 Feb 2020 10:09:00 +0100 KubeletNotReady runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
In fact I have nothing under /etc/cni/net.d/, so it seems something was missed?
ll /etc/cni/net.d/
total 0
sudo kubectl --kubeconfig="/etc/kubernetes/admin.conf" -n kube-system get pods
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-5644fb7cf6-f7lqq 0/1 Pending 0 3h
calico-node-f4xzh 0/1 Init:ImagePullBackOff 0 3h
coredns-7fb8cdf968-bbqbz 0/1 Pending 0 3h24m
coredns-7fb8cdf968-vdnzx 0/1 Pending 0 3h24m
etcd-master-1 1/1 Running 0 3h24m
kube-apiserver-master-1 1/1 Running 0 3h24m
kube-controller-manager-master-1 1/1 Running 0 3h24m
kube-proxy-9m879 1/1 Running 0 3h24m
kube-scheduler-master-1 1/1 Running 0 3h24m
As explained, I'm pulling from a local repo, and journalctl says:
kubelet[21935]: E0225 14:30:54.830683 21935 pod_workers.go:191] Error syncing pod cec2f72b-844a-4d6b-8606-3aff06d4a36d ("calico-node-f4xzh_kube-system(cec2f72b-844a-4d6b-8606-3aff06d4a36d)"), skipping: failed to "StartContainer" for "upgrade-ipam" with ErrImagePull: "rpc error: code = Unknown desc = Error response from daemon: Get https://repo:10000/v2/calico/cni/manifests/v3.11.2: no basic auth credentials"
kubelet[21935]: E0225 14:30:56.008989 21935 kubelet.go:2183] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
It feels like CNI is not the only issue.
The CoreDNS pods will stay Pending and the master will stay NotReady until the calico pods are running successfully and CNI is set up properly.
It seems to be a network issue downloading the calico docker images from docker.io. You can pull the calico images from docker.io, push them to your internal container registry, modify the image references in calico.yaml to point to that registry, and finally apply the modified calico.yaml to the Kubernetes cluster.
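The "modify the calico yaml" step can be scripted. The sketch below is a plain-text rewrite, not a YAML parser: it assumes each image reference sits on its own unquoted `image:` line, and the registry name `private-repo` is simply the one used elsewhere in this thread. The function name `rewrite_images` is hypothetical.

```python
import re

# Matches lines such as "          image: calico/cni:v3.11.2"
# or "        - image: calico/node:v3.11.2".
IMAGE_LINE = re.compile(r"^(\s*(?:-\s+)?image:\s*)(\S+)\s*$")


def rewrite_images(manifest_text, registry):
    """Prefix every `image:` reference in a manifest with `registry`.

    A leading "docker.io/" is dropped first, and references that already
    point at `registry` are left unchanged.
    """
    registry = registry.rstrip("/")
    out = []
    for line in manifest_text.splitlines():
        match = IMAGE_LINE.match(line)
        if match:
            prefix, image = match.groups()
            if image.startswith("docker.io/"):
                image = image[len("docker.io/"):]
            if not image.startswith(registry + "/"):
                image = registry + "/" + image
            line = prefix + image
        out.append(line)
    result = "\n".join(out)
    # Preserve a trailing newline if the input had one.
    if manifest_text.endswith("\n"):
        result += "\n"
    return result
```

For example, `rewrite_images(open("calico.yaml").read(), "private-repo")` returns the manifest text with every image pointing at the private registry; review the result before applying it.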
So the issue with Init:ImagePullBackOff was that the images could not be pulled from my private repo automatically. I had to pull all the calico images with docker myself. Then I deleted the calico pod, and it recreated itself with the newly pushed image:
sudo docker pull private-repo/calico/pod2daemon-flexvol:v3.11.2
sudo docker pull private-repo/calico/node:v3.11.2
sudo docker pull private-repo/calico/cni:v3.11.2
sudo docker pull private-repo/calico/kube-controllers:v3.11.2
sudo kubectl -n kube-system delete po/calico-node-y7g5
After that the node redid the whole init phase, and:
sudo kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-5644fb7cf6-qkf47 1/1 Running 0 11s
calico-node-mkcsr 1/1 Running 0 21m
coredns-7fb8cdf968-bgqvj 1/1 Running 0 37m
coredns-7fb8cdf968-v85jx 1/1 Running 0 37m
etcd-lin-1k8w1dv-vmh 1/1 Running 0 38m
kube-apiserver-lin-1k8w1dv-vmh 1/1 Running 0 38m
kube-controller-manager-lin-1k8w1dv-vmh 1/1 Running 0 38m
kube-proxy-9hkns 1/1 Running 0 37m
kube-scheduler-lin-1k8w1dv-vmh 1/1 Running 0 38m
I am trying to set up a basic k8s cluster.
After doing a kubeadm init --pod-network-cidr=10.244.0.0/16, the coredns pods are stuck in ContainerCreating status:
NAME READY STATUS RESTARTS AGE
coredns-6955765f44-2cnhj 0/1 ContainerCreating 0 43h
coredns-6955765f44-dnphb 0/1 ContainerCreating 0 43h
etcd-perf1 1/1 Running 0 43h
kube-apiserver-perf1 1/1 Running 0 43h
kube-controller-manager-perf1 1/1 Running 0 43h
kube-flannel-ds-amd64-smpbk 1/1 Running 0 43h
kube-proxy-6zgvn 1/1 Running 0 43h
kube-scheduler-perf1 1/1 Running 0 43h
OS-IMAGE: Ubuntu 16.04.6 LTS
KERNEL-VERSION: 4.4.0-142-generic
CONTAINER-RUNTIME: docker://19.3.5
Errors from journalctl -xeu kubelet command
Jan 02 10:31:44 perf1 kubelet[11901]: 2020-01-02 10:31:44.112 [INFO][10207] k8s.go 228: Using Calico IPAM
Jan 02 10:31:44 perf1 kubelet[11901]: E0102 10:31:44.118281 11901 cni.go:385] Error deleting kube-system_coredns-6955765f44-2cnhj/12cd9435dc905c026bbdb4a1954fc36c82ede1d703b040a3052ab3370445abbf from
Jan 02 10:31:44 perf1 kubelet[11901]: E0102 10:31:44.118828 11901 remote_runtime.go:128] StopPodSandbox "12cd9435dc905c026bbdb4a1954fc36c82ede1d703b040a3052ab3370445abbf" from runtime service failed:
Jan 02 10:31:44 perf1 kubelet[11901]: E0102 10:31:44.118872 11901 kuberuntime_manager.go:898] Failed to stop sandbox {"docker" "12cd9435dc905c026bbdb4a1954fc36c82ede1d703b040a3052ab3370445abbf"}
Jan 02 10:31:44 perf1 kubelet[11901]: E0102 10:31:44.118917 11901 kuberuntime_manager.go:676] killPodWithSyncResult failed: failed to "KillPodSandbox" for "e44bc42f-0b8d-40ad-82a9-334a1b1c8e40" with
Jan 02 10:31:44 perf1 kubelet[11901]: E0102 10:31:44.118939 11901 pod_workers.go:191] Error syncing pod e44bc42f-0b8d-40ad-82a9-334a1b1c8e40 ("coredns-6955765f44-2cnhj_kube-system(e44bc42f-0b8d-40ad-
Jan 02 10:31:47 perf1 kubelet[11901]: W0102 10:31:47.081709 11901 cni.go:331] CNI failed to retrieve network namespace path: cannot find network namespace for the terminated container "747c3cc9455a7d
Jan 02 10:31:47 perf1 kubelet[11901]: 2020-01-02 10:31:47.113 [INFO][10267] k8s.go 228: Using Calico IPAM
Jan 02 10:31:47 perf1 kubelet[11901]: E0102 10:31:47.118526 11901 cni.go:385] Error deleting kube-system_coredns-6955765f44-dnphb/747c3cc9455a7db202ab14576d15509d8ef6967c6349e9acbeff2207914d3d53 from
Jan 02 10:31:47 perf1 kubelet[11901]: E0102 10:31:47.119017 11901 remote_runtime.go:128] StopPodSandbox "747c3cc9455a7db202ab14576d15509d8ef6967c6349e9acbeff2207914d3d53" from runtime service failed:
Jan 02 10:31:47 perf1 kubelet[11901]: E0102 10:31:47.119052 11901 kuberuntime_manager.go:898] Failed to stop sandbox {"docker" "747c3cc9455a7db202ab14576d15509d8ef6967c6349e9acbeff2207914d3d53"}
Jan 02 10:31:47 perf1 kubelet[11901]: E0102 10:31:47.119098 11901 kuberuntime_manager.go:676] killPodWithSyncResult failed: failed to "KillPodSandbox" for "52ffb25e-06c7-4cc6-be70-540049a6be20" with
Jan 02 10:31:47 perf1 kubelet[11901]: E0102 10:31:47.119119 11901 pod_workers.go:191] Error syncing pod 52ffb25e-06c7-4cc6-be70-540049a6be20 ("coredns-6955765f44-dnphb_kube-system(52ffb25e-06c7-4cc6-
I have tried kubeadm reset as well, but no luck so far.
It looks like the issue was caused by my switching from the calico to the flannel CNI. Following the steps mentioned here resolved the issue for me:
Pods failed to start after switch cni plugin from flannel to calico and then flannel
Additionally, you may have to clear the contents of /etc/cni/net.d.
CoreDNS will not start up before a CNI network is installed.
For flannel to work correctly, you must pass --pod-network-cidr=10.244.0.0/16 to kubeadm init.
Set /proc/sys/net/bridge/bridge-nf-call-iptables to 1 by running sysctl net.bridge.bridge-nf-call-iptables=1 to pass bridged IPv4 traffic to iptables’ chains. This is a requirement for some CNI plugins to work.
Make sure that your firewall rules allow traffic on UDP ports 8285 and 8472 for all hosts participating in the overlay network.
Note that flannel works on amd64, arm, arm64, ppc64le and s390x under Linux. Windows (amd64) is claimed as supported in v0.11.0, but the usage is undocumented.
To deploy flannel as the CNI network:
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/2140ac876ef134e0ed5af15c65e414cf26827915/Documentation/kube-flannel.yml
After you have deployed flannel, delete the CoreDNS pods; Kubernetes will recreate them.
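The bridge-nf-call-iptables requirement in the list above can be verified before deploying flannel. This is a minimal sketch that reads the /proc path named in that note; the function name `bridge_nf_enabled` is made up here, and a missing file is reported separately because the path may be absent (for example if the br_netfilter kernel module is not loaded).

```python
from pathlib import Path

# Path named in the note above.
BRIDGE_NF_PATH = "/proc/sys/net/bridge/bridge-nf-call-iptables"


def bridge_nf_enabled(path=BRIDGE_NF_PATH):
    """Return True if the file holds "1", False for any other value,
    and None if the file does not exist."""
    p = Path(path)
    if not p.exists():
        return None
    return p.read_text().strip() == "1"


if __name__ == "__main__":
    state = bridge_nf_enabled()
    if state is None:
        print("setting not present on this host")
    elif state:
        print("bridged IPv4 traffic is passed to iptables")
    else:
        print("run: sysctl net.bridge.bridge-nf-call-iptables=1")
```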
You have deployed flannel as the CNI, but the kubelet logs show that Kubernetes is using calico:
[INFO][10207] k8s.go 228: Using Calico IPAM
Something is wrong with the container network; without it, CoreDNS cannot start.
You might have to reinstall with the correct CNI. Once the CNI is deployed successfully, CoreDNS gets deployed automatically.
So here is my solution:
First, CoreDNS will run on your master / control-plane nodes.
Now run ifconfig and check these two interfaces: cni0 and flannel.1.
If, for example, cni0=10.244.1.1 and flannel.1=10.244.0.0, then your DNS pods will not be created.
It should be cni0=10.244.0.1 and flannel.1=10.244.0.0, which means cni0 must be in the same /24 as flannel.1.
Run the following two commands to bring the interface down and remove it from your master / control-plane machines:
sudo ifconfig cni0 down;
sudo ip link delete cni0;
Now check via ifconfig: you should see two more vethxxxxxxxx interfaces appear. This should fix your problem.
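The subnet rule from this answer (cni0 has to sit in the same /24 as flannel.1) can be checked with Python's standard `ipaddress` module. This is a sketch using the example addresses given above; the function name `cni0_matches_flannel` is hypothetical, and the /24 default follows the answer's description rather than any flannel setting read from the cluster.

```python
import ipaddress


def cni0_matches_flannel(cni0_addr, flannel_addr, prefix_len=24):
    """Return True if cni0's address lies in the network of length
    `prefix_len` that contains flannel.1's address."""
    flannel_net = ipaddress.ip_network(f"{flannel_addr}/{prefix_len}", strict=False)
    return ipaddress.ip_address(cni0_addr) in flannel_net


if __name__ == "__main__":
    # Addresses from the example above.
    print(cni0_matches_flannel("10.244.1.1", "10.244.0.0"))  # False
    print(cni0_matches_flannel("10.244.0.1", "10.244.0.0"))  # True
```

A False result corresponds to the broken case described above, where deleting cni0 lets it be recreated with a matching address.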
I am running a K8s master (Ubuntu 16.04) and a node (Ubuntu 16.04) on Hyper-V VMs. I am not able to join the node, and the coredns pods are not ready.
On k8s Worker Node:
admin1@POC-k8s-node1:~$ sudo kubeadm join 192.168.137.2:6443 --token s03usq.lrz343lolmrz00lf --discovery-token-ca-cert-hash sha256:5c6b88a78e7b303debda447fa6f7fb48e3746bedc07dc2a518fbc80d48f37ba4 --ignore-preflight-errors=all
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 19.03.5. Latest validated version: 18.09
[WARNING Port-10250]: Port 10250 is in use
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.16" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[kubelet-check] Initial timeout of 40s passed.
error execution phase kubelet-start: error uploading crisocket: timed out waiting for the condition
To see the stack trace of this error execute with --v=5 or higher
admin1@POC-k8s-node1:~$ journalctl -u kubelet -f
Nov 21 05:28:15 POC-k8s-node1 kubelet[55491]: E1121 05:28:15.784713 55491 reflector.go:123] k8s.io/kubernetes/pkg/kubelet/kubelet.go:459: Failed to list *v1.Node: Unauthorized
Nov 21 05:28:15 POC-k8s-node1 kubelet[55491]: E1121 05:28:15.827982 55491 kubelet.go:2267] node "poc-k8s-node1" not found
Nov 21 05:28:15 POC-k8s-node1 kubelet[55491]: E1121 05:28:15.928413 55491 kubelet.go:2267] node "poc-k8s-node1" not found
Nov 21 05:28:15 POC-k8s-node1 kubelet[55491]: E1121 05:28:15.988489 55491 reflector.go:123] k8s.io/client-go/informers/factory.go:134: Failed to list *v1beta1.RuntimeClass: Unauthorized
Nov 21 05:28:16 POC-k8s-node1 kubelet[55491]: E1121 05:28:16.029295 55491 kubelet.go:2267] node "poc-k8s-node1" not found
Nov 21 05:28:16 POC-k8s-node1 kubelet[55491]: E1121 05:28:16.129571 55491 kubelet.go:2267] node "poc-k8s-node1" not found
Nov 21 05:28:16 POC-k8s-node1 kubelet[55491]: E1121 05:28:16.187178 55491 reflector.go:123] k8s.io/client-go/informers/factory.go:134: Failed to list *v1beta1.CSIDriver: Unauthorized
Nov 21 05:28:16 POC-k8s-node1 kubelet[55491]: E1121 05:28:16.230227 55491 kubelet.go:2267] node "poc-k8s-node1" not found
Nov 21 05:28:16 POC-k8s-node1 kubelet[55491]: E1121 05:28:16.330777 55491 kubelet.go:2267] node "poc-k8s-node1" not found
Nov 21 05:28:16 POC-k8s-node1 kubelet[55491]: E1121 05:28:16.386758 55491 reflector.go:123] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:46: Failed to list *v1.Pod: Unauthorized
Nov 21 05:28:16 POC-k8s-node1 kubelet[55491]: E1121 05:28:16.431420 55491 kubelet.go:2267] node "poc-k8s-node1" not found
root@POC-k8s-node1:/home/admin1# journalctl -xe -f
Nov 21 06:30:45 POC-k8s-node1 kubelet[75467]: E1121 06:30:45.670520 75467 reflector.go:123] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:46: Failed to list *v1.Pod: Unauthorized
Nov 21 06:30:45 POC-k8s-node1 kubelet[75467]: E1121 06:30:45.691050 75467 kubelet.go:2267] node "poc-k8s-node1" not found
Nov 21 06:30:45 POC-k8s-node1 kubelet[75467]: E1121 06:30:45.791249 75467 kubelet.go:2267] node "poc-k8s-node1" not found
Nov 21 06:30:45 POC-k8s-node1 kubelet[75467]: E1121 06:30:45.866004
On K8s Master :
root@POC-k8s-master:~# kubeadm config images pull
[config/images] Pulled k8s.gcr.io/kube-apiserver:v1.16.3
[config/images] Pulled k8s.gcr.io/kube-controller-manager:v1.16.3
[config/images] Pulled k8s.gcr.io/kube-scheduler:v1.16.3
[config/images] Pulled k8s.gcr.io/kube-proxy:v1.16.3
[config/images] Pulled k8s.gcr.io/pause:3.1
[config/images] Pulled k8s.gcr.io/etcd:3.3.15-0
[config/images] Pulled k8s.gcr.io/coredns:1.6.2
root@POC-k8s-master:~# export KUBECONFIG=/etc/kubernetes/admin.conf
root@POC-k8s-master:~# sysctl net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-iptables = 1
root@POC-k8s-master:~# kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-5644d7b6d9-7xk42 0/1 Pending 0 91s
kube-system coredns-5644d7b6d9-mbrlx 0/1 Pending 0 91s
kube-system etcd-poc-k8s-master 1/1 Running 0 51s
kube-system kube-apiserver-poc-k8s-master 1/1 Running 0 32s
kube-system kube-controller-manager-poc-k8s-master 1/1 Running 0 47s
kube-system kube-proxy-xqb2d 1/1 Running 0 91s
kube-system kube-scheduler-poc-k8s-master 1/1 Running 0 38s
root@POC-k8s-master:~# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/c5d10c8/Documentation/kube-flannel.yml
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
unable to recognize "https://raw.githubusercontent.com/coreos/flannel/c5d10c8/Documentation/kube-flannel.yml": no matches for kind "DaemonSet" in version "extensions/v1beta1"
unable to recognize "https://raw.githubusercontent.com/coreos/flannel/c5d10c8/Documentation/kube-flannel.yml": no matches for kind "DaemonSet" in version "extensions/v1beta1"
unable to recognize "https://raw.githubusercontent.com/coreos/flannel/c5d10c8/Documentation/kube-flannel.yml": no matches for kind "DaemonSet" in version "extensions/v1beta1"
unable to recognize "https://raw.githubusercontent.com/coreos/flannel/c5d10c8/Documentation/kube-flannel.yml": no matches for kind "DaemonSet" in version "extensions/v1beta1"
unable to recognize "https://raw.githubusercontent.com/coreos/flannel/c5d10c8/Documentation/kube-flannel.yml": no matches for kind "DaemonSet" in version "extensions/v1beta1"
It seems you're using k8s version 1.16, where the DaemonSet API group changed to apps/v1 and extensions/v1beta1 is no longer served for it.
Update the link to this:
https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
There is also an open issue about this:
https://github.com/kubernetes/website/issues/16441
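Before applying a manifest, it can be scanned for the API version that the error message complains about. The sketch below is a plain-text scan rather than a YAML parser: it only looks at unindented, unquoted `apiVersion:` and `kind:` lines in each `---`-separated document, and the function name `count_documents` is made up for illustration.

```python
import re

DOC_SEPARATOR = re.compile(r"^---[ \t]*$", re.MULTILINE)
TOP_LEVEL_FIELD = re.compile(r"^(apiVersion|kind):[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def count_documents(manifest_text, kind, api_version):
    """Count documents in a multi-document manifest whose top-level
    `kind` and `apiVersion` equal the given values."""
    count = 0
    for doc in DOC_SEPARATOR.split(manifest_text):
        # Indented (nested) fields do not match because of the ^ anchor.
        fields = dict(TOP_LEVEL_FIELD.findall(doc))
        if fields.get("kind") == kind and fields.get("apiVersion") == api_version:
            count += 1
    return count
```

A non-zero result for `count_documents(text, "DaemonSet", "extensions/v1beta1")` means the manifest predates the change and will fail on 1.16 in the way shown in the question.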
I resolved the first part of the question by running "kubeadm reset" on the node, after which the join command worked. The second part of the question was resolved first, which made it possible to resolve the rest, so @Alireza David, thanks a lot.
I've been working with a 6 node cluster for the last few weeks without issue. Earlier today we ran into an open file issue (https://github.com/kubernetes/kubernetes/pull/12443/files) and I patched and restarted kube-proxy.
Since then, all rc-deployed pods on ALL BUT node-01 get stuck in the Pending state, and there are no log messages stating the cause.
Looking at the docker daemon on the nodes, the containers in the pod are actually running and a delete of the rc removes them. It appears to be some sort of callback issue between the state according to kubelet and the kube-apiserver.
Cluster is running v1.0.3
Here's an example of the state
docker run --rm -it lachie83/kubectl:prod get pods --namespace=kube-system -o wide
NAME READY STATUS RESTARTS AGE NODE
kube-dns-v8-i0yac 0/4 Pending 0 4s 10.1.1.35
kube-dns-v8-jti2e 0/4 Pending 0 4s 10.1.1.34
get events
Wed, 16 Sep 2015 06:25:42 +0000 Wed, 16 Sep 2015 06:25:42 +0000 1 kube-dns-v8 ReplicationController successfulCreate {replication-controller } Created pod: kube-dns-v8-i0yac
Wed, 16 Sep 2015 06:25:42 +0000 Wed, 16 Sep 2015 06:25:42 +0000 1 kube-dns-v8-i0yac Pod scheduled {scheduler } Successfully assigned kube-dns-v8-i0yac to 10.1.1.35
Wed, 16 Sep 2015 06:25:42 +0000 Wed, 16 Sep 2015 06:25:42 +0000 1 kube-dns-v8-jti2e Pod scheduled {scheduler } Successfully assigned kube-dns-v8-jti2e to 10.1.1.34
Wed, 16 Sep 2015 06:25:42 +0000 Wed, 16 Sep 2015 06:25:42 +0000 1 kube-dns-v8 ReplicationController successfulCreate {replication-controller } Created pod: kube-dns-v8-jti2e
scheduler log
I0916 06:25:42.897814 10076 event.go:203] Event(api.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"kube-dns-v8-jti2e", UID:"c1cafebe-5c3b-11e5-b3c4-020443b6797d", APIVersion:"v1", ResourceVersion:"670117", FieldPath:""}): reason: 'scheduled' Successfully assigned kube-dns-v8-jti2e to 10.1.1.34
I0916 06:25:42.904195 10076 event.go:203] Event(api.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"kube-dns-v8-i0yac", UID:"c1cafc69-5c3b-11e5-b3c4-020443b6797d", APIVersion:"v1", ResourceVersion:"670118", FieldPath:""}): reason: 'scheduled' Successfully assigned kube-dns-v8-i0yac to 10.1.1.35
tailing kubelet log file during pod create
tail -f kubelet.kube-node-03.root.log.INFO.20150916-060744.10668
I0916 06:25:04.448916 10668 config.go:253] Setting pods for source file : {[] 0 file}
I0916 06:25:24.449253 10668 config.go:253] Setting pods for source file : {[] 0 file}
I0916 06:25:44.449522 10668 config.go:253] Setting pods for source file : {[] 0 file}
I0916 06:26:04.449774 10668 config.go:253] Setting pods for source file : {[] 0 file}
I0916 06:26:24.450400 10668 config.go:253] Setting pods for source file : {[] 0 file}
I0916 06:26:44.450995 10668 config.go:253] Setting pods for source file : {[] 0 file}
I0916 06:27:04.451501 10668 config.go:253] Setting pods for source file : {[] 0 file}
I0916 06:27:24.451910 10668 config.go:253] Setting pods for source file : {[] 0 file}
I0916 06:27:44.452511 10668 config.go:253] Setting pods for source file : {[] 0 file}
kubelet process
root@kube-node-03:/var/log/kubernetes# ps -ef | grep kubelet
root 10668 1 1 06:07 ? 00:00:13 /opt/bin/kubelet --address=10.1.1.34 --port=10250 --hostname_override=10.1.1.34 --api_servers=https://kube-master-01.sj.lithium.com:6443 --logtostderr=false --log_dir=/var/log/kubernetes --cluster_dns=10.1.2.53 --config=/etc/kubelet/conf --cluster_domain=prod-kube-sjc1-1.internal --v=4 --tls-cert-file=/etc/kubelet/certs/kubelet.pem --tls-private-key-file=/etc/kubelet/certs/kubelet-key.pem
node list
docker run --rm -it lachie83/kubectl:prod get nodes
NAME LABELS STATUS
10.1.1.30 kubernetes.io/hostname=10.1.1.30,name=node-1 Ready
10.1.1.32 kubernetes.io/hostname=10.1.1.32,name=node-2 Ready
10.1.1.34 kubernetes.io/hostname=10.1.1.34,name=node-3 Ready
10.1.1.35 kubernetes.io/hostname=10.1.1.35,name=node-4 Ready
10.1.1.42 kubernetes.io/hostname=10.1.1.42,name=node-5 Ready
10.1.1.43 kubernetes.io/hostname=10.1.1.43,name=node-6 Ready
The issue turned out to be an MTU issue between the node and the master. Once that was fixed the problem was resolved.
It looks like you were building your cluster from scratch. Have you run the conformance test against your cluster yet? If not, could you please run it? Detailed information can be found at:
https://github.com/kubernetes/kubernetes/blob/e8009e828c864a46bf2e1d5c7dab8ef413c8bbe5/hack/conformance-test.sh
The conformance test should fail, or at least give us more information about your cluster setup. Please post the test result somewhere so that we can diagnose your problem further.
Most likely the problem is that your kubelet and your kube-apiserver don't agree on the node name here. I also noticed that you are using hostname_override.
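One way to test the node-name theory is to compare the override value on the kubelet command line with the names the API server lists. This is a rough sketch: it accepts both the underscore spelling from the process listing in the question and the hyphenated spelling, and the function names are hypothetical.

```python
import re

# Accept both "--hostname_override" (as in the question) and "--hostname-override".
OVERRIDE_FLAG = re.compile(r"--hostname[-_]override[= ](\S+)")


def hostname_override(kubelet_cmdline):
    """Return the value of the kubelet hostname override flag, or None if absent."""
    match = OVERRIDE_FLAG.search(kubelet_cmdline)
    return match.group(1) if match else None


def override_is_registered(kubelet_cmdline, node_names):
    """Return True if the override value appears among the given node names."""
    name = hostname_override(kubelet_cmdline)
    return name is not None and name in node_names
```

Fed the kubelet command line and the node list posted in the question, the override value 10.1.1.34 does appear among the registered nodes, so this particular check would not flag a mismatch there.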