Kubernetes coredns pods not running on every node

With a new installation of Kubernetes on Ubuntu, with one master and two worker nodes:
root@master1# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master1 Ready master 10h v1.19.3 10.10.10.216 <none> Ubuntu 18.04.5 LTS 4.15.0-122-generic docker://19.3.13
worker1 Ready <none> 10h v1.19.3 10.10.10.211 <none> Ubuntu 18.04.5 LTS 4.15.0-122-generic docker://19.3.13
worker2 Ready <none> 10h v1.19.3 10.10.10.212 <none> Ubuntu 18.04.5 LTS 4.15.0-122-generic docker://19.3.13
I checked whether all the pods in the kube-system namespace are working.
root@master1# kubectl get pods -A -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system coredns-f9fd979d6-cggnh 1/1 Running 0 10h 10.244.0.2 master1 <none> <none>
kube-system coredns-f9fd979d6-tnm7c 1/1 Running 0 10h 10.244.0.3 master1 <none> <none>
kube-system etcd-master1 1/1 Running 0 10h 10.10.10.216 master1 <none> <none>
kube-system kube-apiserver-master1 1/1 Running 0 10h 10.10.10.216 master1 <none> <none>
kube-system kube-controller-manager-master1 1/1 Running 0 10h 10.10.10.216 master1 <none> <none>
kube-system kube-flannel-ds-9ph5c 1/1 Running 0 10h 10.10.10.216 master1 <none> <none>
kube-system kube-flannel-ds-fjkng 1/1 Running 0 10h 10.10.10.212 worker2 <none> <none>
kube-system kube-flannel-ds-rfkqd 1/1 Running 0 9h 10.10.10.211 worker1 <none> <none>
kube-system kube-proxy-j7s2m 1/1 Running 0 10h 10.10.10.216 master1 <none> <none>
kube-system kube-proxy-n7279 1/1 Running 0 10h 10.10.10.212 worker2 <none> <none>
kube-system kube-proxy-vkb66 1/1 Running 0 9h 10.10.10.211 worker1 <none> <none>
kube-system kube-scheduler-master1 1/1 Running 0 10h 10.10.10.216 master1 <none> <none>
I see that coredns runs only on the master, with two pods.
What should I do to replicate coredns across all 3 VMs (master + 2 worker nodes)?
This is the description of the coredns deployment:
root@master1# kubectl describe deployment coredns -n kube-system
Name: coredns
Namespace: kube-system
CreationTimestamp: Wed, 04 Nov 2020 20:32:10 +0000
Labels: k8s-app=kube-dns
Annotations: deployment.kubernetes.io/revision: 1
Selector: k8s-app=kube-dns
Replicas: 2 desired | 2 updated | 2 total | 2 available | 0 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 1 max unavailable, 25% max surge
Pod Template:
Labels: k8s-app=kube-dns
Service Account: coredns
Containers:
coredns:
Image: k8s.gcr.io/coredns:1.7.0
Ports: 53/UDP, 53/TCP, 9153/TCP
Host Ports: 0/UDP, 0/TCP, 0/TCP
Args:
-conf
/etc/coredns/Corefile
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
Readiness: http-get http://:8181/ready delay=0s timeout=1s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/etc/coredns from config-volume (ro)
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: coredns
Optional: false
Priority Class Name: system-cluster-critical
Conditions:
Type Status Reason
---- ------ ------
Available True MinimumReplicasAvailable
Progressing True NewReplicaSetAvailable
OldReplicaSets: <none>
NewReplicaSet: coredns-f9fd979d6 (2/2 replicas created)
Events: <none>
Also, the logs and the status of the deployment:
root@master1# kubectl logs deployment/coredns -n kube-system
Found 2 pods, using pod/coredns-f9fd979d6-cggnh
.:53
[INFO] plugin/reload: Running configuration MD5 = db32ca3650231d74073ff4cf814959a7
CoreDNS-1.7.0
linux/amd64, go1.14.4, f59c03d
root@master1# kubectl get deployment -o wide -n kube-system
NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
coredns 2/2 2 2 10h coredns k8s.gcr.io/coredns:1.7.0 k8s-app=kube-dns

Andre, you can add podAntiAffinity to your coredns definition:
podAntiAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
  - labelSelector:
      matchLabels:
        k8s-app: kube-dns
    topologyKey: kubernetes.io/hostname
This will make the scheduler place your coredns replicas on different nodes.
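This podAntiAffinity block belongs under the deployment's pod template (spec.template.spec.affinity). A minimal sketch of the relevant excerpt, assuming you edit the stock kubeadm coredns deployment in place with kubectl -n kube-system edit deployment coredns (the container spec and the rest of the template are unchanged and omitted here):
spec:
  replicas: 3                 # optional: one replica per node on this 3-VM cluster
  template:
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                k8s-app: kube-dns
            topologyKey: kubernetes.io/hostname
Note that with the default two replicas the rule only spreads coredns across two of the three nodes; if you want a coredns pod on every VM you also need to scale the deployment to three replicas (for example kubectl -n kube-system scale deployment coredns --replicas=3) or, alternatively, run it as a DaemonSet. Deleting the existing coredns pods after the edit lets the scheduler re-place them under the new rule.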

Related

Unable to reach pod service from Kubernetes master node; from worker nodes it is working

I have done a fresh Kubernetes installation in my VM setup. I have two CentOS 8 servers, a master and a worker, both configured with a bridged network. The Kubernetes version is v1.21.9 and the Docker version is 23.0.0. I have deployed a simple hello-world Node.js app as a pod. These are the currently running pods.
The issue is that I can access the pod's service through its node's IP address at http://192.168.1.27:31500/, but I'm unable to access it from the master node (I expected it to work at http://192.168.1.26:31500/). Can someone help me resolve this?
There are no restarts in the Kubernetes network components and, as far as I have checked, there are no errors in the kube-proxy pods.
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
default helloworldnodejsapp-deployment-86966cfcc5-85dgm 1/1 Running 0 17m 10.244.1.2 worker-server27 <none> <none>
kube-flannel kube-flannel-ds-226w7 1/1 Running 0 24m 192.168.1.27 worker-server27 <none> <none>
kube-flannel kube-flannel-ds-4cdhn 1/1 Running 0 63m 192.168.1.26 master-server26 <none> <none>
kube-system coredns-558bd4d5db-ht6sp 1/1 Running 0 63m 10.244.0.3 master-server26 <none> <none>
kube-system coredns-558bd4d5db-wq774 1/1 Running 0 63m 10.244.0.2 master-server26 <none> <none>
kube-system etcd-master-server26 1/1 Running 0 64m 192.168.1.26 master-server26 <none> <none>
kube-system kube-apiserver-master-server26 1/1 Running 0 64m 192.168.1.26 master-server26 <none> <none>
kube-system kube-controller-manager-master-server26 1/1 Running 0 64m 192.168.1.26 master-server26 <none> <none>
kube-system kube-proxy-ftsmp 1/1 Running 0 63m 192.168.1.26 master-server26 <none> <none>
kube-system kube-proxy-xhccg 1/1 Running 0 24m 192.168.1.27 worker-server27 <none> <none>
kube-system kube-scheduler-master-server26 1/1 Running 0 64m 192.168.1.26 master-server26 <none> <none>
Node details
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master-server26 Ready control-plane,master 70m v1.21.9 192.168.1.26 <none> CentOS Stream 8 4.18.0-448.el8.x86_64 docker://23.0.0
worker-server27 Ready <none> 30m v1.21.9 192.168.1.27 <none> CentOS Stream 8 4.18.0-448.el8.x86_64 docker://23.0.0
Configuration of /etc/docker/daemon.json:
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "dns": ["8.8.8.8", "8.8.4.4", "192.168.1.1"]
}
Hello world deployment and service YAML file:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: helloworldnodejsapp-deployment
  labels:
    app: web
spec:
  selector:
    matchLabels:
      app: web
  replicas: 1
  strategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: helloworldnodejsapp
        image: "********:helloworldnodejs"
        ports:
        - containerPort: 8010
      imagePullSecrets:
      - name: regcred
---
apiVersion: v1
kind: Service
metadata:
  name: helloworldnodejsapp-svc
  labels:
    app: web
spec:
  type: NodePort
  selector:
    app: web
  ports:
  - port: 8010
    targetPort: 8010
    nodePort: 31500
From the explanation I got the following details:
Node IP: 192.168.1.27
Master node IP: 192.168.1.26
Port: 31500
And you want to access the app using your master node IP, which is 192.168.1.26. By default you can't access your application directly through the master node IP because the pod is present on your worker node (192.168.1.27); even though you configured a NodePort, it will be bound to the worker node's IP. So you need to expose your application via its ClusterIP if you want to access it using the master node IP. Follow this documentation for more details.

Nginx Ingress Controller not curling localhost on worker nodes

CentOS 7, 3 VMs -- 1 master and 2 workers, Kubernetes 1.26 (kubelet is 1.25.5.0), cri-dockerd, Calico CNI
# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
rxxxx-vm1 Ready control-plane 4h48m v1.25.5 10.253.137.20 <none> CentOS Linux 7 (Core) 3.10.0-1160.80.1.el7.x86_64 docker://20.10.22
rxxxx-vm2 Ready <none> 4h27m v1.25.5 10.253.137.17 <none> CentOS Linux 7 (Core) 3.10.0-1160.80.1.el7.x86_64 docker://20.10.22
rxxxx-vm3 Ready <none> 4h27m v1.25.5 10.253.137.10 <none> CentOS Linux 7 (Core) 3.10.0-1160.80.1.el7.x86_64 docker://20.10.22
NGINX Ingress controller is deployed as a daemonset:
# kubectl get po -A -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
calico-apiserver calico-apiserver-685568b969-b8mfr 1/1 Running 0 4h50m 172.17.0.6 rxxxx-vm1 <none> <none>
calico-apiserver calico-apiserver-685568b969-xrj2h 1/1 Running 0 4h50m 172.17.0.7 rxxxx-vm1 <none> <none>
calico-system calico-kube-controllers-67df98bdc8-2zdnj 1/1 Running 0 4h51m 172.17.0.4 rxxxx-vm1 <none> <none>
calico-system calico-node-498bb 1/1 Running 0 4h51m 10.253.137.20 rxxxx-vm1 <none> <none>
calico-system calico-node-sblv9 1/1 Running 0 4h30m 10.253.137.17 rxxxx-vm2 <none> <none>
calico-system calico-node-zkn28 1/1 Running 0 4h29m 10.253.137.10 rxxxx-vm3 <none> <none>
calico-system calico-typha-76c8f59f87-mq52d 1/1 Running 0 4h29m 10.253.137.10 rxxxx-vm3 <none> <none>
calico-system calico-typha-76c8f59f87-zk6jr 1/1 Running 0 4h51m 10.253.137.20 rxxxx-vm1 <none> <none>
kube-system coredns-787d4945fb-6mq5k 1/1 Running 0 4h51m 172.17.0.3 rxxxx-vm1 <none> <none>
kube-system coredns-787d4945fb-kmqcv 1/1 Running 0 4h51m 172.17.0.2 rxxxx-vm1 <none> <none>
kube-system etcd-rxxxx-vm1 1/1 Running 0 4h51m 10.253.137.20 rxxxx-vm1 <none> <none>
kube-system kube-apiserver-rxxxx-vm1 1/1 Running 0 4h51m 10.253.137.20 rxxxx-vm1 <none> <none>
kube-system kube-controller-manager-rxxxx-vm1 1/1 Running 0 4h51m 10.253.137.20 rxxxx-vm1 <none> <none>
kube-system kube-proxy-g9dbt 1/1 Running 0 4h29m 10.253.137.10 rxxxx-vm3 <none> <none>
kube-system kube-proxy-mnzks 1/1 Running 0 4h30m 10.253.137.17 rxxxx-vm2 <none> <none>
kube-system kube-proxy-n98xb 1/1 Running 0 4h51m 10.253.137.20 rxxxx-vm1 <none> <none>
kube-system kube-scheduler-rxxxx-vm1 1/1 Running 0 4h51m 10.253.137.20 rxxxx-vm1 <none> <none>
nginx-ingress nginx-ingress-2chhn 1/1 Running 0 4h29m 172.17.0.2 rxxxx-vm3 <none> <none>
nginx-ingress nginx-ingress-95h7s 1/1 Running 0 4h30m 172.17.0.2 rxxxx-vm2 <none> <none>
nginx-ingress nginx-ingress-wbxng 1/1 Running 0 4h51m 172.17.0.5 rxxxx-vm1 <none> <none>
play apple-app 1/1 Running 0 4h45m 172.17.0.8 rxxxx-vm1 <none> <none>
play banana-app 1/1 Running 0 4h45m 172.17.0.9 rxxxx-vm1 <none> <none>
tigera-operator tigera-operator-7795f5d79b-hmm5g 1/1 Running 0 4h51m 10.253.137.20 rxxxx-vm1 <none> <none>
Services:
# kubectl get svc -A -o wide
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
calico-apiserver calico-api ClusterIP 10.111.117.42 <none> 443/TCP 6h34m apiserver=true
calico-system calico-kube-controllers-metrics ClusterIP 10.99.121.254 <none> 9094/TCP 6h35m k8s-app=calico-kube-controllers
calico-system calico-typha ClusterIP 10.104.50.90 <none> 5473/TCP 6h35m k8s-app=calico-typha
default kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 6h36m <none>
kube-system kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 6h36m k8s-app=kube-dns
play apple-service ClusterIP 10.98.78.251 <none> 5678/TCP 6h29m app=apple
play banana-service ClusterIP 10.103.87.112 <none> 5678/TCP 6h29m app=banana
Service details:
# kubectl -n play describe svc apple-service
Name: apple-service
Namespace: play
Labels: <none>
Annotations: <none>
Selector: app=apple
Type: ClusterIP
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.98.78.251
IPs: 10.98.78.251
Port: <unset> 5678/TCP
TargetPort: 5678/TCP
Endpoints: 172.17.0.8:5678
Session Affinity: None
Events: <none>
Endpoints:
# kubectl get ep -A
NAMESPACE NAME ENDPOINTS AGE
calico-apiserver calico-api 172.17.0.6:5443,172.17.0.7:5443 6h39m
calico-system calico-kube-controllers-metrics 172.17.0.4:9094 6h39m
calico-system calico-typha 10.253.137.10:5473,10.253.137.20:5473 6h40m
default kubernetes 10.253.137.20:6443 6h40m
kube-system kube-dns 172.17.0.2:53,172.17.0.3:53,172.17.0.2:53 + 3 more... 6h40m
play apple-service 172.17.0.8:5678 6h34m
play banana-service 172.17.0.9:5678 6h34m
Endpoint details:
# kubectl -n play describe ep apple-service
Name: apple-service
Namespace: play
Labels: <none>
Annotations: endpoints.kubernetes.io/last-change-trigger-time: 2023-01-11T20:21:27Z
Subsets:
Addresses: 172.17.0.8
NotReadyAddresses: <none>
Ports:
Name Port Protocol
---- ---- --------
<unset> 5678 TCP
Events: <none>
Ingress resource:
# kubectl get ing -A -o wide
NAMESPACE NAME CLASS HOSTS ADDRESS PORTS AGE
play example-ingress nginx localhost 80 6h30m
Ingress details:
# kubectl -n play describe ing example-ingress
Name: example-ingress
Labels: <none>
Namespace: play
Address:
Ingress Class: nginx
Default backend: <default>
Rules:
Host Path Backends
---- ---- --------
localhost
/apple apple-service:5678 (172.17.0.8:5678)
/banana banana-service:5678 (172.17.0.9:5678)
Annotations: ingress.kubernetes.io/rewrite-target: /
Events: <none>
QUESTION:
While curl -kL http://localhost/apple on the master node returns apple, the same command produces the output below on the worker nodes:
<html>
<head><title>502 Bad Gateway</title></head>
<body>
<center><h1>502 Bad Gateway</h1></center>
<hr><center>nginx/1.23.2</center>
</body>
</html>
My understanding is that since an ingress controller pod is running on every node, localhost should resolve, as it is defined as the host in the ingress resource definition. Is this understanding incorrect? If not, what am I doing wrong?
When I look at the logs of the ingress controller pod on the corresponding node, I see this:
2023/01/12 03:02:05 [error] 68#68: *11 connect() failed (113: No route to host) while connecting to upstream, client: 172.17.0.1, server: localhost, request: "GET /apple HTTP/1.1", upstream: "http://172.17.0.8:5678/apple", host: "localhost"
172.17.0.1 - - [12/Jan/2023:03:02:05 +0000] "GET /apple HTTP/1.1" 502 157 "-" "curl/7.29.0" "-"

Pods on a different node not working in Kubernetes

I hope someone can give me some help with this issue.
I'm testing a containerized microservice on a Kubernetes cluster made up of 2 nodes:
Merry -> master (and worker)
Pippin -> worker
This is my deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: resize
spec:
  selector:
    matchLabels:
      run: resize
  replicas: 1
  template:
    metadata:
      labels:
        run: resize
    spec:
      containers:
      - name: resize
        image: mdndocker/simpleweb
        ports:
        - containerPort: 1337
        resources:
          limits:
            cpu: 200m
          requests:
            cpu: 100m
This is the service:
apiVersion: v1
kind: Service
metadata:
  name: resize
  labels:
    run: resize
spec:
  type: ClusterIP
  ports:
  - port: 8080
    protocol: TCP
    targetPort: 1337
  selector:
    run: resize
I'm using the Calico network plugin.
I scaled the replicas to 0 and then to 8 to have multiple instances of my app on both nodes.
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
locust-77c699c94d-k8ssz 1/1 Running 0 17m 192.168.61.160 pippin <none> <none>
resize-d8cd49f6c-2tk62 1/1 Running 0 64m 192.168.61.158 pippin <none> <none>
resize-d8cd49f6c-6g2f9 1/1 Running 0 64m 192.168.61.155 pippin <none> <none>
resize-d8cd49f6c-7795n 1/1 Running 0 64m 172.17.0.8 merry <none> <none>
resize-d8cd49f6c-jvw49 1/1 Running 0 64m 192.168.61.156 pippin <none> <none>
resize-d8cd49f6c-mml47 1/1 Running 0 64m 192.168.61.157 pippin <none> <none>
resize-d8cd49f6c-qpkpk 1/1 Running 0 64m 172.17.0.6 merry <none> <none>
resize-d8cd49f6c-t4t8z 1/1 Running 0 64m 172.17.0.5 merry <none> <none>
resize-d8cd49f6c-vmpkp 1/1 Running 0 64m 172.17.0.7 merry <none> <none>
I got some pods running on Pippin and others on Merry. Unfortunately the 4 pods scheduled on Merry don't get any load when traffic is generated:
NAME CPU(cores) MEMORY(bytes)
locust-77c699c94d-k8ssz 873m 82Mi
resize-d8cd49f6c-2tk62 71m 104Mi
resize-d8cd49f6c-6g2f9 67m 107Mi
resize-d8cd49f6c-7795n 0m 31Mi
resize-d8cd49f6c-jvw49 78m 104Mi
resize-d8cd49f6c-mml47 73m 105Mi
resize-d8cd49f6c-qpkpk 0m 32Mi
resize-d8cd49f6c-t4t8z 0m 31Mi
resize-d8cd49f6c-vmpkp 0m 31Mi
Do you know why this is happening, and what I can check to solve this issue?
Do you know why the pod IP addresses differ between the nodes even though I used --pod-network-cidr=192.168.0.0/24?
Thanks to anyone who can help me!
The pods which got deployed on the master node "merry" are in Running status, so there is no issue there. For your other query about why the master node has different CIDR values: if you have jq installed, run kubectl get node merry -o json | jq '.spec.podCIDR', which will show the CIDR range used, or you can describe the master node.

Kubernetes nslookup kubernetes.default fails

My Environment:
OS - CentOS-8.2
Kubernetes Version:
Client Version: v1.18.8
Server Version: v1.18.8
I have successfully configured a Kubernetes cluster (one master and one worker), but DNS resolution is currently failing when I check it with the pod below.
apiVersion: v1
kind: Pod
metadata:
  name: dnsutils
  namespace: default
spec:
  containers:
  - name: dnsutils
    image: gcr.io/kubernetes-e2e-test-images/dnsutils:1.3
    command:
    - sleep
    - "3600"
    imagePullPolicy: IfNotPresent
  restartPolicy: Always
# kubectl get pods -o wide --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
default dnsutils 1/1 Running 0 4m38s 10.244.1.20 K8s-Worker-1 <none> <none>
kube-system coredns-66bff467f8-2q4z9 1/1 Running 1 4d14h 10.244.0.5 K8s-Master <none> <none>
kube-system coredns-66bff467f8-ktbd4 1/1 Running 1 4d14h 10.244.0.4 K8s-Master <none> <none>
kube-system etcd-K8s-Master 1/1 Running 1 4d14h 65.66.67.5 K8s-Master <none> <none>
kube-system kube-apiserver-K8s-Master 1/1 Running 1 4d14h 65.66.67.5 K8s-Master <none> <none>
kube-system kube-controller-manager-K8s-Master 1/1 Running 1 4d14h 65.66.67.5 K8s-Master <none> <none>
kube-system kube-flannel-ds-amd64-d6h9c 1/1 Running 61 45h 65.66.67.6 K8s-Worker-1 <none> <none>
kube-system kube-flannel-ds-amd64-tc4qf 1/1 Running 202 4d14h 65.66.67.5 K8s-Master <none> <none>
kube-system kube-proxy-cl9n4 1/1 Running 0 45h 65.66.67.6 K8s-Worker-1 <none> <none>
kube-system kube-proxy-s7jlc 1/1 Running 1 4d14h 65.66.67.5 K8s-Master <none> <none>
kube-system kube-scheduler-K8s-Master 1/1 Running 1 4d14h 65.66.67.5 K8s-Master <none> <none>
# kubectl get pods
NAME READY STATUS RESTARTS AGE
dnsutils 1/1 Running 0 22m
The commands below were executed on the Kubernetes cluster master; nslookup kubernetes.default is failing.
# kubectl exec -i -t dnsutils -- nslookup kubernetes.default
;; connection timed out; no servers could be reached
command terminated with exit code 1
# kubectl exec -ti dnsutils -- cat /etc/resolv.conf
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local company.domain.com
options ndots:5
# kubectl get pods --namespace=kube-system -l k8s-app=kube-dns
NAME READY STATUS RESTARTS AGE
coredns-66bff467f8-2q4z9 1/1 Running 1 4d14h
coredns-66bff467f8-ktbd4 1/1 Running 1 4d14h
# kubectl logs --namespace=kube-system -l k8s-app=kube-dns
.:53
[INFO] plugin/reload: Running configuration MD5 = 4e235fcc3696966e76816bcd9034ebc7
CoreDNS-1.6.7
linux/amd64, go1.13.6, da7f65b
.:53
[INFO] plugin/reload: Running configuration MD5 = 4e235fcc3696966e76816bcd9034ebc7
CoreDNS-1.6.7
linux/amd64, go1.13.6, da7f65b
# kubectl get svc --namespace=kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 4d14h
# kubectl get endpoints kube-dns --namespace=kube-system
NAME ENDPOINTS AGE
kube-dns 10.244.0.4:53,10.244.0.5:53,10.244.0.4:9153 + 3 more... 4d14h
# kubectl describe svc -n kube-system kube-dns
Name: kube-dns
Namespace: kube-system
Labels: k8s-app=kube-dns
kubernetes.io/cluster-service=true
kubernetes.io/name=KubeDNS
Annotations: prometheus.io/port: 9153
prometheus.io/scrape: true
Selector: k8s-app=kube-dns
Type: ClusterIP
IP: 10.96.0.10
Port: dns 53/UDP
TargetPort: 53/UDP
Endpoints: 10.244.0.4:53,10.244.0.5:53
Port: dns-tcp 53/TCP
TargetPort: 53/TCP
Endpoints: 10.244.0.4:53,10.244.0.5:53
Port: metrics 9153/TCP
TargetPort: 9153/TCP
Endpoints: 10.244.0.4:9153,10.244.0.5:9153
Session Affinity: None
Events: <none>
# kubectl describe svc kubernetes
Name: kubernetes
Namespace: default
Labels: component=apiserver
provider=kubernetes
Annotations: <none>
Selector: <none>
Type: ClusterIP
IP: 10.96.0.1
Port: https 443/TCP
TargetPort: 6443/TCP
Endpoints: 65.66.67.5:6443
Session Affinity: None
Events: <none>
Can anyone please help me debug this issue? Thanks.
I uninstalled and re-installed Kubernetes, now on version v1.19.0, and everything is working fine. Thanks.

Rancher: kube-system pods stuck on ContainerCreating

I'm trying to spin up a cluster with one node (a VM), but some kube-system pods are stuck in ContainerCreating:
> kubectl get pods,svc -owide --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
cattle-system pod/cattle-cluster-agent-7db88c6b68-bz5dp 0/1 ContainerCreating 0 7m13s <none> hdn-dev-app66 <none> <none>
cattle-system pod/cattle-node-agent-ccntw 1/1 Running 0 7m13s 10.105.1.76 hdn-dev-app66 <none> <none>
cattle-system pod/kube-api-auth-9kdpw 1/1 Running 0 7m13s 10.105.1.76 hdn-dev-app66 <none> <none>
ingress-nginx pod/default-http-backend-598b7d7dbd-rwvhm 0/1 ContainerCreating 0 7m29s <none> hdn-dev-app66 <none> <none>
ingress-nginx pod/nginx-ingress-controller-62vhq 1/1 Running 0 7m29s 10.105.1.76 hdn-dev-app66 <none> <none>
kube-system pod/coredns-849545576b-w87zr 0/1 ContainerCreating 0 7m39s <none> hdn-dev-app66 <none> <none>
kube-system pod/coredns-autoscaler-5dcd676cbd-pj54d 0/1 ContainerCreating 0 7m38s <none> hdn-dev-app66 <none> <none>
kube-system pod/kube-flannel-d9m6q 2/2 Running 0 7m43s 10.105.1.76 hdn-dev-app66 <none> <none>
kube-system pod/metrics-server-697746ff48-q7cpx 0/1 ContainerCreating 0 7m33s <none> hdn-dev-app66 <none> <none>
kube-system pod/rke-coredns-addon-deploy-job-npjll 0/1 Completed 0 7m40s 10.105.1.76 hdn-dev-app66 <none> <none>
kube-system pod/rke-ingress-controller-deploy-job-b9rs4 0/1 Completed 0 7m30s 10.105.1.76 hdn-dev-app66 <none> <none>
kube-system pod/rke-metrics-addon-deploy-job-5rpbj 0/1 Completed 0 7m35s 10.105.1.76 hdn-dev-app66 <none> <none>
kube-system pod/rke-network-plugin-deploy-job-lvk2q 0/1 Completed 0 7m50s 10.105.1.76 hdn-dev-app66 <none> <none>
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
default service/kubernetes ClusterIP 10.43.0.1 <none> 443/TCP 8m19s <none>
ingress-nginx service/default-http-backend ClusterIP 10.43.144.25 <none> 80/TCP 7m29s app=default-http-backend
kube-system service/kube-dns ClusterIP 10.43.0.10 <none> 53/UDP,53/TCP,9153/TCP 7m39s k8s-app=kube-dns
kube-system service/metrics-server ClusterIP 10.43.251.47 <none> 443/TCP 7m34s k8s-app=metrics-server
When I describe the failing pods, I get this:
Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "345460c8f6399a0cf20956d8ea24d52f5a684ae47c3e8ec247f83d66d56b2baa" network for pod "cattle-cluster-agent-7db88c6b68-bz5dp": networkPlugin cni failed to set up pod "cattle-cluster-agent-7db88c6b68-bz5dp_cattle-system" network: error getting ClusterInformation: connection is unauthorized: clusterinformations.crd.projectcalico.org "default" is forbidden: User "system:node" cannot get resource "clusterinformations" in API group "crd.projectcalico.org" at the cluster scope, failed to clean up sandbox container "345460c8f6399a0cf20956d8ea24d52f5a684ae47c3e8ec247f83d66d56b2baa" network for pod "cattle-cluster-agent-7db88c6b68-bz5dp": networkPlugin cni failed to teardown pod "cattle-cluster-agent-7db88c6b68-bz5dp_cattle-system" network: error getting ClusterInformation: connection is unauthorized: clusterinformations.crd.projectcalico.org "default" is forbidden: User "system:node" cannot get resource "clusterinformations" in API group "crd.projectcalico.org" at the cluster scope]
I tried re-registering that node one more time, but no luck. Any thoughts?
As it says unauthorized, you have to grant RBAC permissions to make it work.
Try adding:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:calico-node
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: calico-node
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:nodes
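Assuming a calico-node ClusterRole already exists in the cluster (the standard Calico manifests create it), save this snippet to a file and apply it with kubectl apply -f <file>; the kubelet will then retry the pod sandbox creation, or you can delete the stuck pods so they are recreated.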
Fixed the problem by following this article from https://rancher.com/docs/rancher/v2.x/en/cluster-admin/cleaning-cluster-nodes/ on how to recycle a broken node.