I created a single-node Kubernetes cluster using minikube and installed Helm on it, but I am getting an error when executing the helm ls and helm install commands. This is the issue I am facing:
"Get http://localhost:8080/api/v1/namespaces/kube-system/configmaps?labelSelector=OWNER%!D(MISSING)TILLER: dial tcp 127.0.0.1:8080: connect: connection refused".
These are the pods running in the kube-system namespace:
ubuntu@openshift:~$ kubectl get po -n kube-system
NAME                                    READY   STATUS    RESTARTS   AGE
default-http-backend-vqbh4              1/1     Running   1          6h
etcd-minikube                           1/1     Running   0          6h
kube-addon-manager-minikube             1/1     Running   4          1d
kube-apiserver-minikube                 1/1     Running   0          6h
kube-controller-manager-minikube        1/1     Running   0          6h
kube-dns-86f4d74b45-xxznk               3/3     Running   15         1d
kube-proxy-j28zs                        1/1     Running   0          6h
kube-scheduler-minikube                 1/1     Running   3          1d
kubernetes-dashboard-5498ccf677-89hrf   1/1     Running   8          1d
nginx-ingress-controller-tjljg          1/1     Running   3          6h
registry-wzwnq                          1/1     Running   1          7h
storage-provisioner                     1/1     Running   8          1d
tiller-deploy-75d848bb9-tmm9b           1/1     Running   0          4h
If you have any idea, please help me. Thanks.
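For reference, this is a minimal sketch of how to verify which cluster the Helm client is talking to; it assumes the default kubeconfig location and the standard minikube context name:

kubectl config current-context          # should print "minikube"
kubectl cluster-info                    # should not point at localhost:8080
export KUBECONFIG=$HOME/.kube/config    # make sure helm reads the same kubeconfig
helm ls --kube-context minikube

An error dialing 127.0.0.1:8080 usually means the client found no kubeconfig at all and fell back to the default local address.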
UPDATE 1:
Some more logs from the API servers:
https://gist.github.com/nvcnvn/47df8798e798637386f6e0777d869d4f
This question is more about debugging methods for our current GKE setup, but solutions are also welcome.
We're using GKE version 1.22.3-gke.1500.
We recently started facing an issue where commands like kubectl logs and kubectl exec don't work, and deleting a namespace takes forever.
Checking some services inside the cluster, it seems that some network operations just randomly fail. For example, metrics-server keeps crashing with these error logs:
message: "pkg/mod/k8s.io/client-go@v0.19.10/tools/cache/reflector.go:156: Failed to watch *v1.Node: failed to list *v1.Node: Get "https://10.97.0.1:443/api/v1/nodes?resourceVersion=387681528": net/http: TLS handshake timeout"
HTTP requests also time out:
unable to fully scrape metrics: unable to fully scrape metrics from node gke-staging-n2d-standard-8-78c35b3a-6h16: unable to fetch metrics from node gke-staging-n2d-standard-8-78c35b3a-6h16: Get "http://10.148.15.217:10255/stats/summary?only_cpu_and_memory=true": context deadline exceeded
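For what it's worth, both failing paths here (kubectl exec/logs and scraping the kubelet on :10255) go over the control-plane-to-node connection, so a hedged first check is the firewall rules GKE created for the cluster, plus the konnectivity-agent logs. The name filter below is an assumption based on the gke-staging node names, and the label selector assumes the usual k8s-app label:

gcloud compute firewall-rules list --filter="name~gke-staging"
kubectl logs -n kube-system -l k8s-app=konnectivity-agent --tail=50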
I also tried restarting (by kubectl delete) most of the pods in this list:
kubectl get pod
NAME                                                  READY   STATUS    RESTARTS        AGE
event-exporter-gke-5479fd58c8-snq26                   2/2     Running   0               4d7h
fluentbit-gke-gbs2g                                   2/2     Running   0               4d7h
fluentbit-gke-knz2p                                   2/2     Running   0               85m
fluentbit-gke-ljw8h                                   2/2     Running   0               30h
gke-metadata-server-dtnvh                             1/1     Running   0               4d7h
gke-metadata-server-f2bqw                             1/1     Running   0               30h
gke-metadata-server-kzcv6                             1/1     Running   0               85m
gke-metrics-agent-4g56c                               1/1     Running   12 (3h6m ago)   4d7h
gke-metrics-agent-hnrll                               1/1     Running   13 (13h ago)    30h
gke-metrics-agent-xdbrw                               1/1     Running   0               85m
konnectivity-agent-87bc84bb7-g9nd6                    1/1     Running   0               2m59s
konnectivity-agent-87bc84bb7-rkhhh                    1/1     Running   0               3m51s
konnectivity-agent-87bc84bb7-x7pk4                    1/1     Running   0               3m50s
konnectivity-agent-autoscaler-698b6d8768-297mh        1/1     Running   0               83m
kube-dns-77d9986bd5-2m8g4                             4/4     Running   0               3h24m
kube-dns-77d9986bd5-z4j62                             4/4     Running   0               3h24m
kube-dns-autoscaler-f4d55555-dmvpq                    1/1     Running   0               83m
kube-proxy-gke-staging-n2d-standard-8-78c35b3a-8299   1/1     Running   0               11s
kube-proxy-gke-staging-n2d-standard-8-78c35b3a-fp5u   1/1     Running   0               11s
kube-proxy-gke-staging-n2d-standard-8-78c35b3a-rkdp   1/1     Running   0               11s
l7-default-backend-7db896cb4-mvptg                    1/1     Running   0               83m
metrics-server-v0.4.4-fd9886cc5-tcscj                 2/2     Running   82              33h
netd-5vpmc                                            1/1     Running   0               30h
netd-bhq64                                            1/1     Running   0               85m
netd-n6jmc                                            1/1     Running   0               4d7h
Some logs from metrics-server:
https://gist.github.com/nvcnvn/b77eb02705385889961aca33f0f841c7
If you cannot use kubectl to get info from your cluster, can you try to access it using the RESTful API?
http://blog.madhukaraphatak.com/understanding-k8s-api-part-2/
Try deleting the metrics-server pods, or get logs from them using the podman or curl commands.
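A minimal sketch of that approach, assuming a pre-1.24 cluster where the default service account still has a token secret (the account also needs RBAC permission to read pods and their logs; the pod name is taken from the list in the question):

APISERVER=$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}')
SECRET=$(kubectl get serviceaccount default -o jsonpath='{.secrets[0].name}')
TOKEN=$(kubectl get secret "$SECRET" -o jsonpath='{.data.token}' | base64 -d)
# list pods in kube-system through the REST API
curl -sk -H "Authorization: Bearer $TOKEN" "$APISERVER/api/v1/namespaces/kube-system/pods"
# fetch the metrics-server logs the same way
curl -sk -H "Authorization: Bearer $TOKEN" "$APISERVER/api/v1/namespaces/kube-system/pods/metrics-server-v0.4.4-fd9886cc5-tcscj/log"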
I am running Kubernetes on bare metal and use the Kubernetes dashboard to manage the cluster. This works fine at first, but after 5-30 minutes, when I try to access the dashboard at:
http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/
I get the following error:
Error: 'dial tcp 10.35.0.19:8443: connect: no route to host'
Trying to reach: 'https://10.35.0.19:8443/'
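For completeness, reachability of that pod IP can be checked from the node hosting the dashboard pod; a short sketch (10.35.0.19 is the pod IP from the error above):

ping -c 3 10.35.0.19
curl -k https://10.35.0.19:8443/
ip route    # check that a route covering the pod network (the weave interface) exists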
All pods in kube-system are up and running if I check them with kubectl get pods -n kube-system:
NAME                                   READY   STATUS    RESTARTS   AGE
coredns-86c58d9df4-87pfc               1/1     Running   0          1m
coredns-86c58d9df4-tflg5               1/1     Running   0          1m
etcd-controller01                      1/1     Running   5          1m
etcd-controller02                      1/1     Running   6          1m
heapster-798ffb9b4-744q4               1/1     Running   0          1m
kube-apiserver-controller01            1/1     Running   1          1m
kube-apiserver-controller02            1/1     Running   3          1m
kube-controller-manager-controller01   1/1     Running   5          1m
kube-controller-manager-controller02   1/1     Running   2          1m
kube-proxy-8qqnq                       1/1     Running   0          1m
kube-proxy-9vgck                       1/1     Running   0          1m
kube-proxy-dht69                       1/1     Running   0          1m
kube-proxy-f7bx8                       1/1     Running   0          1m
kube-proxy-jnxtq                       1/1     Running   0          1m
kube-proxy-l5h7m                       1/1     Running   0          1m
kube-proxy-p9gt5                       1/1     Running   0          1m
kube-proxy-zv4sr                       1/1     Running   0          1m
kube-scheduler-controller01            1/1     Running   3          1m
kube-scheduler-controller02            1/1     Running   4          1m
kubernetes-dashboard-57df4db6b-px8xc   1/1     Running   0          1m
metrics-server-55d46868d4-s9j5v        1/1     Running   0          1m
monitoring-grafana-564f579fd4-fm6lm    1/1     Running   0          1m
monitoring-influxdb-8b7d57f5c-llgz9    1/1     Running   0          1m
weave-net-2b2dm                        2/2     Running   1          1m
weave-net-988rf                        2/2     Running   0          1m
weave-net-hcm5n                        2/2     Running   0          1m
weave-net-kb2gk                        2/2     Running   0          1m
weave-net-ksvbf                        2/2     Running   0          1m
weave-net-q9zlw                        2/2     Running   0          1m
weave-net-t9f6m                        2/2     Running   0          1m
weave-net-vdspp                        2/2     Running   0          1m
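For reference, Weave Net can report its own peer and connection status from inside one of the weave-net pods; a sketch using a pod name from the list above:

kubectl exec -n kube-system weave-net-2b2dm -c weave -- /home/weave/weave --local status
kubectl exec -n kube-system weave-net-2b2dm -c weave -- /home/weave/weave --local status connections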
When I restart all the pods in this namespace with kubectl delete pods --all -n kube-system, the dashboard sometimes works again for 5-30 minutes, and at other times it randomly starts working again by itself. I have tried restarting each pod in this namespace individually to track down which pod is causing the issue, but restarting the pods one by one does not bring the dashboard back up; only deleting them all at once works.
Does anybody have an idea why this happens and how I can fix this?
Thank you in advance!
I have a 2-node Kubernetes cluster with Calico networking. All the pods are up and running:
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system   calico-etcd-94466                          1/1     Running   0          21h
kube-system   calico-kube-controllers-5fdcfdbdf7-xsjxb   1/1     Running   0          14d
kube-system   calico-node-hmnf5                          2/2     Running   0          14d
kube-system   calico-node-vmmmk                          2/2     Running   0          14d
kube-system   coredns-78fcdf6894-dlqg6                   1/1     Running   0          14d
kube-system   coredns-78fcdf6894-zwrd6                   1/1     Running   0          14d
kube-system   etcd-kube-master-01                        1/1     Running   0          14d
kube-system   kube-apiserver-kube-master-01              1/1     Running   0          14d
kube-system   kube-controller-manager-kube-master-01     1/1     Running   0          14d
kube-system   kube-proxy-nxfht                           1/1     Running   0          14d
kube-system   kube-proxy-qnn45                           1/1     Running   0          14d
kube-system   kube-scheduler-kube-master-01              1/1     Running   0          14d
I wanted to query calico-etcd using etcdctl, but I get the following error:
# etcdctl --debug --endpoints "http://10.142.137.11:6666" get calico
start to sync cluster using endpoints(http://10.142.137.11:6666)
cURL Command: curl -X GET http://10.142.137.11:6666/v2/members
got endpoints(http://10.142.137.11:6666) after sync
Cluster-Endpoints: http://10.142.137.11:6666
cURL Command: curl -X GET http://10.142.137.11:6666/v2/keys/calico?quorum=false&recursive=false&sorted=false
Error: 100: Key not found (/calico) [4]
Any pointers on why I get this error?
As @JakubBujny mentioned, ETCDCTL_API=3 should be set to get the appropriate result.
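A minimal sketch of the same query through the v3 API, using the same endpoint as above (Calico stores its data under the /calico prefix):

ETCDCTL_API=3 etcdctl --endpoints "http://10.142.137.11:6666" get /calico --prefix --keys-only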
I'm trying to install openstack-helm on bare metal. While installing the ingress pods, the ingress pod in the openstack namespace comes up successfully, but the ingress pod in the kube-system namespace goes into CrashLoopBackOff. The following is the list of pods:
NAMESPACE     NAME                                             READY   STATUS             RESTARTS   AGE   IP                NODE
kube-system   calico-etcd-lbxpt                                1/1     Running            0          3d    172.17.0.1
kube-system   calico-kube-policy-controllers-64b8674d9-dwq8k   1/1     Running            0          3d    172.17.0.1
kube-system   calico-node-sfq89                                2/2     Running            0          3d    172.17.0.1
kube-system   etcd-megam-1                                     1/1     Running            0          3d    172.17.0.1
kube-system   ingress-error-pages-6bfc8d875-tmqvd              1/1     Running            0          3d    192.168.232.131
kube-system   ingress-rbv77                                    0/1     CrashLoopBackOff   1117       3d    172.17.0.1
kube-system   kube-apiserver-megam-1                           1/1     Running            0          3d    172.17.0.1
kube-system   kube-controller-manager-megam-1                  1/1     Running            0          3d    172.17.0.1
kube-system   kube-dns-85648cfc65-pq44v                        3/3     Running            0          3d    192.168.232.129
kube-system   kube-proxy-n9zts                                 1/1     Running            0          3d    172.17.0.1
kube-system   kube-scheduler-megam-1                           1/1     Running            0          3d    172.17.0.1
kube-system   tiller-deploy-5c9c77f7c-bclsk                    1/1     Running            0          3d    192.168.232.130
openstack     ingress-c7f9b544c-984wk                          1/1     Running            0          3d    192.168.232.132
openstack     ingress-error-pages-57957f47f-25dgr              1/1     Running            0          3d    192.168.232.133
Even the nova pods are not coming up. Can anyone help me fix this?
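A sketch of how to pull the crashing pod's logs and events, using the pod name from the list above:

kubectl -n kube-system logs ingress-rbv77 --previous    # logs from the last crashed container
kubectl -n kube-system describe pod ingress-rbv77       # the Events section shows the crash reason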
v1.8.2, installed by kubeadm.
2 nodes:
NAME                    STATUS   ROLES    AGE   VERSION
192-168-99-102.node     Ready    <none>   8h    v1.8.2
192-168-99-108.master   Ready    master   8h    v1.8.2
I ran nginx to test:
NAME                    READY   STATUS    RESTARTS   AGE   IP            NODE
curl-6896d87888-smvjm   1/1     Running   0          7h    10.244.1.99   192-168-99-102.node
nginx-fbb985966-5jbxd   1/1     Running   0          7h    10.244.1.94   192-168-99-102.node
nginx-fbb985966-8vp9g   1/1     Running   0          8h    10.244.1.93   192-168-99-102.node
nginx-fbb985966-9bqzh   1/1     Running   1          7h    10.244.0.85   192-168-99-108.master
nginx-fbb985966-fd22h   1/1     Running   1          7h    10.244.0.83   192-168-99-108.master
nginx-fbb985966-lmgmf   1/1     Running   0          7h    10.244.1.98   192-168-99-102.node
nginx-fbb985966-lr2rh   1/1     Running   0          7h    10.244.1.96   192-168-99-102.node
nginx-fbb985966-pm2p7   1/1     Running   0          7h    10.244.1.97   192-168-99-102.node
nginx-fbb985966-t6d8b   1/1     Running   0          7h    10.244.1.95   192-168-99-102.node
kubectl exec on a pod on the master works fine,
but when I exec into a pod on the other node, I get an error:
kubectl exec -it nginx-fbb985966-8vp9g bash
error: unable to upgrade connection: pod does not exist
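A common cause of this error on VM-based nodes is the kubelet registering an address the API server cannot reach, so a hedged first check is whether each node's INTERNAL-IP matches an address that is actually routable; the --node-ip value below is an assumption based on the node names:

kubectl get nodes -o wide   # compare INTERNAL-IP with the node's reachable address
# if it is wrong, pin the kubelet to the right interface in the kubeadm systemd
# drop-in and restart the kubelet, e.g. add: --node-ip=192.168.99.102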