Failed to create SubnetManager: asn1: structure error: tags don't match - kubernetes

While this error message is just a symptom, my problem is real.
My bare-metal cluster ran into expired certificates. I managed to renew all of them, but after a restart most pods wouldn't work. The pod that seems responsible is the flannel one (CrashLoopBackOff).
Logs for the flannel pod show:
I1120 22:24:00.541277 1 main.go:475] Determining IP address of default interface
I1120 22:24:00.541546 1 main.go:488] Using interface with name eth0 and address xxx.xxx.xxx.xxx
I1120 22:24:00.541565 1 main.go:505] Defaulting external address to interface address (xxx.xxx.xxx.xxx)
E1120 22:24:03.572745 1 main.go:232] Failed to create SubnetManager: error retrieving pod spec for 'kube-system/kube-flannel-ds-amd64-dmrzh': Get https://10.96.0.1:443/api/v1/namespaces/kube-system/pods/kube-flannel-ds-amd64-dmrzh: dial tcp 10.96.0.1:443: getsockopt: network is unreachable
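The "network is unreachable" part means the kernel has no route at all toward the service IP 10.96.0.1, which fits the missing flannel interface. A quick way to see what the kernel would do with that destination (a sketch; the first, commented command is what you would run on the affected node, the live command below it targets loopback only so the output shape is visible anywhere):

```shell
# On the affected node, this shows which route (if any) would carry
# traffic to the cluster service IP:
#   ip route get 10.96.0.1
# Demo against loopback, which always has a route, to show the output shape:
ip route get 127.0.0.1 | head -n 1
```

If `ip route get 10.96.0.1` errors out on the node, the overlay routes are simply gone and flannel has nothing to fall back on.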
On the host there is no flannel interface anymore, nor a systemd unit file.
Running flanneld manually yields this output:
I1120 20:12:15.923966 26361 main.go:446] Determining IP address of default interface
I1120 20:12:15.924171 26361 main.go:459] Using interface with name eth0 and address xxx.xxx.xxx.xxx
I1120 20:12:15.924187 26361 main.go:476] Defaulting external address to interface address (xxx.xxx.xxx.xxx)
E1120 20:12:15.924285 26361 main.go:223] Failed to create SubnetManager: asn1: structure error: tags don't match (16 vs {class:0 tag:2 length:1 isCompound:false}) {optional:false explicit:false application:false defaultValue:<nil> tag:<nil> stringType:0 timeType:0 set:false omitEmpty:false} tbsCertificate #2
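The asn1: structure error: tags don't match error comes from Go's X.509 parser and usually means flanneld was handed a file that is not a valid certificate at all, e.g. a private key or a truncated/concatenated PEM where a certificate is expected, which is easy to produce during a manual renewal. One way to sanity-check each file, demonstrated here on a throwaway cert (on the node you would point the same commands at /etc/kubernetes/pki/*.crt and the certificates referenced by flannel's kubeconfig; all paths below are illustrative):

```shell
# Throwaway certificate to demonstrate the check; the /tmp paths are
# only for this demo.
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=demo" \
  -keyout /tmp/demo.key -out /tmp/demo.crt -days 1 2>/dev/null

# A valid certificate parses cleanly:
if openssl x509 -in /tmp/demo.crt -noout 2>/dev/null; then
  echo "/tmp/demo.crt: parseable certificate"
else
  echo "/tmp/demo.crt: NOT a parseable certificate"
fi

# Feeding a private key (or a truncated PEM) where a certificate is
# expected reproduces the kind of mismatch that surfaces in Go as an
# ASN.1 structure error:
if openssl x509 -in /tmp/demo.key -noout 2>/dev/null; then
  echo "/tmp/demo.key: parseable certificate"
else
  echo "/tmp/demo.key: NOT a parseable certificate"
fi
```

Any file that fails this parse check but is consumed as a certificate is a candidate for the error above.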
The available pieces of evidence point in several directions, but checking each of them out leads somewhere else, so I need a pointer to which part causes the problem.
Is it etcd?
Is it the new etcd certificate?
Is it the missing flannel interface?
Is it the non-operational flanneld?
Is it something not listed here?
If information is missing here, please ask; I can surely provide it.
Key specs:
- host: ubuntu 18.04
- kubeadm 1.13.2
Thank you and best regards,
scones
UPDATE1
$ k get cs,po,svc
NAME STATUS MESSAGE ERROR
componentstatus/controller-manager Healthy ok
componentstatus/scheduler Healthy ok
componentstatus/etcd-0 Healthy {"health": "true"}
NAME READY STATUS RESTARTS AGE
pod/cert-manager-6dc5c68468-hkb6j 0/1 Error 51 89d
pod/coredns-86c58d9df4-dtdxq 0/1 Completed 23 304d
pod/coredns-86c58d9df4-k7h7m 0/1 Completed 23 304d
pod/etcd-redacted 1/1 Running 2506 304d
pod/hostpath-provisioner-5c6754fbd4-ckvnp 0/1 Error 12 222d
pod/kube-apiserver-redacted 1/1 Running 1907 304d
pod/kube-controller-manager-redacted 1/1 Running 2682 304d
pod/kube-flannel-ds-amd64-dmrzh 0/1 CrashLoopBackOff 338 372d
pod/kube-proxy-q8jgs 1/1 Running 15 304d
pod/kube-scheduler-redacted 1/1 Running 2694 304d
pod/metrics-metrics-server-65cd865c9f-dbh85 0/1 Error 2658 120d
pod/tiller-deploy-865b88d89-8ftzs 0/1 Error 170 305d
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP 372d
service/metrics-metrics-server ClusterIP 10.97.186.19 <none> 443/TCP 120d
service/tiller-deploy ClusterIP 10.103.184.226 <none> 44134/TCP 354d
Unfortunately I don't recall how I installed flannel a year ago.
kubectl is also version 1.13.2, as is the cluster.
The post linked by @hanx is about renewing certificates, not broken network overlays, so it is not applicable.

Related

Mac M1 minikube service tunnel doesn't work

With the newest Docker Desktop and minikube on a Mac M1,
I have tried both NodePort and LoadBalancer with tunnel per minikube's docs: https://minikube.sigs.k8s.io/docs/handbook/accessing/. Neither works; I cannot curl the app or access it from my browser.
Essentially, the port and URL that minikube's tunnel gives me don't work, and I get "can not reach" in the browser.
❯ k get all
NAME READY STATUS RESTARTS AGE
pod/shuffle-depl-5746c5976f-qg7wf 1/1 Running 0 26m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/shuffle-depl LoadBalancer 10.106.74.89 <pending> 5000:31170/TCP 17m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/shuffle-depl 1/1 1 1 26m
NAME DESIRED CURRENT READY AGE
replicaset.apps/shuffle-depl-5746c5976f 1 1 1 26m
However, port-forwarding to the pod and port-forwarding to the LB service both work from here.
But I'm not satisfied with that solution. Is there no good solution for this, and what underneath is causing this issue?

CoreDNS pods stuck in ContainerCreating - Kubernetes

I am still new to Kubernetes and I was trying to set up a cluster on bare-metal servers according to the official documentation.
Right now I am running a one-worker, one-master configuration, but I am struggling to get all the pods running once the cluster initializes. The main problem is the coredns pods, which are stuck in the ContainerCreating state.
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-78fcd69978-4vtsp 0/1 ContainerCreating 0 5s
kube-system coredns-78fcd69978-wtn2c 0/1 ContainerCreating 0 12h
kube-system etcd-dcpoth24213118 1/1 Running 4 12h
kube-system kube-apiserver-dcpoth24213118 1/1 Running 0 12h
kube-system kube-controller-manager-dcpoth24213118 1/1 Running 0 12h
kube-system kube-proxy-8282p 1/1 Running 0 12h
kube-system kube-scheduler-dcpoth24213118 1/1 Running 0 12h
kube-system weave-net-6zz2j 2/2 Running 0 12h
After checking the logs I noticed this error. The problem is I don't really know what the error is referring to.
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 19s default-scheduler Successfully assigned kube-system/coredns-78fcd69978-4vtsp to dcpoth24213118
Warning FailedCreatePodSandBox 13s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "2521c9dd723f3fc50b3510791a8c35cbc9ec19768468eb3da3367274a4dfcbba" network for pod "coredns-78fcd69978-4vtsp": networkPlugin cni failed to set up pod "coredns-78fcd69978-4vtsp_kube-system" network: error getting ClusterInformation: Get "https://[10.43.0.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default": dial tcp 10.43.0.1:443: connect: no route to host, failed to clean up sandbox container "2521c9dd723f3fc50b3510791a8c35cbc9ec19768468eb3da3367274a4dfcbba" network for pod "coredns-78fcd69978-4vtsp": networkPlugin cni failed to teardown pod "coredns-78fcd69978-4vtsp_kube-system" network: error getting ClusterInformation: Get "https://[10.43.0.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default": dial tcp 10.43.0.1:443: connect: no route to host]
Normal SandboxChanged 10s (x2 over 12s) kubelet Pod sandbox changed, it will be killed and re-created.
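One detail stands out in the event above: the CNI plugin is trying to reach crd.projectcalico.org even though weave-net is the running plugin, which often points to a leftover Calico config in the CNI directory, since the kubelet picks the lexicographically first file there. A minimal simulation of how that happens (the real directory is /etc/cni/net.d; the /tmp path, file names, and contents below are made up for the demo):

```shell
# Simulate a CNI config directory with a stale Calico file next to the
# weave one (on a real node this directory is /etc/cni/net.d).
cnidir=/tmp/demo-cni-net.d
mkdir -p "$cnidir"
echo '{"name":"k8s-pod-network","type":"calico"}' > "$cnidir/10-calico.conflist"
echo '{"name":"weave","type":"weave-net"}'        > "$cnidir/10-weave.conflist"

# The lexicographically first file wins -- here the stale Calico one:
ls "$cnidir" | sort | head -n 1
```

If the real directory shows a calico file ahead of the weave one, removing the stale file and restarting the kubelet is worth trying.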
I'm running the Kubernetes cluster behind a corporate proxy. I've set the environment variables as follows:
export https_proxy=http://proxyIP:PORT
export http_proxy=http://proxyIP:PORT
export HTTP_PROXY="${http_proxy}"
export HTTPS_PROXY="${https_proxy}"
export NO_PROXY=localhost,127.0.0.1,master_node_IP,worker_node_IP,10.0.0.0/8,10.96.0.0/16
[root@dcpoth24213118 ~]# kubectl get svc -A
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 12h
kube-system kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 12h
[root@dcpoth24213118 ~]# ip r s
default via 6.48.248.129 dev eth1
6.48.248.128/26 dev eth1 proto kernel scope link src 6.48.248.145
10.32.0.0/12 dev weave proto kernel scope link src 10.32.0.1
10.155.0.0/24 via 6.48.248.129 dev eth1
10.228.0.0/24 via 6.48.248.129 dev eth1
10.229.0.0/24 via 6.48.248.129 dev eth1
10.250.0.0/24 via 6.48.248.129 dev eth1
I've got the weave network plugin installed. The issue is that I cannot create any other pods either; they all get stuck in the ContainerCreating state.
I've run out of ideas how to fix it. Can someone give me a hint?

How do I connect to my local kubernetes CockroachDB with DBeaver?

I have a Minikube Kubernetes cluster running a cockroachdb which looks like:
kubectl get pods
test-cockroachdb-0 1/1 Running 17 95m
test-cockroachdb-1 1/1 Running 190 2d
test-cockroachdb-2 1/1 Running 160 2d
test-cockroachdb-init-m8rzp 0/1 Completed 0 2d
cockroachdb-client-secure 1/1 Running 0 2d
I want to get a connection string that I can use in my application.
To verify my connection string, I am using the tool DBeaver.
My database name is configured to 'defaultdb', which exists on my cluster, along with the user and the relevant password. The port is accurate as well (the default cockroachdb minikube port).
However as to the certificate aspect of connecting I am at a loss. How do I generate/gather the certificates I need to successfully connect to my cluster? How do I connect to my cluster using DBeaver?
Edit:
$ kubectl get all
NAME READY STATUS RESTARTS AGE
pod/myname-cockroachdb-0 1/1 Running 27 156m
pod/myname-cockroachdb-1 1/1 Running 197 2d1h
pod/myname-cockroachdb-2 1/1 Running 167 2d1h
pod/myname-cockroachdb-init-m8rzp 0/1 Completed 0 2d1h
pod/myname-client-secure 1/1 Running 0 2d1h
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/myname-cockroachdb ClusterIP None <none> 26257/TCP,8080/TCP 2d1h
service/myname-cockroachdb-public ClusterIP 10.xxx.xxx.xx <none> 26257/TCP,8080/TCP 2d1h
service/kubernetes ClusterIP 10.xx.0.1 <none> 443/TCP 2d1h
NAME READY AGE
statefulset.apps/myname-cockroachdb 3/3 2d1h
NAME COMPLETIONS DURATION AGE
job.batch/myname-cockroachdb-init 1/1 92s 2d1h
As @FL3SH already said,
You can use kubectl port-forward <pod_name> <port>
This is nicely explained in the CockroachDB documentation, Step 4. Access the Admin UI; please use it as an example and set different ports.
As for the certificates:
As each pod is created, it issues a Certificate Signing Request, or CSR, to have the node's certificate signed by the Kubernetes CA. You must manually check and approve each node's certificates, at which point the CockroachDB node is started in the pod.
Get the name of the Pending CSR for the first pod:
kubectl get csr
NAME AGE REQUESTOR CONDITION
default.node.cockroachdb-0 1m system:serviceaccount:default:default Pending
node-csr-0Xmb4UTVAWMEnUeGbW4KX1oL4XV_LADpkwjrPtQjlZ4 4m kubelet Approved,Issued
node-csr-NiN8oDsLhxn0uwLTWa0RWpMUgJYnwcFxB984mwjjYsY 4m kubelet Approved,Issued
node-csr-aU78SxyU69pDK57aj6txnevr7X-8M3XgX9mTK0Hso6o 5m kubelet Approved,Issued
If you do not see a Pending CSR, wait a minute and try again.
You can inspect the CSR with kubectl describe csr default.node.cockroachdb-0
It might look like this:
Name: default.node.cockroachdb-0
Labels: <none>
Annotations: <none>
CreationTimestamp: Thu, 09 Nov 2017 13:39:37 -0500
Requesting User: system:serviceaccount:default:default
Status: Pending
Subject:
Common Name: node
Serial Number:
Organization: Cockroach
Subject Alternative Names:
DNS Names: localhost
cockroachdb-0.cockroachdb.default.svc.cluster.local
cockroachdb-public
IP Addresses: 127.0.0.1
10.48.1.6
Events: <none>
If it does then you can approve the certificate using:
kubectl certificate approve default.node.cockroachdb-0
Please do follow the Orchestrate CockroachDB in a Single Kubernetes Cluster guide.
Let me know if you need any further help.
You can use kubectl port-forward service/myname-cockroachdb 26257 and in DBeaver just use localhost:26257 as a connection string.
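With that port-forward active, the connection string DBeaver needs should look roughly like the fragment below. This is a sketch, not a verified setup: CockroachDB speaks the PostgreSQL wire protocol, so the PostgreSQL JDBC driver is used, which expects the client key converted to PKCS#8; the certificate paths and the user name `myuser` are placeholders you must replace with your own.

```
jdbc:postgresql://localhost:26257/defaultdb?ssl=true&sslmode=verify-full&sslrootcert=/path/to/ca.crt&sslcert=/path/to/client.myuser.crt&sslkey=/path/to/client.myuser.pk8
```

For an insecure test cluster, dropping the three certificate parameters and using sslmode=disable may be enough.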

Why doesn't kube-proxy route traffic to another worker node?

I've deployed several different services and always get the same error.
The service is reachable on the node port from the machine where the pod is running. On the two other nodes I get timeouts.
The kube-proxy is running on all worker nodes and I can see in the logfiles from kube-proxy that the service port was added and the node port was opened.
In this case I've deployed the stars demo from Calico.
Kube-proxy log output:
Mar 11 10:25:10 kuben1 kube-proxy[659]: I0311 10:25:10.229458 659 service.go:309] Adding new service port "management-ui/management-ui:" at 10.32.0.133:9001/TCP
Mar 11 10:25:10 kuben1 kube-proxy[659]: I0311 10:25:10.257483 659 proxier.go:1427] Opened local port "nodePort for management-ui/management-ui:" (:30002/tcp)
kube-proxy is listening on port 30002:
root@kuben1:/tmp# netstat -lanp | grep 30002
tcp6 0 0 :::30002 :::* LISTEN 659/kube-proxy
There are also some iptable rules defined:
root@kuben1:/tmp# iptables -L -t nat | grep management-ui
KUBE-MARK-MASQ tcp -- anywhere anywhere /* management-ui/management-ui: */ tcp dpt:30002
KUBE-SVC-MIYW5L3VT4JVLCIZ tcp -- anywhere anywhere /* management-ui/management-ui: */ tcp dpt:30002
KUBE-MARK-MASQ tcp -- !10.200.0.0/16 10.32.0.133 /* management-ui/management-ui: cluster IP */ tcp dpt:9001
KUBE-SVC-MIYW5L3VT4JVLCIZ tcp -- anywhere 10.32.0.133 /* management-ui/management-ui: cluster IP */ tcp dpt:9001
The interesting part is that I can reach the service IP from any worker node
root@kubem1:/tmp# kubectl get svc -n management-ui
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
management-ui NodePort 10.32.0.133 <none> 9001:30002/TCP 52m
The service IP/port can be accessed from any worker node if I do a "curl http://10.32.0.133:9001"
I don't understand why kube-proxy does not "route" this properly.
Does anyone have a hint where I can find the error?
Here some cluster specs:
This is a hand-built cluster inspired by Kelsey Hightower's "Kubernetes the hard way" guide.
- 6 nodes (3 master, 3 worker), local VMs
- OS: Ubuntu 18.04
- K8s: v1.13.0
- Docker: 18.9.3
- CNI: Calico
Component status on the master nodes looks okay
root@kubem1:/tmp# kubectl get componentstatus
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-0 Healthy {"health":"true"}
etcd-1 Healthy {"health":"true"}
etcd-2 Healthy {"health":"true"}
The worker nodes look okay, if I trust kubectl:
root@kubem1:/tmp# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
kuben1 Ready <none> 39d v1.13.0 192.168.178.77 <none> Ubuntu 18.04.2 LTS 4.15.0-46-generic docker://18.9.3
kuben2 Ready <none> 39d v1.13.0 192.168.178.78 <none> Ubuntu 18.04.2 LTS 4.15.0-46-generic docker://18.9.3
kuben3 Ready <none> 39d v1.13.0 192.168.178.79 <none> Ubuntu 18.04.2 LTS 4.15.0-46-generic docker://18.9.3
As asked by P Ekambaram:
root@kubem1:/tmp# kubectl get po -n kube-system
NAME READY STATUS RESTARTS AGE
calico-node-bgjdg 1/1 Running 5 40d
calico-node-nwkqw 1/1 Running 5 40d
calico-node-vrwn4 1/1 Running 5 40d
coredns-69cbb76ff8-fpssw 1/1 Running 5 40d
coredns-69cbb76ff8-tm6r8 1/1 Running 5 40d
kubernetes-dashboard-57df4db6b-2xrmb 1/1 Running 5 40d
I've found a solution for my problem.
This behavior was caused by a change in Docker v1.13.x, and the issue was fixed in Kubernetes with version 1.8.
The easy solution was to change the forward rules via iptables.
Run the following command on all worker nodes: "iptables -A FORWARD -j ACCEPT"
To fix it the right way I had to tell kube-proxy the CIDR for the pods.
Theoretically that could be solved in two ways:
- Add "--cluster-cidr=10.0.0.0/16" as an argument to the kube-proxy command line (in my case in the systemd service file)
- Add 'clusterCIDR: "10.0.0.0/16"' to the configuration file for kube-proxy
In my case the command-line argument didn't have any effect.
After I added the line to the configuration file and restarted kube-proxy on all worker nodes, everything worked well.
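The clusterCIDR option can be sketched as a fragment of the file kube-proxy loads via its --config flag. This is a sketch, not a full config: the apiVersion below matches kube-proxy v1.13, and the CIDR is the pod network from this answer, which you must adjust to your own cluster.

```yaml
# Fragment of the kube-proxy configuration file (passed via --config).
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
clusterCIDR: "10.0.0.0/16"   # this cluster's pod network; adjust to yours
```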
Here is the github merge request for this "FORWARD" issue: link

ipvsadm not showing any entry in kubeadm cluster

I have installed kubeadm and created a service and pods:
packet@test:~$ kubectl get pod
NAME READY STATUS RESTARTS AGE
udp-server-deployment-6f87f5c9-466ft 1/1 Running 0 5m
udp-server-deployment-6f87f5c9-5j9rt 1/1 Running 0 5m
udp-server-deployment-6f87f5c9-g9wrr 1/1 Running 0 5m
udp-server-deployment-6f87f5c9-ntbkc 1/1 Running 0 5m
udp-server-deployment-6f87f5c9-xlbjq 1/1 Running 0 5m
packet@test:~$ kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 1h
udp-server-service NodePort 10.102.67.0 <none> 10001:30001/UDP 6m
but still I am not able to access the udp-server pod:
packet@test:~$ curl http://192.168.43.161:30001
curl: (7) Failed to connect to 192.168.43.161 port 30001: Connection refused
While debugging I could see that kube-proxy is running, but there is no entry in IPVS:
root@test:~# ps auxw | grep kube-proxy
root 4050 0.5 0.7 44340 29952 ? Ssl 14:33 0:25 /usr/local/bin/kube-proxy --config=/var/lib/kube-proxy/config.conf
root 6094 0.0 0.0 14224 968 pts/1 S+ 15:48 0:00 grep --color=auto kube-proxy
root@test:~# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
It seems the missing ipvsadm entries are causing the connection timeout.
Regards, Ranjith
From this issue (putting aside the load balancer part),
Both externalIPs and status.loadBalancer.ingress[].ip seem to be ignored by kube-proxy in IPVS mode, so external traffic is completely unrouteable.
In contrast, kube-proxy in iptables mode creates DNAT/SNAT rules for external and loadbalancer IPs.
So check if adding a network plugin (flannel, Calico, ...) would improve the situation.
Or check out cloudnativelabs/kube-router, which is also ipvs-based.
A lean yet powerful alternative to several network components used in typical Kubernetes clusters.
All this from a single DaemonSet/Binary. It doesn't get any easier.
Since curl uses a TCP connection while 30001 is a UDP port, they don't work together; try a UDP probe tool, like nmap.
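The TCP/UDP mismatch can be demonstrated locally without a cluster. In the sketch below, a helper binds a UDP-only socket (the port 39001 and the helper itself are made up for this demo), and a TCP connection attempt to the same port then fails exactly the way curl fails against the UDP NodePort:

```shell
# Bind a UDP-only socket for a couple of seconds (no TCP listener).
python3 - <<'EOF' &
import socket, time
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.bind(("127.0.0.1", 39001))   # UDP only
time.sleep(2)
EOF
sleep 0.3

# A TCP connect to the UDP-only port is refused, just like curl
# against the UDP NodePort in the question:
if (exec 3<>/dev/tcp/127.0.0.1/39001) 2>/dev/null; then
  echo "tcp connect succeeded"
else
  echo "tcp connect refused (port is UDP-only)"
fi
wait
```

Against the real cluster, a UDP-aware probe such as `nmap -sU -p 30001 <node_ip>` is the right tool.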
Initially I created a Linux VM using VirtualBox (running on Windows), where I ran into this type of issue.
Now I have created a Linux VM using Virtual Machine Manager (running on Linux); in this setup there is no issue and everything works fine.
It would be great if anyone could tell me whether there is any restriction in VirtualBox.