kube-scheduler Liveness probe failed: Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused - kubernetes

So I have this unhealthy cluster partially working in the datacenter. This is probably the 10th time I have rebuilt from the instructions at: https://kubernetes.io/docs/setup/independent/high-availability/
I can apply some pods to this cluster and it seems to work but eventually it starts slowing down and crashing as you can see below. Here is the scheduler manifest:
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-scheduler
    tier: control-plane
  name: kube-scheduler
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-scheduler
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --leader-elect=true
    image: k8s.gcr.io/kube-scheduler:v1.14.2
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10251
        scheme: HTTP
      initialDelaySeconds: 15
      timeoutSeconds: 15
    name: kube-scheduler
    resources:
      requests:
        cpu: 100m
    volumeMounts:
    - mountPath: /etc/kubernetes/scheduler.conf
      name: kubeconfig
      readOnly: true
  hostNetwork: true
  priorityClassName: system-cluster-critical
  volumes:
  - hostPath:
      path: /etc/kubernetes/scheduler.conf
      type: FileOrCreate
    name: kubeconfig
status: {}
$ kubectl -n kube-system get pods
NAME READY STATUS RESTARTS AGE
coredns-fb8b8dccf-42psn 1/1 Running 9 88m
coredns-fb8b8dccf-x9mlt 1/1 Running 11 88m
docker-registry-dqvzb 1/1 Running 1 2d6h
kube-apiserver-kube-apiserver-1 1/1 Running 44 2d8h
kube-apiserver-kube-apiserver-2 1/1 Running 34 2d7h
kube-controller-manager-kube-apiserver-1 1/1 Running 198 2d2h
kube-controller-manager-kube-apiserver-2 0/1 CrashLoopBackOff 170 2d7h
kube-flannel-ds-amd64-4mbfk 1/1 Running 1 2d7h
kube-flannel-ds-amd64-55hc7 1/1 Running 1 2d8h
kube-flannel-ds-amd64-fvwmf 1/1 Running 1 2d7h
kube-flannel-ds-amd64-ht5wm 1/1 Running 3 2d7h
kube-flannel-ds-amd64-rjt9l 1/1 Running 4 2d8h
kube-flannel-ds-amd64-wpmkj 1/1 Running 1 2d7h
kube-proxy-2n64d 1/1 Running 3 2d7h
kube-proxy-2pq2g 1/1 Running 1 2d7h
kube-proxy-5fbms 1/1 Running 2 2d8h
kube-proxy-g8gmn 1/1 Running 1 2d7h
kube-proxy-wrdrj 1/1 Running 1 2d8h
kube-proxy-wz6gv 1/1 Running 1 2d7h
kube-scheduler-kube-apiserver-1 0/1 CrashLoopBackOff 198 2d2h
kube-scheduler-kube-apiserver-2 1/1 Running 5 18m
nginx-ingress-controller-dz8fm 1/1 Running 3 2d4h
nginx-ingress-controller-sdsgg 1/1 Running 3 2d4h
nginx-ingress-controller-sfrgb 1/1 Running 1 2d4h
$ kubectl -n kube-system describe pod kube-scheduler-kube-apiserver-1
Containers:
kube-scheduler:
Container ID: docker://c04f3c9061cafef8749b2018cd66e6865d102f67c4d13bdd250d0b4656d5f220
Image: k8s.gcr.io/kube-scheduler:v1.14.2
Image ID: docker-pullable://k8s.gcr.io/kube-scheduler@sha256:052e0322b8a2b22819ab0385089f202555c4099493d1bd33205a34753494d2c2
Port: <none>
Host Port: <none>
Command:
kube-scheduler
--bind-address=127.0.0.1
--kubeconfig=/etc/kubernetes/scheduler.conf
--authentication-kubeconfig=/etc/kubernetes/scheduler.conf
--authorization-kubeconfig=/etc/kubernetes/scheduler.conf
--leader-elect=true
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Tue, 28 May 2019 23:16:50 -0400
Finished: Tue, 28 May 2019 23:19:56 -0400
Ready: False
Restart Count: 195
Requests:
cpu: 100m
Liveness: http-get http://127.0.0.1:10251/healthz delay=15s timeout=15s period=10s #success=1 #failure=8
Environment: <none>
Mounts:
/etc/kubernetes/scheduler.conf from kubeconfig (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
kubeconfig:
Type: HostPath (bare host directory volume)
Path: /etc/kubernetes/scheduler.conf
HostPathType: FileOrCreate
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: :NoExecute
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Created 4h56m (x104 over 37h) kubelet, kube-apiserver-1 Created container kube-scheduler
Normal Started 4h56m (x104 over 37h) kubelet, kube-apiserver-1 Started container kube-scheduler
Warning Unhealthy 137m (x71 over 34h) kubelet, kube-apiserver-1 Liveness probe failed: Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused
Normal Pulled 132m (x129 over 37h) kubelet, kube-apiserver-1 Container image "k8s.gcr.io/kube-scheduler:v1.14.2" already present on machine
Warning BackOff 128m (x1129 over 34h) kubelet, kube-apiserver-1 Back-off restarting failed container
Normal SandboxChanged 80m kubelet, kube-apiserver-1 Pod sandbox changed, it will be killed and re-created.
Warning Failed 76m kubelet, kube-apiserver-1 Error: context deadline exceeded
Normal Pulled 36m (x7 over 78m) kubelet, kube-apiserver-1 Container image "k8s.gcr.io/kube-scheduler:v1.14.2" already present on machine
Normal Started 36m (x6 over 74m) kubelet, kube-apiserver-1 Started container kube-scheduler
Normal Created 32m (x7 over 74m) kubelet, kube-apiserver-1 Created container kube-scheduler
Warning Unhealthy 20m (x9 over 40m) kubelet, kube-apiserver-1 Liveness probe failed: Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused
Warning BackOff 2m56s (x85 over 69m) kubelet, kube-apiserver-1 Back-off restarting failed container
I feel like I am overlooking a simple option or configuration, but I can't find it, and after days of dealing with this problem and reading documentation I am at my wits' end.
The load balancer is a TCP load balancer and seems to be working as expected as I can query the cluster from my desktop.
Any suggestions or troubleshooting tips are definitely welcome at this time.
Thank you.

The problem with our configuration was that a well-intentioned technician removed one of the rules on the Kubernetes master firewall, which prevented the master from looping back to the ports it needed to probe. This caused all kinds of strange, misdiagnosed issues and sent us in the wrong direction. After we allowed all ports on the servers, Kubernetes was back to its normal behavior.
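For anyone hitting a similar symptom, a quick way to confirm this class of problem is to check from the master itself whether the health ports the kubelet probes are reachable over loopback, and whether any firewall rule touches them. A minimal sketch (10251/10252 are the default scheduler and controller-manager health ports for this Kubernetes version; adjust to your manifests):
# Does the scheduler answer on its health port locally?
$ curl -v http://127.0.0.1:10251/healthz
# Is anything actually listening on the health ports?
$ ss -tlnp | grep -E '10251|10252'
# Any host firewall rules affecting those ports?
$ sudo iptables -S | grep -E '10251|10252'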

Related

How to resolve this error that nginx-ingress-controller start fail in my k8s cluster?

Rancher v2.4.2
kubernetes version: v1.17.4
In my k8s cluster, nginx-ingress-controller doesn't work and keeps restarting. I don't get any useful information from the logs. Thanks for your help.
cluster nodes:
> kubectl get nodes
NAME STATUS ROLES AGE VERSION
master1 Ready controlplane,etcd,worker 18d v1.17.4
master2 Ready controlplane,etcd,worker 17d v1.17.4
node1 Ready worker 17d v1.17.4
node2 Ready worker 17d v1.17.4
cluster pods in ingress-nginx namespace
> kubectl get pods -n ingress-nginx
NAME READY STATUS RESTARTS AGE
default-http-backend-5bb77998d7-k7gdh 1/1 Running 1 17d
nginx-ingress-controller-6l4jh 0/1 Running 10 27m
nginx-ingress-controller-bh2pg 1/1 Running 0 63m
nginx-ingress-controller-drtzx 1/1 Running 0 63m
nginx-ingress-controller-qndbw 1/1 Running 0 63m
the pod logs of nginx-ingress-controller-6l4jh
> kubectl logs nginx-ingress-controller-6l4jh -n ingress-nginx
-------------------------------------------------------------------------------
NGINX Ingress controller
Release: nginx-0.25.1-rancher1
Build:
Repository: https://github.com/rancher/ingress-nginx.git
nginx version: openresty/1.15.8.1
-------------------------------------------------------------------------------
>
describe info
> kubectl describe pod nginx-ingress-controller-6l4jh -n ingress-nginx
Name: nginx-ingress-controller-6l4jh
Namespace: ingress-nginx
Priority: 0
Node: node2/172.26.13.11
Start Time: Tue, 19 Apr 2022 07:12:16 +0000
Labels: app=ingress-nginx
controller-revision-hash=758cb9dbbc
pod-template-generation=8
Annotations: cattle.io/timestamp: 2022-04-19T07:08:51Z
field.cattle.io/ports:
[[{"containerPort":80,"dnsName":"nginx-ingress-controller-hostport","hostPort":80,"kind":"HostPort","name":"http","protocol":"TCP","source...
field.cattle.io/publicEndpoints:
[{"addresses":["172.26.13.130"],"nodeId":"c-wv692:m-d5802d05bbf0","port":80,"protocol":"TCP"},{"addresses":["172.26.13.130"],"nodeId":"c-w...
prometheus.io/port: 10254
prometheus.io/scrape: true
Status: Running
IP: 172.26.13.11
IPs:
IP: 172.26.13.11
Controlled By: DaemonSet/nginx-ingress-controller
Containers:
nginx-ingress-controller:
Container ID: docker://09a6248edb921b9c9cbab678c793fe1cc3d28322ea6abbb8f15c899351ce4b40
Image: 172.26.13.133:5000/rancher/nginx-ingress-controller:nginx-0.25.1-rancher1
Image ID: docker-pullable://172.26.13.133:5000/rancher/nginx-ingress-controller@sha256:fe50ceea3d1a0bc9a7ccef8d5845c9a30b51f608e411467862dff590185a47d2
Ports: 80/TCP, 443/TCP
Host Ports: 80/TCP, 443/TCP
Args:
/nginx-ingress-controller
--default-backend-service=$(POD_NAMESPACE)/default-http-backend
--configmap=$(POD_NAMESPACE)/nginx-configuration
--tcp-services-configmap=$(POD_NAMESPACE)/tcp-services
--udp-services-configmap=$(POD_NAMESPACE)/udp-services
--annotations-prefix=nginx.ingress.kubernetes.io
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 143
Started: Tue, 19 Apr 2022 07:40:12 +0000
Finished: Tue, 19 Apr 2022 07:41:32 +0000
Ready: False
Restart Count: 11
Liveness: http-get http://:10254/healthz delay=60s timeout=20s period=10s #success=1 #failure=3
Readiness: http-get http://:10254/healthz delay=60s timeout=20s period=10s #success=1 #failure=3
Environment:
POD_NAME: nginx-ingress-controller-6l4jh (v1:metadata.name)
POD_NAMESPACE: ingress-nginx (v1:metadata.namespace)
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from nginx-ingress-serviceaccount-token-2kdbj (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
nginx-ingress-serviceaccount-token-2kdbj:
Type: Secret (a volume populated by a Secret)
SecretName: nginx-ingress-serviceaccount-token-2kdbj
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: :NoExecute
:NoSchedule
node.kubernetes.io/disk-pressure:NoSchedule
node.kubernetes.io/memory-pressure:NoSchedule
node.kubernetes.io/network-unavailable:NoSchedule
node.kubernetes.io/not-ready:NoExecute
node.kubernetes.io/pid-pressure:NoSchedule
node.kubernetes.io/unreachable:NoExecute
node.kubernetes.io/unschedulable:NoSchedule
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned ingress-nginx/nginx-ingress-controller-6l4jh to node2
Normal Pulled 27m (x3 over 30m) kubelet, node2 Container image "172.26.13.133:5000/rancher/nginx-ingress-controller:nginx-0.25.1-rancher1" already present on machine
Normal Created 27m (x3 over 30m) kubelet, node2 Created container nginx-ingress-controller
Normal Started 27m (x3 over 30m) kubelet, node2 Started container nginx-ingress-controller
Normal Killing 27m (x2 over 28m) kubelet, node2 Container nginx-ingress-controller failed liveness probe, will be restarted
Warning Unhealthy 25m (x10 over 29m) kubelet, node2 Liveness probe failed: Get http://172.26.13.11:10254/healthz: dial tcp 172.26.13.11:10254: connect: connection refused
Warning Unhealthy 10m (x21 over 29m) kubelet, node2 Readiness probe failed: Get http://172.26.13.11:10254/healthz: dial tcp 172.26.13.11:10254: connect: connection refused
Warning BackOff 8s (x69 over 20m) kubelet, node2 Back-off restarting failed container
>
It sounds like the ingress controller pod fails its liveness/readiness checks, but apparently only on one particular node. You could try:
check that node for a firewall blocking the probe port (see the sketch below)
update to a newer version than nginx-0.25.1
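For the first suggestion, a minimal check run on the affected node (node2 here; the IP and port are taken from the probe errors above) might look like:
# Can the kubelet's probe target be reached from the node itself?
$ curl -v http://172.26.13.11:10254/healthz
# Is the controller listening on 10254 at all?
$ ss -tlnp | grep 10254
# Look for host firewall rules that could drop traffic to the probe port
$ sudo iptables -S | grep 10254
If the port answers locally but the probe still fails, comparing the firewall state of node2 with one of the healthy nodes usually narrows it down.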

helm chart deployment liveness and readiness failed error

I have an Openshift cluster. I have created one custom application and am trying to deploy it using Helm charts. When I deploy it on Openshift using 'oc new-app', the deployment works perfectly fine. But when I deploy it using the helm chart, it does not work.
Following is the output of 'oc get all' ----
[root@worker2 ~]#
[root@worker2 ~]# oc get all
NAME READY STATUS RESTARTS AGE
pod/chart-acme-85648d4645-7msdl 1/1 Running 0 3d7h
pod/chart1-acme-f8b65b78d-k2fb6 1/1 Running 0 3d7h
pod/netshoot 1/1 Running 0 3d10h
pod/sample1-buildachart-5b5d9d8649-qqmsf 0/1 CrashLoopBackOff 672 2d9h
pod/sample2-686bb7f969-fx5bk 0/1 CrashLoopBackOff 674 2d9h
pod/vjobs-npm-96b65fcb-b2p27 0/1 CrashLoopBackOff 817 47h
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/chart-acme LoadBalancer 172.30.174.208 <pending> 80:30222/TCP 3d7h
service/chart1-acme LoadBalancer 172.30.153.36 <pending> 80:30383/TCP 3d7h
service/sample1-buildachart NodePort 172.30.29.124 <none> 80:32375/TCP 2d9h
service/sample2 NodePort 172.30.19.24 <none> 80:32647/TCP 2d9h
service/vjobs-npm NodePort 172.30.205.30 <none> 80:30476/TCP 47h
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/chart-acme 1/1 1 1 3d7h
deployment.apps/chart1-acme 1/1 1 1 3d7h
deployment.apps/sample1-buildachart 0/1 1 0 2d9h
deployment.apps/sample2 0/1 1 0 2d9h
deployment.apps/vjobs-npm 0/1 1 0 47h
NAME DESIRED CURRENT READY AGE
replicaset.apps/chart-acme-85648d4645 1 1 1 3d7h
replicaset.apps/chart1-acme-f8b65b78d 1 1 1 3d7h
replicaset.apps/sample1-buildachart-5b5d9d8649 1 1 0 2d9h
replicaset.apps/sample2-686bb7f969 1 1 0 2d9h
replicaset.apps/vjobs-npm-96b65fcb 1 1 0 47h
[root@worker2 ~]#
[root@worker2 ~]#
As per the output above, you can see the deployment 'vjobs-npm' gives a 'CrashLoopBackOff' error.
Below is the output of 'oc describe pod' ----
[root@worker2 ~]#
[root@worker2 ~]# oc describe pod vjobs-npm-96b65fcb-b2p27
Name: vjobs-npm-96b65fcb-b2p27
Namespace: vjobs-testing
Priority: 0
Node: worker0/192.168.100.109
Start Time: Mon, 31 Aug 2020 09:30:28 -0400
Labels: app.kubernetes.io/instance=vjobs-npm
app.kubernetes.io/name=vjobs-npm
pod-template-hash=96b65fcb
Annotations: openshift.io/scc: restricted
Status: Running
IP: 10.131.0.107
IPs:
IP: 10.131.0.107
Controlled By: ReplicaSet/vjobs-npm-96b65fcb
Containers:
vjobs-npm:
Container ID: cri-o://c232849eb25bd96ae9343ac3ed1539d492985dd8cdf47a5a4df7d3cf776c4cf3
Image: quay.io/aditya7002/vjobs_local_build_new:latest
Image ID: quay.io/aditya7002/vjobs_local_build_new@sha256:87f18e3a24fc7043a43a143e96b0b069db418ace027d95a5427cf53de56feb4c
Port: 80/TCP
Host Port: 0/TCP
State: Running
Started: Mon, 31 Aug 2020 09:31:23 -0400
Last State: Terminated
Reason: Error
Exit Code: 137
Started: Mon, 31 Aug 2020 09:30:31 -0400
Finished: Mon, 31 Aug 2020 09:31:22 -0400
Ready: False
Restart Count: 1
Liveness: http-get http://:http/ delay=0s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:http/ delay=0s timeout=1s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from vjobs-npm-token-vw6f7 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
vjobs-npm-token-vw6f7:
Type: Secret (a volume populated by a Secret)
SecretName: vjobs-npm-token-vw6f7
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 68s default-scheduler Successfully assigned vjobs-testing/vjobs-npm-96b65fcb-b2p27 to worker0
Normal Killing 44s kubelet, worker0 Container vjobs-npm failed liveness probe, will be restarted
Normal Pulling 14s (x2 over 66s) kubelet, worker0 Pulling image "quay.io/aditya7002/vjobs_local_build_new:latest"
Normal Pulled 13s (x2 over 65s) kubelet, worker0 Successfully pulled image "quay.io/aditya7002/vjobs_local_build_new:latest"
Normal Created 13s (x2 over 65s) kubelet, worker0 Created container vjobs-npm
Normal Started 13s (x2 over 65s) kubelet, worker0 Started container vjobs-npm
Warning Unhealthy 4s (x4 over 64s) kubelet, worker0 Liveness probe failed: Get http://10.131.0.107:80/: dial tcp 10.131.0.107:80: connect: connection refused
Warning Unhealthy 1s (x7 over 61s) kubelet, worker0 Readiness probe failed: Get http://10.131.0.107:80/: dial tcp 10.131.0.107:80: connect: connection refused
[root@worker2 ~]#
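Both probes above have delay=0s timeout=1s, so a reasonable first check is whether the application inside the pod ever starts listening on port 80 within that window. A minimal sketch (assuming curl is available in the image; pod name and namespace taken from the output above):
# Does anything answer on port 80 inside the container?
$ oc -n vjobs-testing exec vjobs-npm-96b65fcb-b2p27 -- curl -sS -o /dev/null -w '%{http_code}\n' http://localhost:80/
# What does the app log before the kubelet restarts it?
$ oc -n vjobs-testing logs vjobs-npm-96b65fcb-b2p27 --previous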

Istio BookInfo sample pods not starting on Minishift 3.11.0 - Init:CrashLoopBackOff - message: 'containers with incomplete status: [istio-init]'

I have a fresh installation of minishift (v1.32.0+009893b) running on MacOS Mojave.
I start minishift with 4 CPUs and 8GB RAM: minishift start --cpus 4 --memory 8GB
I have followed the instructions to prepare the Openshift (minishift) environment described here: https://istio.io/docs/setup/kubernetes/prepare/platform-setup/openshift/
I've installed Istio following the documentation without any error: https://istio.io/docs/setup/kubernetes/install/kubernetes/
istio-system namespace pods
$> kubectl get pod -n istio-system
grafana-7b46bf6b7c-27pn8 1/1 Running 1 26m
istio-citadel-5878d994cc-5tsx2 1/1 Running 1 26m
istio-cleanup-secrets-1.1.1-vwzq5 0/1 Completed 0 26m
istio-egressgateway-976f94bd-pst7g 1/1 Running 1 26m
istio-galley-7855cc97dc-s7wvt 1/1 Running 0 1m
istio-grafana-post-install-1.1.1-nvdvl 0/1 Completed 0 26m
istio-ingressgateway-794cfcf8bc-zkfnc 1/1 Running 1 26m
istio-pilot-746995884c-6l8jm 2/2 Running 2 26m
istio-policy-74c95b5657-g2cvq 2/2 Running 10 26m
istio-security-post-install-1.1.1-f4524 0/1 Completed 0 26m
istio-sidecar-injector-59fc9d6f7d-z48rc 1/1 Running 1 26m
istio-telemetry-6c5d7b55bf-cmnvp 2/2 Running 10 26m
istio-tracing-75dd89b8b4-pp9c5 1/1 Running 2 26m
kiali-5d68f4c676-5lsj9 1/1 Running 1 26m
prometheus-89bc5668c-rbrd7 1/1 Running 1 26m
I deploy the BookInfo sample in my istio-test namespace: istioctl kube-inject -f bookinfo.yaml | kubectl -n istio-test apply -f - but the pods don't start.
oc command info
$> oc get svc
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
details 172.30.204.102 <none> 9080/TCP 21m
productpage 172.30.72.33 <none> 9080/TCP 21m
ratings 172.30.10.155 <none> 9080/TCP 21m
reviews 172.30.169.6 <none> 9080/TCP 21m
$> kubectl get pods
NAME READY STATUS RESTARTS AGE
details-v1-5c879644c7-vtb6g 0/2 Init:CrashLoopBackOff 12 21m
productpage-v1-59dff9bdf9-l2r2d 0/2 Init:CrashLoopBackOff 12 21m
ratings-v1-89485cb9c-vk58r 0/2 Init:CrashLoopBackOff 12 21m
reviews-v1-5db4f45f5d-ddqrm 0/2 Init:CrashLoopBackOff 12 21m
reviews-v2-575959b5b7-8gppt 0/2 Init:CrashLoopBackOff 12 21m
reviews-v3-79b65d46b4-zs865 0/2 Init:CrashLoopBackOff 12 21m
For some reason the init containers (istio-init) are crashing:
oc describe pod details-v1-5c879644c7-vtb6g
Name: details-v1-5c879644c7-vtb6g
Namespace: istio-test
Node: localhost/192.168.64.13
Start Time: Sat, 30 Mar 2019 14:38:49 +0100
Labels: app=details
pod-template-hash=1743520073
version=v1
Annotations: openshift.io/scc=privileged
sidecar.istio.io/status={"version":"b83fa303cbac0223b03f9fc5fbded767303ad2f7992390bfda6b9be66d960332","initContainers":["istio-init"],"containers":["istio-proxy"],"volumes":["istio-envoy","istio-certs...
Status: Pending
IP: 172.17.0.24
Controlled By: ReplicaSet/details-v1-5c879644c7
Init Containers:
istio-init:
Container ID: docker://0d8b62ad72727f39d8a4c9278592c505ccbcd52ed8038c606b6256056a3a8d12
Image: docker.io/istio/proxy_init:1.1.1
Image ID: docker-pullable://docker.io/istio/proxy_init@sha256:5008218de88915f0b45930d69c5cdd7cd4ec94244e9ff3cfe3cec2eba6d99440
Port: <none>
Args:
-p
15001
-u
1337
-m
REDIRECT
-i
*
-x
-b
9080
-d
15020
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Sat, 30 Mar 2019 14:58:18 +0100
Finished: Sat, 30 Mar 2019 14:58:19 +0100
Ready: False
Restart Count: 12
Limits:
cpu: 100m
memory: 50Mi
Requests:
cpu: 10m
memory: 10Mi
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-58j6f (ro)
Containers:
details:
Container ID:
Image: istio/examples-bookinfo-details-v1:1.10.1
Image ID:
Port: 9080/TCP
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-58j6f (ro)
istio-proxy:
Container ID:
Image: docker.io/istio/proxyv2:1.1.1
Image ID:
Port: 15090/TCP
Args:
proxy
sidecar
--domain
$(POD_NAMESPACE).svc.cluster.local
--configPath
/etc/istio/proxy
--binaryPath
/usr/local/bin/envoy
--serviceCluster
details.$(POD_NAMESPACE)
--drainDuration
45s
--parentShutdownDuration
1m0s
--discoveryAddress
istio-pilot.istio-system:15010
--zipkinAddress
zipkin.istio-system:9411
--connectTimeout
10s
--proxyAdminPort
15000
--concurrency
2
--controlPlaneAuthPolicy
NONE
--statusPort
15020
--applicationPorts
9080
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Limits:
cpu: 2
memory: 128Mi
Requests:
cpu: 10m
memory: 40Mi
Readiness: http-get http://:15020/healthz/ready delay=1s timeout=1s period=2s #success=1 #failure=30
Environment:
POD_NAME: details-v1-5c879644c7-vtb6g (v1:metadata.name)
POD_NAMESPACE: istio-test (v1:metadata.namespace)
INSTANCE_IP: (v1:status.podIP)
ISTIO_META_POD_NAME: details-v1-5c879644c7-vtb6g (v1:metadata.name)
ISTIO_META_CONFIG_NAMESPACE: istio-test (v1:metadata.namespace)
ISTIO_META_INTERCEPTION_MODE: REDIRECT
ISTIO_METAJSON_LABELS: {"app":"details","version":"v1"}
Mounts:
/etc/certs/ from istio-certs (ro)
/etc/istio/proxy from istio-envoy (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-58j6f (ro)
Conditions:
Type Status
Initialized False
Ready False
ContainersReady False
PodScheduled True
Volumes:
istio-envoy:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium: Memory
istio-certs:
Type: Secret (a volume populated by a Secret)
SecretName: istio.default
Optional: true
default-token-58j6f:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-58j6f
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/memory-pressure:NoSchedule
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
23m 23m 1 default-scheduler Normal Scheduled Successfully assigned istio-test/details-v1-5c879644c7-vtb6g to localhost
23m 23m 1 kubelet, localhost spec.initContainers{istio-init} Normal Pulling pulling image "docker.io/istio/proxy_init:1.1.1"
22m 22m 1 kubelet, localhost spec.initContainers{istio-init} Normal Pulled Successfully pulled image "docker.io/istio/proxy_init:1.1.1"
22m 21m 5 kubelet, localhost spec.initContainers{istio-init} Normal Created Created container
22m 21m 5 kubelet, localhost spec.initContainers{istio-init} Normal Started Started container
22m 21m 4 kubelet, localhost spec.initContainers{istio-init} Normal Pulled Container image "docker.io/istio/proxy_init:1.1.1" already present on machine
22m 17m 24 kubelet, localhost spec.initContainers{istio-init} Warning BackOff Back-off restarting failed container
9m 9m 1 kubelet, localhost Normal SandboxChanged Pod sandbox changed, it will be killed and re-created.
9m 8m 4 kubelet, localhost spec.initContainers{istio-init} Normal Pulled Container image "docker.io/istio/proxy_init:1.1.1" already present on machine
9m 8m 4 kubelet, localhost spec.initContainers{istio-init} Normal Created Created container
9m 8m 4 kubelet, localhost spec.initContainers{istio-init} Normal Started Started container
9m 3m 31 kubelet, localhost spec.initContainers{istio-init} Warning BackOff Back-off restarting failed container
I can't see any info that gives any hint apart from Exit Code: 1 and
status:
conditions:
- lastProbeTime: null
lastTransitionTime: '2019-03-30T13:38:50Z'
message: 'containers with incomplete status: [istio-init]'
reason: ContainersNotInitialized
status: 'False'
type: Initialized
UPDATE:
This is the istio-init Init container log:
kubectl -n istio-test logs -f details-v1-5c879644c7-m9k6q istio-init
Environment:
------------
ENVOY_PORT=
ISTIO_INBOUND_INTERCEPTION_MODE=
ISTIO_INBOUND_TPROXY_MARK=
ISTIO_INBOUND_TPROXY_ROUTE_TABLE=
ISTIO_INBOUND_PORTS=
ISTIO_LOCAL_EXCLUDE_PORTS=
ISTIO_SERVICE_CIDR=
ISTIO_SERVICE_EXCLUDE_CIDR=
Variables:
----------
PROXY_PORT=15001
INBOUND_CAPTURE_PORT=15001
PROXY_UID=1337
INBOUND_INTERCEPTION_MODE=REDIRECT
INBOUND_TPROXY_MARK=1337
INBOUND_TPROXY_ROUTE_TABLE=133
INBOUND_PORTS_INCLUDE=9080
INBOUND_PORTS_EXCLUDE=15020
OUTBOUND_IP_RANGES_INCLUDE=*
OUTBOUND_IP_RANGES_EXCLUDE=
KUBEVIRT_INTERFACES=
ENABLE_INBOUND_IPV6=
# Generated by iptables-save v1.6.0 on Sat Mar 30 22:21:52 2019
*nat
:PREROUTING ACCEPT [0:0]
:INPUT ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:ISTIO_REDIRECT - [0:0]
COMMIT
# Completed on Sat Mar 30 22:21:52 2019
# Generated by iptables-save v1.6.0 on Sat Mar 30 22:21:52 2019
*filter
:INPUT ACCEPT [3:180]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [3:120]
COMMIT
# Completed on Sat Mar 30 22:21:52 2019
+ iptables -t nat -N ISTIO_REDIRECT
+ iptables -t nat -A ISTIO_REDIRECT -p tcp -j REDIRECT --to-port 15001
iptables: No chain/target/match by that name.
+ dump
+ iptables-save
+ ip6tables-save
I solved the problem by adding privileged: true to the istio-init container's securityContext configuration:
  name: istio-init
  resources:
    limits:
      cpu: 100m
      memory: 50Mi
    requests:
      cpu: 10m
      memory: 10Mi
  securityContext:
    capabilities:
      add:
      - NET_ADMIN
    privileged: true
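On OpenShift/minishift the pod also has to be allowed to run a privileged container in the first place. If the init container is still rejected after this change, granting the privileged SCC to the service account the pods run under (assumed here to be default in the istio-test namespace) is the usual companion step, roughly:
# Allow pods using the default service account in istio-test to run privileged containers
$ oc adm policy add-scc-to-user privileged -z default -n istio-test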

Unable to setup Istio with minikube

I followed Istio's official documentation to set up Istio for the sample bookinfo app with minikube, but I'm getting an Unable to connect to the server: net/http: TLS handshake timeout error. These are the steps I have followed (I have kubectl & minikube installed).
minikube start
curl -L https://git.io/getLatestIstio | sh -
cd istio-1.0.3
export PATH=$PWD/bin:$PATH
kubectl apply -f install/kubernetes/helm/istio/templates/crds.yaml
kubectl apply -f install/kubernetes/istio-demo-auth.yaml
kubectl get pods -n istio-system
This is the terminal output I'm getting
$ kubectl get pods -n istio-system
NAME READY STATUS RESTARTS AGE
grafana-9cfc9d4c9-xg7bh 1/1 Running 0 4m
istio-citadel-6d7f9c545b-lwq8s 1/1 Running 0 3m
istio-cleanup-secrets-69hdj 0/1 Completed 0 4m
istio-egressgateway-75dbb8f95d-k6xj2 1/1 Running 0 4m
istio-galley-6d74549bb9-mdc97 0/1 ContainerCreating 0 4m
istio-grafana-post-install-xz9rk 0/1 Completed 0 4m
istio-ingressgateway-6bd4957bc-vhbct 1/1 Running 0 4m
istio-pilot-7f8c49bbd8-x6bmm 0/2 Pending 0 4m
istio-policy-6c65d8cff4-hx2c7 2/2 Running 0 4m
istio-security-post-install-gjfj2 0/1 Completed 0 4m
istio-sidecar-injector-74855c54b9-nnqgx 0/1 ContainerCreating 0 3m
istio-telemetry-65cdd46d6c-rqzfw 2/2 Running 0 4m
istio-tracing-ff94688bb-hgz4h 1/1 Running 0 3m
prometheus-f556886b8-chdxw 1/1 Running 0 4m
servicegraph-778f94d6f8-9xgw5 1/1 Running 0 3m
$kubectl describe pod istio-galley-6d74549bb9-mdc97
Error from server (NotFound): pods "istio-galley-5bf4d6b8f7-8s2z9" not found
pod describe output
$ kubectl -n istio-system describe pod istio-galley-6d74549bb9-mdc97
Name: istio-galley-6d74549bb9-mdc97
Namespace: istio-system
Node: minikube/172.17.0.4
Start Time: Sat, 03 Nov 2018 04:29:57 +0000
Labels: istio=galley
pod-template-hash=1690826493
Annotations: scheduler.alpha.kubernetes.io/critical-pod=
sidecar.istio.io/inject=false
Status: Pending
IP:
Controlled By: ReplicaSet/istio-galley-5bf4d6b8f7
Containers:
validator:
Container ID:
Image: gcr.io/istio-release/galley:1.0.0
Image ID:
Ports: 443/TCP, 9093/TCP
Host Ports: 0/TCP, 0/TCP
Command:
/usr/local/bin/galley
validator
--deployment-namespace=istio-system
--caCertFile=/etc/istio/certs/root-cert.pem
--tlsCertFile=/etc/istio/certs/cert-chain.pem
--tlsKeyFile=/etc/istio/certs/key.pem
--healthCheckInterval=2s
--healthCheckFile=/health
--webhook-config-file
/etc/istio/config/validatingwebhookconfiguration.yaml
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Requests:
cpu: 10m
Liveness: exec [/usr/local/bin/galley probe --probe-path=/health --interval=4s] delay=4s timeout=1s period=4s #success=1 #failure=3
Readiness: exec [/usr/local/bin/galley probe --probe-path=/health --interval=4s] delay=4s timeout=1s period=4s #success=1 #failure=3
Environment: <none>
Mounts:
/etc/istio/certs from certs (ro)
/etc/istio/config from config (ro)
/var/run/secrets/kubernetes.io/serviceaccount from istio-galley-service-account-token-9pcmv(ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
certs:
Type: Secret (a volume populated by a Secret)
SecretName: istio.istio-galley-service-account
Optional: false
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: istio-galley-configuration
Optional: false
istio-galley-service-account-token-9pcmv:
Type: Secret (a volume populated by a Secret)
SecretName: istio-galley-service-account-token-9pcmv
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 1m default-scheduler Successfully assigned istio-galley-5bf4d6b8f7-8t8qz to minikube
Normal SuccessfulMountVolume 1m kubelet, minikube MountVolume.SetUp succeeded for volume "config"
Normal SuccessfulMountVolume 1m kubelet, minikube MountVolume.SetUp succeeded for volume "istio-galley-service-account-token-9pcmv"
Warning FailedMount 27s (x7 over 1m) kubelet, minikube MountVolume.SetUp failed for volume "certs" : secrets "istio.istio-galley-service-account" not found
after some time :-
$ kubectl describe pod istio-galley-6d74549bb9-mdc97
Unable to connect to the server: net/http: TLS handshake timeout
So I wait for the istio-sidecar-injector and istio-galley containers to get created, but if I run kubectl get pods -n istio-system again, or any other kubectl command, I get the Unable to connect to the server: net/http: TLS handshake timeout error.
Please help me with this issue.
ps: I'm running minikube on ubuntu 16.04
Thanks in advance.
Looks like you are running into a known issue: the secret istio.istio-galley-service-account is missing in your istio-system namespace. You can try the workaround as described:
Install as outlined in the docs: https://istio.io/docs/setup/kubernetes/minimal-install/. The missing secret is created by the citadel pod, which isn't running due to the --set security.enabled=false flag; setting that to true starts citadel and the secret gets created.
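A quick way to confirm this is the situation before reinstalling (names taken from the error messages above):
# The secret the galley pod is waiting for
$ kubectl -n istio-system get secret istio.istio-galley-service-account
# Citadel is what creates it; check whether it is running at all
$ kubectl -n istio-system get pods | grep citadel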
Problem resolved when I ran minikube start --memory=4048. Maybe it was a memory issue.
When using either istio-demo.yaml or istio-demo-auth.yaml, you'll find that a minimum of 4GB RAM is required to run Istio (particularly when you deploy its sample app, BookInfo, too). This is true whether you're running Minikube or Docker Desktop, and is one of the gotchas that Meshery identifies and attempts to help those deploying Istio or other service meshes circumvent.
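If you suspect the same resource pressure, a low-risk sketch is to check why the pods stay Pending and then recreate the VM with more memory (flag values are only examples; pod name from the output above):
# The Events section of a Pending pod usually names the missing resource, e.g. Insufficient memory
$ kubectl -n istio-system describe pod istio-pilot-7f8c49bbd8-x6bmm
# --memory only applies when the VM is created, so the existing cluster has to be deleted first
$ minikube delete
$ minikube start --cpus 4 --memory 8192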

issue on arm64: no endpoints, code: 503

Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.4", GitCommit:"d6f433224538d4f9ca2f7ae19b252e6fcb66a3ae", GitTreeState:"clean", BuildDate:"2017-05-19T18:44:27Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/arm64"}
Server Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.2", GitCommit:"08e099554f3c31f6e6f07b448ab3ed78d0520507", GitTreeState:"clean", BuildDate:"2017-01-12T04:52:34Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/arm64"}
Environment:
OS (e.g. from /etc/os-release):
NAME="Ubuntu"
VERSION="16.04.2 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.2 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial
Kernel (e.g. uname -a):
Linux node4 4.11.0-rc6-next-20170411-00286-gcc55807 #0 SMP PREEMPT Mon Jun 5 18:56:20 CST 2017 aarch64 aarch64 aarch64 GNU/Linux
What happened:
I want to use kube-deploy/master.sh to set up the master on ARM64, but I encountered this error when visiting $myip:8080/ui:
{
"kind": "Status",
"apiVersion": "v1",
"metadata": {},
"status": "Failure",
"message": "no endpoints available for service \"kubernetes-dashboard\"",
"reason": "ServiceUnavailable",
"code": 503
}
My branch is 2017-2-7 (c8d6fbfc…)
By the way, it works on the x86-amd64 platform using the same installation steps.
Anything else we need to know:
5.1 kubectl get pod --namespace=kube-system
k8s-master-10.193.20.23 4/4 Running 17 1h
k8s-proxy-v1-sk8vd 1/1 Running 0 1h
kube-addon-manager-10.193.20.23 2/2 Running 2 1h
kube-dns-3365905565-xvj7n 2/4 CrashLoopBackOff 65 1h
kubernetes-dashboard-1416335539-lhlhz 0/1 CrashLoopBackOff 22 1h
5.2 kubectl describe pods kubernetes-dashboard-1416335539-lhlhz --namespace=kube-system
Name: kubernetes-dashboard-1416335539-lhlhz
Namespace: kube-system
Node: 10.193.20.23/10.193.20.23
Start Time: Mon, 12 Jun 2017 10:04:07 +0800
Labels: k8s-app=kubernetes-dashboard
pod-template-hash=1416335539
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"kube-system","name":"kubernetes-dashboard-1416335539","uid":"6ab170d2-4f13-11e7-a...
scheduler.alpha.kubernetes.io/critical-pod=
scheduler.alpha.kubernetes.io/tolerations=[{"key":"CriticalAddonsOnly", "operator":"Exists"}]
Status: Running
IP: 10.1.70.2
Controllers: ReplicaSet/kubernetes-dashboard-1416335539
Containers:
kubernetes-dashboard:
Container ID: docker://fbdbe4c047803b0e98ca7412ca617031f1f31d881e3a5838298a1fda24a1ae18
Image: gcr.io/google_containers/kubernetes-dashboard-arm64:v1.5.0
Image ID: docker-pullable://gcr.io/google_containers/kubernetes-dashboard-arm64@sha256:559d58ef0d8e9dbe78f80060401b97d6262462318c0b8e071937a73896ea1d3d
Port: 9090/TCP
State: Running
Started: Mon, 12 Jun 2017 11:30:03 +0800
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Mon, 12 Jun 2017 11:24:28 +0800
Finished: Mon, 12 Jun 2017 11:24:59 +0800
Ready: True
Restart Count: 23
Limits:
cpu: 100m
memory: 50Mi
Requests:
cpu: 100m
memory: 50Mi
Liveness: http-get http://:9090/ delay=30s timeout=30s period=10s #success=1 #failure=3
Environment:
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-0mnn8 (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
default-token-0mnn8:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-0mnn8
Optional: false
QoS Class: Guaranteed
Node-Selectors:
Tolerations:
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
30m 30m 1 kubelet, 10.193.20.23 spec.containers{kubernetes-dashboard} Normal Killing Killing container with docker id b0562b3640ae: pod "kubernetes-dashboard-1416335539-lhlhz_kube-system(6ab54dba-4f13-11e7-a56b-6805ca369d7f)" container "kubernetes-dashboard" is unhealthy, it will be killed and re-created.
18m 18m 1 kubelet, 10.193.20.23 spec.containers{kubernetes-dashboard} Normal Killing Killing container with docker id 477066c3a00f: pod "kubernetes-dashboard-1416335539-lhlhz_kube-system(6ab54dba-4f13-11e7-a56b-6805ca369d7f)" container "kubernetes-dashboard" is unhealthy, it will be killed and re-created.
12m 12m 1 kubelet, 10.193.20.23 spec.containers{kubernetes-dashboard} Normal Killing Killing container with docker id 3e021d6df31f: pod "kubernetes-dashboard-1416335539-lhlhz_kube-system(6ab54dba-4f13-11e7-a56b-6805ca369d7f)" container "kubernetes-dashboard" is unhealthy, it will be killed and re-created.
11m 11m 1 kubelet, 10.193.20.23 spec.containers{kubernetes-dashboard} Normal Killing Killing container with docker id 43fe3c37817d: pod "kubernetes-dashboard-1416335539-lhlhz_kube-system(6ab54dba-4f13-11e7-a56b-6805ca369d7f)" container "kubernetes-dashboard" is unhealthy, it will be killed and re-created.
5m 5m 1 kubelet, 10.193.20.23 spec.containers{kubernetes-dashboard} Normal Killing Killing container with docker id 23cea72e1f45: pod "kubernetes-dashboard-1416335539-lhlhz_kube-system(6ab54dba-4f13-11e7-a56b-6805ca369d7f)" container "kubernetes-dashboard" is unhealthy, it will be killed and re-created.
1h 5m 7 kubelet, 10.193.20.23 spec.containers{kubernetes-dashboard} Warning Unhealthy Liveness probe failed: Get http://10.1.70.2:9090/: dial tcp 10.1.70.2:9090: getsockopt: connection refused
1h 38s 335 kubelet, 10.193.20.23 spec.containers{kubernetes-dashboard} Warning BackOff Back-off restarting failed docker container
1h 38s 307 kubelet, 10.193.20.23 Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "kubernetes-dashboard" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kubernetes-dashboard pod=kubernetes-dashboard-1416335539-lhlhz_kube-system(6ab54dba-4f13-11e7-a56b-6805ca369d7f)"
1h 27s 24 kubelet, 10.193.20.23 spec.containers{kubernetes-dashboard} Normal Pulled Container image "gcr.io/google_containers/kubernetes-dashboard-arm64:v1.5.0" already present on machine
59m 23s 15 kubelet, 10.193.20.23 spec.containers{kubernetes-dashboard} Normal Created (events with common reason combined)
59m 22s 15 kubelet, 10.193.20.23 spec.containers{kubernetes-dashboard} Normal Started (events with common reason combined)
5.3 kubectl get svc,ep,rc,rs,deploy,pod -o wide --all-namespaces
NAMESPACE NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
default svc/kubernetes 10.0.0.1 443/TCP 16m
kube-system svc/kube-dns 10.0.0.10 53/UDP,53/TCP 16m k8s-app=kube-dns
kube-system svc/kubernetes-dashboard 10.0.0.95 80/TCP 16m k8s-app=kubernetes-dashboard
NAMESPACE NAME ENDPOINTS AGE
default ep/kubernetes 10.193.20.23:6443 16m
kube-system ep/kube-controller-manager <none> 11m
kube-system ep/kube-dns 16m
kube-system ep/kube-scheduler <none> 11m
kube-system ep/kubernetes-dashboard 16m
NAMESPACE NAME DESIRED CURRENT READY AGE CONTAINER(S) IMAGE(S) SELECTOR
kube-system rs/kube-dns-3365905565 1 1 0 16m kubedns,dnsmasq,dnsmasq-metrics,healthz gcr.io/google_containers/kubedns-arm64:1.9,gcr.io/google_containers/kube-dnsmasq-arm64:1.4,gcr.io/google_containers/dnsmasq-metrics-arm64:1.0,gcr.io/google_containers/exechealthz-arm64:1.2 k8s-app=kube-dns,pod-template-hash=3365905565
kube-system rs/kubernetes-dashboard-1416335539 1 1 0 16m kubernetes-dashboard gcr.io/google_containers/kubernetes-dashboard-arm64:v1.5.0 k8s-app=kubernetes-dashboard,pod-template-hash=1416335539
NAMESPACE NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE CONTAINER(S) IMAGE(S) SELECTOR
kube-system deploy/kube-dns 1 1 1 0 16m kubedns,dnsmasq,dnsmasq-metrics,healthz gcr.io/google_containers/kubedns-arm64:1.9,gcr.io/google_containers/kube-dnsmasq-arm64:1.4,gcr.io/google_containers/dnsmasq-metrics-arm64:1.0,gcr.io/google_containers/exechealthz-arm64:1.2 k8s-app=kube-dns
kube-system deploy/kubernetes-dashboard 1 1 1 0 16m kubernetes-dashboard gcr.io/google_containers/kubernetes-dashboard-arm64:v1.5.0 k8s-app=kubernetes-dashboard
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
kube-system po/k8s-master-10.193.20.23 4/4 Running 50 15m 10.193.20.23 10.193.20.23
kube-system po/k8s-proxy-v1-5b831 1/1 Running 0 16m 10.193.20.23 10.193.20.23
kube-system po/kube-addon-manager-10.193.20.23 2/2 Running 6 15m 10.193.20.23 10.193.20.23
kube-system po/kube-dns-3365905565-jxg4f 1/4 CrashLoopBackOff 20 16m 10.1.5.3 10.193.20.23
kube-system po/kubernetes-dashboard-1416335539-frt3v 0/1 CrashLoopBackOff 7 16m 10.1.5.2 10.193.20.23
5.4 kubectl describe pods kube-dns-3365905565-lb0mq --namespace=kube-system
Name: kube-dns-3365905565-lb0mq
Namespace: kube-system
Node: 10.193.20.23/10.193.20.23
Start Time: Wed, 14 Jun 2017 10:43:46 +0800
Labels: k8s-app=kube-dns
pod-template-hash=3365905565
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"kube-system","name":"kube-dns-3365905565","uid":"4870aec2-50ab-11e7-a420-6805ca36...
scheduler.alpha.kubernetes.io/critical-pod=
scheduler.alpha.kubernetes.io/tolerations=[{"key":"CriticalAddonsOnly", "operator":"Exists"}]
Status: Running
IP: 10.1.20.3
Controllers: ReplicaSet/kube-dns-3365905565
Containers:
kubedns:
Container ID: docker://729562769b48be60a02b62692acd3d1e1c67ac2505f4cb41240067777f45fd77
Image: gcr.io/google_containers/kubedns-arm64:1.9
Image ID: docker-pullable://gcr.io/google_containers/kubedns-arm64@sha256:3c78a2c5b9b86c5aeacf9f5967f206dcf1e64362f3e7f274c1c078c954ecae38
Ports: 10053/UDP, 10053/TCP, 10055/TCP
Args:
--domain=cluster.local.
--dns-port=10053
--config-map=kube-dns
--v=0
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 137
Started: Wed, 14 Jun 2017 10:56:29 +0800
Finished: Wed, 14 Jun 2017 10:58:06 +0800
Ready: False
Restart Count: 6
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://:8080/healthz-kubedns delay=60s timeout=5s period=10s #success=1 #failure=5
Readiness: http-get http://:8081/readiness delay=3s timeout=5s period=10s #success=1 #failure=3
Environment:
PROMETHEUS_PORT: 10055
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-1t5v9 (ro)
dnsmasq:
Container ID: docker://b6d7e98a4af2715294764929f901947ab3b985be45d9f213245bd338ab8c3101
Image: gcr.io/google_containers/kube-dnsmasq-arm64:1.4
Image ID: docker-pullable://gcr.io/google_containers/kube-dnsmasq-arm64@sha256:dff5f9e2a521816aa314d469fd8ef961270fe43b4a74bab490385942103f3728
Ports: 53/UDP, 53/TCP
Args:
--cache-size=1000
--no-resolv
--server=127.0.0.1#10053
--log-facility=-
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 137
Started: Wed, 14 Jun 2017 10:55:50 +0800
Finished: Wed, 14 Jun 2017 10:57:26 +0800
Ready: False
Restart Count: 6
Requests:
cpu: 150m
memory: 10Mi
Liveness: http-get http://:8080/healthz-dnsmasq delay=60s timeout=5s period=10s #success=1 #failure=5
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-1t5v9 (ro)
dnsmasq-metrics:
Container ID: docker://51693aea0e732e488b631dcedc082f5a9e23b5b74857217cf005d1e947375367
Image: gcr.io/google_containers/dnsmasq-metrics-arm64:1.0
Image ID: docker-pullable://gcr.io/google_containers/dnsmasq-metrics-arm64@sha256:fc0e8b676a26ed0056b8c68611b74b9b5f3f00c608e5b11ef1608484ce55dd9a
Port: 10054/TCP
Args:
--v=2
--logtostderr
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: ContainerCannotRun
Exit Code: 128
Started: Wed, 14 Jun 2017 10:57:28 +0800
Finished: Wed, 14 Jun 2017 10:57:28 +0800
Ready: False
Restart Count: 7
Requests:
memory: 10Mi
Liveness: http-get http://:10054/metrics delay=60s timeout=5s period=10s #success=1 #failure=5
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-1t5v9 (ro)
healthz:
Container ID: docker://fab7ef143a95ad4d2f6363d5fcdc162eba1522b92726665916462be765289327
Image: gcr.io/google_containers/exechealthz-arm64:1.2
Image ID: docker-pullable://gcr.io/google_containers/exechealthz-arm64@sha256:e8300fde6c36b454cc00b5fffc96d6985622db4d8eb42a9f98f24873e9535b5c
Port: 8080/TCP
Args:
--cmd=nslookup kubernetes.default.svc.cluster.local 127.0.0.1 >/dev/null
--url=/healthz-dnsmasq
--cmd=nslookup kubernetes.default.svc.cluster.local 127.0.0.1:10053 >/dev/null
--url=/healthz-kubedns
--port=8080
--quiet
State: Running
Started: Wed, 14 Jun 2017 10:44:31 +0800
Ready: True
Restart Count: 0
Limits:
memory: 50Mi
Requests:
cpu: 10m
memory: 50Mi
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-1t5v9 (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
default-token-1t5v9:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-1t5v9
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: <none>
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
15m 15m 1 default-scheduler Normal Scheduled Successfully assigned kube-dns-3365905565-lb0mq to 10.193.20.23
14m 14m 1 kubelet, 10.193.20.23 spec.containers{kubedns} Normal Created Created container with docker id 2fef2db445e6; Security:[seccomp=unconfined]
14m 14m 1 kubelet, 10.193.20.23 spec.containers{kubedns} Normal Started Started container with docker id 2fef2db445e6
14m 14m 1 kubelet, 10.193.20.23 spec.containers{dnsmasq} Normal Created Created container with docker id 41ec998eeb76; Security:[seccomp=unconfined]
14m 14m 1 kubelet, 10.193.20.23 spec.containers{dnsmasq} Normal Started Started container with docker id 41ec998eeb76
14m 14m 1 kubelet, 10.193.20.23 spec.containers{dnsmasq-metrics} Normal Created Created container with docker id 676ef0e877c8; Security:[seccomp=unconfined]
14m 14m 1 kubelet, 10.193.20.23 spec.containers{healthz} Normal Pulled Container image "gcr.io/google_containers/exechealthz-arm64:1.2" already present on machine
14m 14m 1 kubelet, 10.193.20.23 spec.containers{dnsmasq-metrics} Warning Failed Failed to start container with docker id 676ef0e877c8 with error: Error response from daemon: {"message":"linux spec user: unable to find group nobody: no matching entries in group file"}
14m 14m 1 kubelet, 10.193.20.23 spec.containers{healthz} Normal Created Created container with docker id fab7ef143a95; Security:[seccomp=unconfined]
14m 14m 1 kubelet, 10.193.20.23 spec.containers{healthz} Normal Started Started container with docker id fab7ef143a95
14m 14m 1 kubelet, 10.193.20.23 spec.containers{dnsmasq-metrics} Warning Failed Failed to start container with docker id 45f6bd7f1f3a with error: Error response from daemon: {"message":"linux spec user: unable to find group nobody: no matching entries in group file"}
14m 14m 1 kubelet, 10.193.20.23 spec.containers{dnsmasq-metrics} Normal Created Created container with docker id 45f6bd7f1f3a; Security:[seccomp=unconfined]
14m 14m 1 kubelet, 10.193.20.23 Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "dnsmasq-metrics" with CrashLoopBackOff: "Back-off 10s restarting failed container=dnsmasq-metrics pod=kube-dns-3365905565-lb0mq_kube-system(48845c1a-50ab-11e7-a420-6805ca369d7f)"
14m 14m 1 kubelet, 10.193.20.23 spec.containers{dnsmasq-metrics} Normal Created Created container with docker id 2d1e5adb97bb; Security:[seccomp=unconfined]
14m 14m 1 kubelet, 10.193.20.23 spec.containers{dnsmasq-metrics} Warning Failed Failed to start container with docker id 2d1e5adb97bb with error: Error response from daemon: {"message":"linux spec user: unable to find group nobody: no matching entries in group file"}
14m 14m 2 kubelet, 10.193.20.23 Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "dnsmasq-metrics" with CrashLoopBackOff: "Back-off 20s restarting failed container=dnsmasq-metrics pod=kube-dns-3365905565-lb0mq_kube-system(48845c1a-50ab-11e7-a420-6805ca369d7f)"
So it looks like you have hit one (or several) bugs in Kubernetes. I suggest that you retry with a more recent version (possibly another Docker version too). It would also be a good idea to report these bugs (https://github.com/kubernetes/dashboard/issues).
All in all, bear in mind that Kubernetes on ARM is an advanced topic and you should expect problems and be ready to debug/resolve them :/
There might be a problem with that docker image (gcr.io/google_containers/dnsmasq-metrics-arm64). Non-amd64 builds are not well tested.
Could you try running:
kubectl set image --namespace=kube-system deployment/kube-dns dnsmasq-metrics=lenart/dnsmasq-metrics-arm64:1.0
You can't reach the dashboard because the dashboard Pod is unhealthy and never becomes Ready. Because it's not ready, it's not considered an endpoint for the dashboard Service, so the Service has no endpoints, which leads to the error message you reported.
The dashboard is most likely unhealthy because kube-dns is not ready (1/4 containers in the Pod ready; it should be 4/4).
kube-dns is most likely unhealthy because you have no pod networking (overlay network) deployed.
Go to the add-ons, pick a network add-on and deploy it. Weave has a 1.5-compatible version and requires no setup.
After you have done that, give it a few minutes. If you are impatient, just delete the kubernetes-dashboard and kube-dns pods (not the deployments/controllers!). If this does not resolve your problem, then please update your question with the new information.
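As an illustration only (the add-on manifest depends on the network plugin you pick and your Kubernetes version, so the filename below is a placeholder), the sequence is roughly:
# 1. Deploy a pod network add-on, e.g. Weave Net, using the manifest from its docs for your k8s version
$ kubectl apply -f weave-daemonset.yaml
# 2. Give it a few minutes, then recreate the unhealthy pods (not their controllers); labels taken from the describe output above
$ kubectl --namespace=kube-system delete pod -l k8s-app=kube-dns
$ kubectl --namespace=kube-system delete pod -l k8s-app=kubernetes-dashboard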