Kubernetes api/dashboard issue - kubernetes

I posted this on serverfault, too, but will hopefully get more views/feedback here:
Trying to get the Dashboard UI working in a kubeadm cluster using kubectl proxy for remote access. Getting
Error: 'dial tcp 192.168.2.3:8443: connect: connection refused'
Trying to reach: 'https://192.168.2.3:8443/'
when accessing http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/ via remote browser.
Looking at API logs, I see that I'm getting the following errors:
I1215 20:18:46.601151 1 log.go:172] http: TLS handshake error from 10.21.72.28:50268: remote error: tls: unknown certificate authority
I1215 20:19:15.444580 1 log.go:172] http: TLS handshake error from 10.21.72.28:50271: remote error: tls: unknown certificate authority
I1215 20:19:31.850501 1 log.go:172] http: TLS handshake error from 10.21.72.28:50275: remote error: tls: unknown certificate authority
I1215 20:55:55.574729 1 log.go:172] http: TLS handshake error from 10.21.72.28:50860: remote error: tls: unknown certificate authority
E1215 21:19:47.246642 1 watch.go:233] unable to encode watch object *v1.WatchEvent: write tcp 134.84.53.162:6443->134.84.53.163:38894: write: connection timed out (&streaming.encoder{writer:(*metrics.fancyResponseWriterDelegator)(0xc42d6fecb0), encoder:(*versioning.codec)(0xc429276990), buf:(*bytes.Buffer)(0xc42cae68c0)})
I presume this is related to not being able to get the Dashboard working, and if so am wondering what the issue with the API server is. Everything else in the cluster appears to be working.
NB, I have admin.conf running locally and am able to access the cluster via kubectl with no issue.
Also, of note is that this had been working when I first got the cluster up. However, I was having networking issues and had to apply this in order to get CoreDNS to work (see "Coredns service do not work, but endpoint is ok; the other SVCs are normal only except dns"), so I am wondering if this maybe broke the proxy service?
* EDIT *
Here is output for the dashboard pod:
[gms@thalia0 ~]$ kubectl describe pod kubernetes-dashboard-77fd78f978-tjzxt --namespace=kube-system
Name: kubernetes-dashboard-77fd78f978-tjzxt
Namespace: kube-system
Priority: 0
PriorityClassName: <none>
Node: thalia2.hostdoman/hostip<redacted>
Start Time: Sat, 15 Dec 2018 15:17:57 -0600
Labels: k8s-app=kubernetes-dashboard
pod-template-hash=77fd78f978
Annotations: cni.projectcalico.org/podIP: 192.168.2.3/32
Status: Running
IP: 192.168.2.3
Controlled By: ReplicaSet/kubernetes-dashboard-77fd78f978
Containers:
kubernetes-dashboard:
Container ID: docker://ed5ff580fb7d7b649d2bd1734e5fd80f97c80dec5c8e3b2808d33b8f92e7b472
Image: k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.0
Image ID: docker-pullable://k8s.gcr.io/kubernetes-dashboard-amd64@sha256:1d2e1229a918f4bc38b5a3f9f5f11302b3e71f8397b492afac7f273a0008776a
Port: 8443/TCP
Host Port: 0/TCP
Args:
--auto-generate-certificates
State: Running
Started: Sat, 15 Dec 2018 15:18:04 -0600
Ready: True
Restart Count: 0
Liveness: http-get https://:8443/ delay=30s timeout=30s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/certs from kubernetes-dashboard-certs (rw)
/tmp from tmp-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kubernetes-dashboard-token-mrd9k (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
kubernetes-dashboard-certs:
Type: Secret (a volume populated by a Secret)
SecretName: kubernetes-dashboard-certs
Optional: false
tmp-volume:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
kubernetes-dashboard-token-mrd9k:
Type: Secret (a volume populated by a Secret)
SecretName: kubernetes-dashboard-token-mrd9k
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events: <none>
I checked the service:
[gms@thalia0 ~]$ kubectl -n kube-system get service kubernetes-dashboard
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes-dashboard ClusterIP 10.103.93.93 <none> 443/TCP 4d23h
And also of note, if I curl http://localhost:8001/api from the master node, I do get a valid response.
So, in summary, I'm not sure which if any of these errors are the source of not being able to access the dashboard.
I just upgraded my cluster to 1.13.1, in hopes that this issue would be resolved, but alas, no.

When you do kubectl proxy, the default port 8001 is only reachable from localhost. If you ssh to the machine on which Kubernetes is installed, you must map this port to your laptop or whatever device you ssh from.
You can ssh to the master node and map the 8001 port to your local box with:
ssh -L 8001:localhost:8001 hostname@master_node_IP

I upgraded all nodes in the cluster to version 1.13.1 and voila, the dashboard now works AND so far I have not had to apply the CoreDNS fix noted above.

Related

HashiCorp Vault Enable REST Call

I am following the HashiCorp tutorial and it all looks fine until I try to launch the "webapp" pod - a simple pod whose only function is to demonstrate that it can start and mount a secret volume.
The error (permission denied on a REST call) is shown at the bottom of this command output:
kubectl describe pod webapp
Name: webapp
Namespace: default
Priority: 0
Service Account: webapp-sa
Node: docker-desktop/192.168.65.4
Start Time: Tue, 14 Feb 2023 09:32:07 -0500
Labels: <none>
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Containers:
webapp:
Container ID:
Image: jweissig/app:0.0.1
Image ID:
Port: <none>
Host Port: <none>
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment: <none>
Mounts:
/mnt/secrets-store from secrets-store-inline (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-5b76r (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
secrets-store-inline:
Type: CSI (a Container Storage Interface (CSI) volume source)
Driver: secrets-store.csi.k8s.io
FSType:
ReadOnly: true
VolumeAttributes: secretProviderClass=vault-database
kube-api-access-5b76r:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 42m default-scheduler Successfully assigned default/webapp to docker-desktop
Warning FailedMount 20m (x8 over 40m) kubelet Unable to attach or mount volumes: unmounted volumes=[secrets-store-inline], unattached volumes=[secrets-store-inline kube-api-access-5b76r]: timed out waiting for the condition
Warning FailedMount 12m (x23 over 42m) kubelet MountVolume.SetUp failed for volume "secrets-store-inline" : rpc error: code = Unknown desc = failed to mount secrets store objects for pod default/webapp, err: rpc error: code = Unknown desc = error making mount request: couldn't read secret "db-password": Error making API request.
URL: GET http://vault.default:8200/v1/secret/data/db-pass
Code: 403. Errors:
* 1 error occurred:
* permission denied
Warning FailedMount 2m19s (x4 over 38m) kubelet Unable to attach or mount volumes: unmounted volumes=[secrets-store-inline], unattached volumes=[kube-api-access-5b76r secrets-store-inline]: timed out waiting for the condition
So it seems that this REST call fails: GET http://vault.default:8200/v1/secret/data/db-pass. Indeed, it fails from curl as well:
curl -vik -H "X-Vault-Token: root" http://localhost:8200/v1/secret/data/db-pass
* Trying 127.0.0.1:8200...
* TCP_NODELAY set
* connect to 127.0.0.1 port 8200 failed: Connection refused
* Failed to connect to localhost port 8200: Connection refused
* Closing connection 0
curl: (7) Failed to connect to localhost port 8200: Connection refused
At this point I am a bit lost. I am not sure that the REST call is configured correctly, i.e. in such a way that Vault will accept it; but I am also not sure how to configure it differently.
The Vault logs show the information below, so it seems that the port and token I use are correct:
2023-02-14 09:07:14 You may need to set the following environment variables:
2023-02-14 09:07:14 $ export VAULT_ADDR='http://[::]:8200'
2023-02-14 09:07:14 The root token is displayed below
2023-02-14 09:07:14 Root Token: root
Vault seems to be running fine in Kubernetes:
kubectl get pods
NAME READY STATUS RESTARTS AGE
vault-0 1/1 Running 1 (22m ago) 32m
vault-agent-injector-77fd4cb69f-mf66p 1/1 Running 1 (22m ago) 32m
If I try to show the Vault status:
vault status
Error checking seal status: Get "http://[::]:8200/v1/sys/seal-status": dial tcp [::]:8200: connect: connection refused
I don't think the Vault is sealed, but if I try to unseal it:
vault operator unseal
Unseal Key (will be hidden):
Error unsealing: Put "http://[::]:8200/v1/sys/unseal": dial tcp [::]:8200: connect: connection refused
Any ideas?
As pertains to the tutorial, it works. Not sure what I was doing wrong, but I ran it all again and it worked. If I had to guess, I would suspect that some of the YAML involved in configuring the pods got malformed (since white space is significant).
The vault status command works, but only from a terminal running inside the Vault pod. The Kubernetes-in-Docker-on-DockerDesktop cluster does not expose any ports for these pods, so even though I have vault-cli installed on my PC, I cannot use vault status from outside the pods.

Dashboard not running

I have set up Kubernetes on an Ubuntu server using this link.
Then I installed kubernetes dashboard using:
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-rc6/aio/deploy/recommended.yaml
Then I changed the service from ClusterIP to NodePort, with node port 32323.
But the container is not running.
uday@dockermaster:~$ kubectl -n kubernetes-dashboard get all
NAME READY STATUS RESTARTS AGE
pod/dashboard-metrics-scraper-779f5454cb-pqfrj 1/1 Running 0 50m
pod/kubernetes-dashboard-64686c4bf9-5jkwq 0/1 CrashLoopBackOff 14 50m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/dashboard-metrics-scraper ClusterIP 10.103.22.252 <none> 8000/TCP 50m
service/kubernetes-dashboard NodePort 10.102.48.80 <none> 443:32323/TCP 50m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/dashboard-metrics-scraper 1/1 1 1 50m
deployment.apps/kubernetes-dashboard 0/1 1 0 50m
NAME DESIRED CURRENT READY AGE
replicaset.apps/dashboard-metrics-scraper-779f5454cb 1 1 1 50m
replicaset.apps/kubernetes-dashboard-64686c4bf9 1 1 0 50m
uday@dockermaster:~$ kubectl -n kubernetes-dashboard describe svc kubernetes-dashboard
Name: kubernetes-dashboard
Namespace: kubernetes-dashboard
Labels: k8s-app=kubernetes-dashboard
Annotations: Selector: k8s-app=kubernetes-dashboard
Type: NodePort
IP: 10.102.48.80
Port: <unset> 443/TCP
TargetPort: 8443/TCP
NodePort: <unset> 32323/TCP
Endpoints:
Session Affinity: None
External Traffic Policy: Cluster
Events: <none>
Other apps (Tomcat, nginx, databases) are working fine with NodePort.
But here it is failing at container creation.
C:\Users\uday\Desktop>kubectl.exe get pods -n kubernetes-dashboard
NAME READY STATUS RESTARTS AGE
dashboard-metrics-scraper-779f5454cb-pqfrj 1/1 Running 1 20h
kubernetes-dashboard-64686c4bf9-g9z2k 0/1 CrashLoopBackOff 84 18h
C:\Users\uday\Desktop>kubectl.exe describe pod kubernetes-dashboard-64686c4bf9-g9z2k -n kubernetes-dashboard
Name: kubernetes-dashboard-64686c4bf9-g9z2k
Namespace: kubernetes-dashboard
Priority: 0
Node: slave-node/10.0.0.6
Start Time: Sat, 28 Mar 2020 14:16:54 +0000
Labels: k8s-app=kubernetes-dashboard
pod-template-hash=64686c4bf9
Annotations: <none>
Status: Running
IP: 182.244.1.12
IPs:
IP: 182.244.1.12
Controlled By: ReplicaSet/kubernetes-dashboard-64686c4bf9
Containers:
kubernetes-dashboard:
Container ID: docker://470ee8c61998c3c3dda86c58ad17817468f55aa73cd4feecf3b018977ce13ca3
Image: kubernetesui/dashboard:v2.0.0-rc6
Image ID: docker-pullable://kubernetesui/dashboard@sha256:61f9c378c427a3f8a9643f83baa9f96db1ae1357c67a93b533ae7b36d71c69dc
Port: 8443/TCP
Host Port: 0/TCP
Args:
--auto-generate-certificates
--namespace=kubernetes-dashboard
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 2
Started: Sun, 29 Mar 2020 09:01:31 +0000
Finished: Sun, 29 Mar 2020 09:02:01 +0000
Ready: False
Restart Count: 84
Liveness: http-get https://:8443/ delay=30s timeout=30s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/certs from kubernetes-dashboard-certs (rw)
/tmp from tmp-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kubernetes-dashboard-token-pzfbl (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
kubernetes-dashboard-certs:
Type: Secret (a volume populated by a Secret)
SecretName: kubernetes-dashboard-certs
Optional: false
tmp-volume:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
kubernetes-dashboard-token-pzfbl:
Type: Secret (a volume populated by a Secret)
SecretName: kubernetes-dashboard-token-pzfbl
Optional: false
QoS Class: BestEffort
Node-Selectors: beta.kubernetes.io/os=linux
Tolerations: node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning BackOff 4m49s (x501 over 123m) kubelet, slave-node Back-off restarting failed container
kubectl.exe logs kubernetes-dashboard-64686c4bf9-g9z2k -n kubernetes-dashboard
2020/03/29 09:01:31 Starting overwatch
2020/03/29 09:01:31 Using namespace: kubernetes-dashboard
2020/03/29 09:01:31 Using in-cluster config to connect to apiserver
2020/03/29 09:01:31 Using secret token for csrf signing
2020/03/29 09:01:31 Initializing csrf token from kubernetes-dashboard-csrf secret
panic: Get https://10.96.0.1:443/api/v1/namespaces/kubernetes-dashboard/secrets/kubernetes-dashboard-csrf: dial tcp 10.96.0.1:443: i/o timeout
goroutine 1 [running]:
github.com/kubernetes/dashboard/src/app/backend/client/csrf.(*csrfTokenManager).init(0xc0004e2dc0)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/csrf/manager.go:40 +0x3b0
github.com/kubernetes/dashboard/src/app/backend/client/csrf.NewCsrfTokenManager(...)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/csrf/manager.go:65
github.com/kubernetes/dashboard/src/app/backend/client.(*clientManager).initCSRFKey(0xc00043ae80)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/manager.go:499 +0xc6
github.com/kubernetes/dashboard/src/app/backend/client.(*clientManager).init(0xc00043ae80)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/manager.go:467 +0x47
github.com/kubernetes/dashboard/src/app/backend/client.NewClientManager(...)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/manager.go:548
main.main()
/home/travis/build/kubernetes/dashboard/src/app/backend/dashboard.go:105 +0x20d
The Problem
The reason the application is not coming up is because the Dashboard container itself is not running. If you look at the output you provided you can see this:
pod/kubernetes-dashboard-64686c4bf9-5jkwq 0/1 CrashLoopBackOff 14
So how do we troubleshoot this? Well, there are three principal ways, one of which you'll probably be using more than the other two.
Describe
Describe is a command that allows you to fetch details about a resource in Kubernetes. This could be metadata information, the number of replicas you assigned, or even some events depicting why a resource is failing to start, for example when a referenced container image in your Pod manifest cannot be found in the usable container registries. The syntax for using Describe is like so:
kubectl describe pod -n kubernetes-dashboard kubernetes-dashboard-64686c4bf9-5jkwq
Here are some great docs on the tool as well.
Logs
The next troubleshooting step you'll likely take advantage of in Kubernetes is the logging architecture. As you're probably aware, when a Docker container is spawned it is common practice to have the logs produced by the application redirected to STDOUT or STDERR of the process. Kubernetes then captures this log data for you and provides an API abstraction layer through which you can interact with it. Sometimes your Describe events won't have any indication of why a process isn't running. However, you can then proceed with grabbing logs from the process to determine what is going wrong. An example syntax might look like:
kubectl logs -f -n kubernetes-dashboard kubernetes-dashboard-64686c4bf9-5jkwq
Exec
The last common troubleshooting technique is Exec. Exec effectively allows you to attach to a shell in a running container so that you can interact with the live environment to troubleshoot the application. This allows you to do things like see whether configuration files were properly staged on the container's filesystem, determine whether environment variables were properly expanded and set, etc. An example syntax for Exec might look like:
kubectl exec -it -n kubernetes-dashboard kubernetes-dashboard-64686c4bf9-5jkwq sh
In this case, however, your pod is in a CrashLoopBackOff state. This means that you will not be able to exec into it, because the container is not running. The Kubernetes API Server recognized a pattern of failures and automatically reduced its attempts at scheduling the workload accordingly.
Here is a good thread on how to troubleshoot pods that enter this state.
Summary
So, now that I've said all of this, how do we answer your question? Well, we can't answer it directly. But I sort of did with my summary above. Because the real answer you're looking for is how to properly troubleshoot Linux containers running in Kubernetes. These issues will be a recurring theme in your experience with Kubernetes, so it's essential to develop debugging skills in the ecosystem as soon as possible.
If the Describe, Logs, and Exec commands are unable to help you find out why the Dashboard pod is failing to come up, feel free to add a comment on this answer requesting additional support and I'll be happy to help where I can!
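As an added example (not code from the answer above), a small POSIX sh helper that applies the first two steps offline: it picks the CrashLoopBackOff pods out of a `kubectl get pods` table, emits the describe and `logs --previous` commands that still work while the container is down (exec does not, as the answer notes), and classifies the dial error in the crashed container's log. The pod table and log lines are constructed from the question's own output; `--previous` is the standard kubectl flag that returns the last attempt's log.

```shell
#!/bin/sh
# Triage helpers for a pod stuck in CrashLoopBackOff (added example; the input
# table and log lines below are constructed from the question's output).

# Constructed `kubectl get pods -n kubernetes-dashboard` table.
SAMPLE_PODS='NAME                                         READY   STATUS             RESTARTS   AGE
dashboard-metrics-scraper-779f5454cb-pqfrj   1/1     Running            1          20h
kubernetes-dashboard-64686c4bf9-g9z2k        0/1     CrashLoopBackOff   84         18h'

# Constructed tail of the crashed container's log (the panic from the question).
SAMPLE_LOG='2020/03/29 09:01:31 Initializing csrf token from kubernetes-dashboard-csrf secret
panic: Get https://10.96.0.1:443/api/v1/namespaces/kubernetes-dashboard/secrets/kubernetes-dashboard-csrf: dial tcp 10.96.0.1:443: i/o timeout'

# crashlooping_pods: read a `kubectl get pods` table on stdin and print the NAME
# of every pod whose STATUS column is CrashLoopBackOff (header row skipped).
crashlooping_pods() {
    awk 'NR > 1 && $3 == "CrashLoopBackOff" { print $1 }'
}

# triage_commands NAMESPACE: read pod names on stdin and print the two commands
# that still work while the container is down. `exec` is not one of them, as the
# answer says; `logs --previous` returns the output of the last failed attempt.
triage_commands() {
    ns="$1"
    while IFS= read -r pod; do
        [ -n "$pod" ] || continue
        printf 'kubectl describe pod -n %s %s\n' "$ns" "$pod"
        printf 'kubectl logs --previous -n %s %s\n' "$ns" "$pod"
    done
}

# classify_dial_error: read a container log on stdin and print "<kind> <target>"
# for the first dial error found. Kinds seen on this page:
#   timeout        nothing answered at all: check kube-proxy/CNI on the pod's node
#   refused        the host answered but nothing listens on that port
#   tls-untrusted  the peer rejected the certificate (CA mismatch)
classify_dial_error() {
    log=$(cat)
    target=$(printf '%s\n' "$log" | sed -n 's/.*dial tcp \([^ ]*\): .*/\1/p' | head -n 1)
    case "$log" in
        *"i/o timeout"*)                         echo "timeout $target" ;;
        *"connection refused"*)                  echo "refused $target" ;;
        *"tls: unknown certificate authority"*)  echo "tls-untrusted" ;;
        *)                                       echo "unclassified" ;;
    esac
}

# Demonstration over the constructed samples; on a live cluster replace the
# printf lines with `kubectl get pods -n NS` and `kubectl logs --previous ...`.
printf '%s\n' "$SAMPLE_PODS" | crashlooping_pods | triage_commands kubernetes-dashboard
printf '%s\n' "$SAMPLE_LOG" | classify_dial_error
```

The timeout to 10.96.0.1:443 means the pod never reached the API server's ClusterIP, which points at kube-proxy or the CNI on the node running the pod rather than at the dashboard itself.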

microk8s clusterIP service appear to be doing source NAT when the documentation says it should not

I am having a networking issue in Kubernetes.
I am trying to preserve the source IP of incoming requests to a clusterIP service, but I find that the requests appear to be source NAT'd. That is, they carry the IP address of the node as the source IP rather than the IP of the pod making the request. I am following the example for cluster IPs here: https://kubernetes.io/docs/tutorials/services/source-ip/#source-ip-for-services-with-type-clusterip
but I find that the behavior of Kubernetes is totally different for me.
The above example has me deploy an echo server which reports the source IP. This is deployed behind a clusterIP service which I request from a separate pod running busybox. The response from the echo server is below:
CLIENT VALUES:
client_address=10.1.36.1
command=GET
real path=/
query=nil
request_version=1.1
request_uri=http://10.152.183.99:8080/
SERVER VALUES:
server_version=nginx: 1.10.0 - lua: 10001
HEADERS RECEIVED:
connection=close
host=10.152.183.99
user-agent=Wget
BODY
The source IP 10.1.36.1 belongs to the node. I expected to see the address of busybox which is 10.1.36.168.
Does anyone know why SNAT would be enabled for a clusterIP? It's really strange to me that this directly contradicts the official documentation. (edited)
All of this is running on the same node. The node is running in iptables mode. I am using microk8s.
My microk8s version:
Client:
Version: v1.2.5
Revision: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
Server:
Version: v1.2.5
Revision: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
Output of kubectl describe service clusterip:
Name: clusterip
Namespace: default
Labels: app=source-ip-app
Annotations: <none>
Selector: app=source-ip-app
Type: ClusterIP
IP: 10.152.183.106
Port: <unset> 80/TCP
TargetPort: 8080/TCP
Endpoints: 10.1.36.225:8080
Session Affinity: None
Events: <none>
Output of kubectl describe pod source-ip-app-7c79c78698-xgd5w:
Name: source-ip-app-7c79c78698-xgd5w
Namespace: default
Priority: 0
Node: riley-virtualbox/10.0.2.15
Start Time: Wed, 12 Feb 2020 09:19:18 -0600
Labels: app=source-ip-app
pod-template-hash=7c79c78698
Annotations: <none>
Status: Running
IP: 10.1.36.225
IPs:
IP: 10.1.36.225
Controlled By: ReplicaSet/source-ip-app-7c79c78698
Containers:
echoserver:
Container ID: containerd://6775c010145d3951d067e3bb062bea9b70d305f96f84aa870963a8b385a4a118
Image: k8s.gcr.io/echoserver:1.4
Image ID: sha256:523cad1a4df732d41406c9de49f932cd60d56ffd50619158a2977fd1066028f9
Port: <none>
Host Port: <none>
State: Running
Started: Wed, 12 Feb 2020 09:19:23 -0600
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-7pszf (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
default-token-7pszf:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-7pszf
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned default/source-ip-app-7c79c78698-xgd5w to riley-virtualbox
Normal Pulled 2m58s kubelet, riley-virtualbox Container image "k8s.gcr.io/echoserver:1.4" already present on machine
Normal Created 2m55s kubelet, riley-virtualbox Created container echoserver
Normal Started 2m54s kubelet, riley-virtualbox Started container echoserver

troubleshoot Google kubernetes load balancer unhealthy nodes

I'm wondering what steps to take when troubleshooting why a Google load balancer would see Nodes within a cluster as unhealthy?
Using Google Kubernetes, I have a cluster with 3 nodes, all deployments are running readiness and liveness checks. All are reporting that they are healthy.
The load balancer is built from the helm nginx-ingress:
https://github.com/helm/charts/tree/master/stable/nginx-ingress
It's used as a single ingress for all the deployment applications within the cluster.
Visually scanning the ingress controllers logs:
kubectl logs <ingress-controller-name>
shows only the usual nginx output ... HTTP/1.1" 200 ... I can't see any health checks within these logs. Not sure if I should, but there is nothing to suggest anything is unhealthy.
Running a describe against the ingress controller shows no events, but it does show a liveness and readiness check which I'm not too sure would actually pass:
Name: umbrella-ingress-controller-****
Namespace: default
Priority: 0
PriorityClassName: <none>
Node: gke-multi-client-n1--2cpu-4ram-****/10.154.0.50
Start Time: Fri, 15 Nov 2019 21:23:36 +0000
Labels: app=ingress
component=controller
pod-template-hash=7c55db4f5c
release=umbrella
Annotations: kubernetes.io/limit-ranger: LimitRanger plugin set: cpu request for container ingress-controller
Status: Running
IP: ****
Controlled By: ReplicaSet/umbrella-ingress-controller-7c55db4f5c
Containers:
ingress-controller:
Container ID: docker://****
Image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.24.1
Image ID: docker-pullable://quay.io/kubernetes-ingress-controller/nginx-ingress-controller#sha256:****
Ports: 80/TCP, 443/TCP
Host Ports: 0/TCP, 0/TCP
Args:
/nginx-ingress-controller
--default-backend-service=default/umbrella-ingress-default-backend
--election-id=ingress-controller-leader
--ingress-class=nginx
--configmap=default/umbrella-ingress-controller
State: Running
Started: Fri, 15 Nov 2019 21:24:38 +0000
Ready: True
Restart Count: 0
Requests:
cpu: 100m
Liveness: http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
Environment:
POD_NAME: umbrella-ingress-controller-**** (v1:metadata.name)
POD_NAMESPACE: default (v1:metadata.namespace)
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from umbrella-ingress-token-**** (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
umbrella-ingress-token-2tnm9:
Type: Secret (a volume populated by a Secret)
SecretName: umbrella-ingress-token-****
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events: <none>
However, using Googles console, I navigate to the load balancers details and can see the following:
Above 2 of the nodes seem to be having issues, albeit I can't find the issues.
At this point the load balancer is still serving traffic via the third, healthy node, however it will occasionally drop that and show me the following:
At this point no traffic gets past the load balancer, so all the applications on the nodes are unreachable.
Any help with where I should be looking to troubleshoot this would be great.
---- edit 17/11/19
Below is the nginx-ingress config passed via helm:
ingress:
  enabled: true
  rbac.create: true
  controller:
    service:
      externalTrafficPolicy: Local
      loadBalancerIP: ****
    configData:
      proxy-connect-timeout: "15"
      proxy-read-timeout: "600"
      proxy-send-timeout: "600"
      proxy-body-size: "100m"
This is expected behavior. Using externalTrafficPolicy: Local configures the service so that only nodes where a serving pod exists will accept traffic. What this means is that any node that does not have a serving pod and receives traffic to the service will drop the packet.
The GCP Network Load Balancer is still sending traffic to each node to test the health. The health check will use the service NodePort. Any node that contains nginx load balancer pods will respond to the health check. Any node that does not have an nginx load balancer pod will drop the packet, so the check fails.
This results in only certain nodes showing as healthy.
For the nginx ingress controller, I recommend using the default value of Cluster instead of changing it to Local.
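An added example, not code from the answer above: a POSIX sh script that predicts which nodes pass the load balancer's probe under each externalTrafficPolicy, given the node list and the `kubectl get pods -o wide` table of controller pods. Node and pod names are constructed; the question's cluster has three nodes and, per its describe output, a single controller pod.

```shell
#!/bin/sh
# Predict load-balancer health-check results per node for each
# externalTrafficPolicy (added example; node and pod names are constructed).

# Constructed node list, e.g. the NAME column of `kubectl get nodes`.
NODES='gke-node-a
gke-node-b
gke-node-c'

# Constructed `kubectl get pods -o wide` output for the controller pods; the
# NODE column (7th) is the one that matters.
PODS_WIDE='NAME                                           READY   STATUS    RESTARTS   AGE   IP         NODE         NOMINATED NODE   READINESS GATES
umbrella-ingress-controller-7c55db4f5c-abc12   1/1     Running   0          2d    10.4.1.7   gke-node-a   <none>           <none>'

# serving_nodes: read the pod table on stdin and print, deduplicated, every node
# hosting a Running pod whose READY column is n/n with n > 0.
serving_nodes() {
    awk 'NR > 1 && $3 == "Running" {
             split($2, r, "/"); if (r[1] == r[2] && r[1] + 0 > 0) print $7
         }' | sort -u
}

# node_health POLICY SERVING: read node names on stdin and print "<node> healthy"
# or "<node> unhealthy". Under Local only nodes with a serving pod answer the
# probe and the rest drop it, as the answer explains; under Cluster kube-proxy on
# every node forwards the probe to a pod somewhere, so every node answers (at the
# cost of the client source IP, which is SNATed on the way).
node_health() {
    policy="$1"; serving="$2"
    while IFS= read -r node; do
        [ -n "$node" ] || continue
        if [ "$policy" = "Cluster" ] || printf '%s\n' "$serving" | grep -Fxq "$node"; then
            echo "$node healthy"
        else
            echo "$node unhealthy"
        fi
    done
}

# report POLICY: run both over the samples and warn when the Service hangs off a
# single node, the situation in which losing that node takes everything down.
report() {
    serving=$(printf '%s\n' "$PODS_WIDE" | serving_nodes)
    printf '%s\n' "$NODES" | node_health "$1" "$serving"
    if [ "$1" = "Local" ] && [ "$(printf '%s\n' "$serving" | grep -c .)" -le 1 ]; then
        echo "warning: one serving node; losing it leaves no healthy backend"
    fi
}

report Local
report Cluster
```

With one controller pod the Local report marks two of three nodes unhealthy, matching the console view in the question, and the Cluster report marks all three healthy.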

Kubernetes minikube, cannot expose service on public ip range

So I've been playing around with Minikube.
I've managed to deploy a simple python flask container:
PS C:\Users\Will> kubectl run test-flask-deploy --image 192.168.1.201:5000/test_flask:1
deployment "test-flask-deploy" created
I've also then managed to expose the deployment as a service:
PS C:\Users\Will> kubectl expose deployment/test-flask-deploy --type="NodePort" --port 8080
service "test-flask-deploy" exposed
In the dashboard I can see that the service has a Cluster IP: 10.0.0.132.
I access the dashboard on a 192.168.xxx.xxx address, so I'm hoping I can expose the service on that external IP.
Any idea how I go about this?
A separate and slightly less important question: I've got minikube talking to a docker registry on my network. If I deploy an image (which has not yet been pulled local to the minikube) the deployment fails, yet when I run the docker pull command on minikube locally, the deployment then succeeds. So minikube is able to pull docker images, but when I deploy an image which is accessible via the registry, yet not pulled locally, it fails. Any thoughts?
EDIT: More detail in response to comment:
PS C:\Users\Will> kubectl describe pod test-flask-deploy
Name: test-flask-deploy-1049547027-rgf7d
Namespace: default
Node: minikube/192.168.99.100
Start Time: Sat, 07 Oct 2017 10:19:58 +0100
Labels: pod-template-hash=1049547027
run=test-flask-deploy
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"default","name":"test-flask-deploy-1049547027","uid":"b06a14b8-ab40-11e7-9714-080...
Status: Running
IP: 172.17.0.4
Created By: ReplicaSet/test-flask-deploy-1049547027
Controlled By: ReplicaSet/test-flask-deploy-1049547027
Containers:
test-flask-deploy:
Container ID: docker://577e339ce680bc5dd9388293f1f1ea62be59a6acc25be22889310761222c760f
Image: 192.168.1.201:5000/test_flask:1
Image ID: docker-pullable://192.168.1.201:5000/test_flask#sha256:d303ed635888394f69223cc0a66c5778444fd3636dfcde42295fd512be948898
Port: <none>
State: Running
Started: Sat, 07 Oct 2017 10:19:59 +0100
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-5rrpm (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
default-token-5rrpm:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-5rrpm
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: <none>
Events: <none>
First, check the nodeport that is assigned to your service:
$ kubectl get svc test-flask-deploy
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
test-flask-deploy 10.0.0.76 <nodes> 8080:30341/TCP 4m
Now you should be able to access it on 192.168.xxxx:30341 or whatever your minikubeIP:nodeport is.