I have setup kubenertes on ubuntu server using this link.
Then I installed kubernetes dashboard using:
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-rc6/aio/deploy/recommended.yaml
Then I changed the ClusterIP to NodePort 32323, the service to NodePort.
But container is not running.
uday#dockermaster:~$ kubectl -n kubernetes-dashboard get all
NAME READY STATUS RESTARTS AGE
pod/dashboard-metrics-scraper-779f5454cb-pqfrj 1/1 Running 0 50m
pod/kubernetes-dashboard-64686c4bf9-5jkwq 0/1 CrashLoopBackOff 14 50m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/dashboard-metrics-scraper ClusterIP 10.103.22.252 <none> 8000/TCP 50m
service/kubernetes-dashboard NodePort 10.102.48.80 <none> 443:32323/TCP 50m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/dashboard-metrics-scraper 1/1 1 1 50m
deployment.apps/kubernetes-dashboard 0/1 1 0 50m
NAME DESIRED CURRENT READY AGE
replicaset.apps/dashboard-metrics-scraper-779f5454cb 1 1 1 50m
replicaset.apps/kubernetes-dashboard-64686c4bf9 1 1 0 50m
uday#dockermaster:~$ kubectl -n kubernetes-dashboard describe svc kubernetes-dashboard
Name: kubernetes-dashboard
Namespace: kubernetes-dashboard
Labels: k8s-app=kubernetes-dashboard
Annotations: Selector: k8s-app=kubernetes-dashboard
Type: NodePort
IP: 10.102.48.80
Port: <unset> 443/TCP
TargetPort: 8443/TCP
NodePort: <unset> 32323/TCP
Endpoints:
Session Affinity: None
External Traffic Policy: Cluster
Events: <none>
Other apps are working fine with NodePort, whether Tomcat/nginx/databases.
But here, it is failing with container creation.
C:\Users\uday\Desktop>kubectl.exe get pods -n kubernetes-dashboard
NAME READY STATUS RESTARTS AGE
dashboard-metrics-scraper-779f5454cb-pqfrj 1/1 Running 1 20h
kubernetes-dashboard-64686c4bf9-g9z2k 0/1 CrashLoopBackOff 84 18h
C:\Users\uday\Desktop>kubectl.exe describe pod kubernetes-dashboard-64686c4bf9-g9z2k -n kubernetes-dashboard
Name: kubernetes-dashboard-64686c4bf9-g9z2k
Namespace: kubernetes-dashboard
Priority: 0
Node: slave-node/10.0.0.6
Start Time: Sat, 28 Mar 2020 14:16:54 +0000
Labels: k8s-app=kubernetes-dashboard
pod-template-hash=64686c4bf9
Annotations: <none>
Status: Running
IP: 182.244.1.12
IPs:
IP: 182.244.1.12
Controlled By: ReplicaSet/kubernetes-dashboard-64686c4bf9
Containers:
kubernetes-dashboard:
Container ID: docker://470ee8c61998c3c3dda86c58ad17817468f55aa73cd4feecf3b018977ce13ca3
Image: kubernetesui/dashboard:v2.0.0-rc6
Image ID: docker-pullable://kubernetesui/dashboard#sha256:61f9c378c427a3f8a9643f83baa9f96db1ae1357c67a93b533ae7b36d71c69dc
Port: 8443/TCP
Host Port: 0/TCP
Args:
--auto-generate-certificates
--namespace=kubernetes-dashboard
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 2
Started: Sun, 29 Mar 2020 09:01:31 +0000
Finished: Sun, 29 Mar 2020 09:02:01 +0000
Ready: False
Restart Count: 84
Liveness: http-get https://:8443/ delay=30s timeout=30s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/certs from kubernetes-dashboard-certs (rw)
/tmp from tmp-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kubernetes-dashboard-token-pzfbl (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
kubernetes-dashboard-certs:
Type: Secret (a volume populated by a Secret)
SecretName: kubernetes-dashboard-certs
Optional: false
tmp-volume:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
kubernetes-dashboard-token-pzfbl:
Type: Secret (a volume populated by a Secret)
SecretName: kubernetes-dashboard-token-pzfbl
Optional: false
QoS Class: BestEffort
Node-Selectors: beta.kubernetes.io/os=linux
Tolerations: node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning BackOff 4m49s (x501 over 123m) kubelet, slave-node Back-off restarting failed container
kubectl.exe logs kubernetes-dashboard-64686c4bf9-g9z2k -n kubernetes-dashboard
2020/03/29 09:01:31 Starting overwatch
2020/03/29 09:01:31 Using namespace: kubernetes-dashboard
2020/03/29 09:01:31 Using in-cluster config to connect to apiserver
2020/03/29 09:01:31 Using secret token for csrf signing
2020/03/29 09:01:31 Initializing csrf token from kubernetes-dashboard-csrf secret
panic: Get https://10.96.0.1:443/api/v1/namespaces/kubernetes-dashboard/secrets/kubernetes-dashboard-csrf: dial tcp 10.96.0.1:443: i/o timeout
goroutine 1 [running]:
github.com/kubernetes/dashboard/src/app/backend/client/csrf.(*csrfTokenManager).init(0xc0004e2dc0)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/csrf/manager.go:40 +0x3b0
github.com/kubernetes/dashboard/src/app/backend/client/csrf.NewCsrfTokenManager(...)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/csrf/manager.go:65
github.com/kubernetes/dashboard/src/app/backend/client.(*clientManager).initCSRFKey(0xc00043ae80)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/manager.go:499 +0xc6
github.com/kubernetes/dashboard/src/app/backend/client.(*clientManager).init(0xc00043ae80)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/manager.go:467 +0x47
github.com/kubernetes/dashboard/src/app/backend/client.NewClientManager(...)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/manager.go:548
main.main()
/home/travis/build/kubernetes/dashboard/src/app/backend/dashboard.go:105 +0x20d
The Problem
The reason the application is not coming up is because the Dashboard container itself is not running. If you look at the output you provided you can see this:
pod/kubernetes-dashboard-64686c4bf9-5jkwq 0/1 CrashLoopBackOff 14
So how do we troubleshoot this? Well, there are three principle ways. One of which you'll probably be using more than the other two.
Describe
Describe is a is a command that allows you to fetch details about a resource in kubernetes. This could be metadata information, the amount of replicas you assigned, or even some events depicting why a resource is failing to start. For example, a referenced Container Image in your Pod manifest can not be found in the usable container registries. The syntax for using Describe is like so:
kubectl describe pod -n kubernetes-dashboard kubernetes-dashboard-64686c4bf9-5jkwq
Here are some great docs on the tool as well.
Logs
The next troubleshooting step you'll likely take advantage of in Kubernetes is using the logging architecture. As you're probably aware, when a Docker container is spawned it is common practice to have the logs that are produced by the application be redirected to STDOUT or STDERR for the process. Kubernetes them captures this log data for you and provides an API abstraction layer with which you can interact with it. Sometimes your Describe events won't have any indication of why a process isn't running. However, you can then proceed with grabbing logs from the process to determine what is going wrong. An example syntax might look like:
kubectl logs -f -n kubernetes-dashboard kubernetes-dashboard-64686c4bf9-5jkwq
Exec
The last common troubleshooting technique is Exec. Exec effectively allows you to attach to the a shell in a running container so that you can interact with the live environment to troubleshoot the application. This allows you to do things like see if configuration files were properly staged on the container's filesystem, determine if environment variables were properly expanded and set, etc. An example syntax for Exec might look like:
kubectl exec -it -n kubernetes-dashboard kubernetes-dashboard-64686c4bf9-5jkwq sh
In this case, however, your pod is in a CrashLoopBackoff state. This means that you will not be able to exec into it due to the fact that the container is not running. The Kubernetes API Server recognized a pattern of failures and automatically reduced its attempts at scheduling the workload accordingly.
Here is a good thread on how to troubleshoot pods that enter this state.
Summary
So, now that I've said all of this. How do we answer your question? Well, we can't answer it directly. But I sort of did with my summary above. Because the real answer you're looking for is how to properly troubleshoot linux containers running in Kubernetes. These issues will be a reoccurring theme in your experience with Kubernetes so it's essential to develop debugging skills in the ecosystem as soon as possible.
If the Describe, Logs, and Exec command are unable to help you find out why the Dashboard pod is failing to come up, feel free to add a comment on this answer requesting additional support and I'll be happy to help where I can!
Related
I build a k8s cluster on my virtual Machines(CentOS/7) with Virtual Box:
kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k8s-master Ready control-plane,master 8d v1.21.2 192.168.0.186 <none> CentOS Linux 7 (Core) 3.10.0-1160.31.1.el7.x86_64 docker://20.10.7
k8s-worker01 Ready <none> 8d v1.21.2 192.168.0.187 <none> CentOS Linux 7 (Core) 3.10.0-1160.31.1.el7.x86_64 docker://20.10.7
k8s-worker02 Ready <none> 8d v1.21.2 192.168.0.188 <none> CentOS Linux 7 (Core) 3.10.0-1160.31.1.el7.x86_64 docker://20.10.7
And i run some pods on the default namespace with a ReplicaSet several days before.
They were all worked fine at first, and then I shut down the VM.
Today, after I restarted the VMs, I found that they are not working properly anymore:
kubectl get all
NAME READY STATUS RESTARTS AGE
pod/dnsutils 1/1 Running 3 5d13h
pod/kubapp-6qbfz 0/1 Running 0 5d13h
pod/kubapp-d887h 0/1 Running 0 5d13h
pod/kubapp-z6nw7 0/1 Running 0 5d13h
NAME DESIRED CURRENT READY AGE
replicaset.apps/kubapp 3 3 0 5d13h
Then I delete the ReplicaSet and re-create it to create the pods.
And i run the command to get more infomations:
[root#k8s-master ch04]# kubectl describe po kubapp-z887v
Name: kubapp-d887h
Namespace: default
Priority: 0
Node: k8s-worker02/192.168.0.188
Start Time: Fri, 23 Jul 2021 15:55:16 +0000
Labels: app=kubapp
Annotations: cni.projectcalico.org/podIP: 10.244.69.244/32
cni.projectcalico.org/podIPs: 10.244.69.244/32
Status: Running
IP: 10.244.69.244
IPs:
IP: 10.244.69.244
Controlled By: ReplicaSet/kubapp
Containers:
kubapp:
Container ID: docker://fc352ce4c6a826f2cf108f9bb9a335e3572509fd5ae2002c116e2b080df5ee10
Image: evalle/kubapp
Image ID: docker-pullable://evalle/kubapp#sha256:560c9c50b1d894cf79ac472a9925dc795b116b9481ec40d142b928a0e3995f4c
Port: <none>
Host Port: <none>
State: Running
Started: Fri, 23 Jul 2021 15:55:21 +0000
Ready: False
Restart Count: 0
Readiness: exec [ls /var/ready] delay=0s timeout=1s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-m9rwr (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
kube-api-access-m9rwr:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 30m default-scheduler Successfully assigned default/kubapp-d887h to k8s-worker02
Normal Pulling 30m kubelet Pulling image "evalle/kubapp"
Normal Pulled 30m kubelet Successfully pulled image "evalle/kubapp" in 4.049160061s
Normal Created 30m kubelet Created container kubapp
Normal Started 30m kubelet Started container kubapp
Warning Unhealthy 11s (x182 over 30m) kubelet Readiness probe failed: ls: cannot access /var/ready: No such file or directory
I don`t know what it happens and how i should do for fix it.
SO here i am and ask to you guys for help.
I am a k8s newbie,just give a hand please.
Thanks for paul-becotte`s help and recommendation.I think i should to post the definition of the pod:
apiVersion: apps/v1
kind: ReplicaSet
metadata:
# here is the name of the replication controller (RC)
name: kubapp
spec:
replicas: 3
# what pods the RC is operating on
selector:
matchLabels:
app: kubapp
# the pod template for creating new pods
template:
metadata:
labels:
app: kubapp
spec:
containers:
- name: kubapp
image: evalle/kubapp
readinessProbe:
exec:
command:
- ls
- /var/ready
There is a example definition of yaml from https://github.com/Evalle/k8s-in-action/blob/master/Chapter_4/kubapp-rs.yaml.
I don`t know where to find the dockerfile of the image evalle/kubapp.
And I don't know if it has the /var/ready directory.
Look at your event
Warning Unhealthy 11s (x182 over 30m) kubelet Readiness probe failed: ls: cannot access /var/ready: No such file or directory
Your readiness probe is failing- looks like it is checking for the existence of a file at /var/ready.
Your next step is "does that make sense? Is my container going to actually write a file at /var/ready when its ready?" If so, you'll want to look at the logs from your pod and figure out why its not writing the file. If its NOT the correct check, look at the yaml you used to create your pod/deployment/replicaset whatever and replace that check with something that does make sense.
We are using client-go to create kubernetes jobs and deployments. Today in one of our cluster (kubernetes v1.18.19), I encounter below weird problem.
Pods of kubernetes Job are always stuck in Pending status, without any reasons. kubectl describe pod shows there are no events. Creating Jobs from host (via kubectl) are normal and pods became running eventually.
What surprises me is Creating Deployments is ok, pods get running eventually!! It won't work only for Kubernetes Jobs. Why? How to fix that?? What I can do?? I have taken hours here but got no progress.
kubeconfig by client-go:
Mount from host machine, path: /root/.kube/config
kubectl describe job shows:
Name: unittest
Namespace: default
Selector: controller-uid=f3cec901-c0f4-4098-86d7-f9a7d1fe6cd1
Labels: job-id=unittest
Annotations: <none>
Parallelism: 1
Completions: 1
Start Time: Sat, 19 Jun 2021 00:20:12 +0800
Pods Statuses: 1 Running / 0 Succeeded / 0 Failed
Pod Template:
Labels: controller-uid=f3cec901-c0f4-4098-86d7-f9a7d1fe6cd1
job-name=unittest
Containers:
unittest:
Image: ubuntu:18.04
Port: <none>
Host Port: <none>
Command:
echo hello
Environment: <none>
Mounts: <none>
Volumes: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 21m job-controller Created pod: unittest-tt5b2
Kubectl describe on target pod shows:
Name: unittest-tt5b2
Namespace: default
Priority: 0
Node: <none>
Labels: controller-uid=f3cec901-c0f4-4098-86d7-f9a7d1fe6cd1
job-name=unittest
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: Job/unittest
Containers:
unittest:
Image: ubuntu:18.04
Port: <none>
Host Port: <none>
Command:
echo hello
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-72g27 (ro)
Volumes:
default-token-72g27:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-72g27
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events: <none>
kubectl get events shows:
55m Normal ScalingReplicaSet deployment/job-scheduler Scaled up replica set job-scheduler-76b7465d74 to 1
19m Normal ScalingReplicaSet deployment/job-scheduler Scaled up replica set job-scheduler-74f8896f48 to 1
58m Normal SuccessfulCreate job/unittest Created pod: unittest-pp665
49m Normal SuccessfulCreate job/unittest Created pod: unittest-xm6ck
17m Normal SuccessfulCreate job/unittest Created pod: unittest-tt5b2
I fixed the issue.
We use a custom scheduler for NPU devices and default scheduler for GPU devices. For GPU devices, the scheduler name is "default-scheduler" other than "default". I passed "default" for those kube Jobs, this causes the pods to stuck in pending.
GKE had an outage about 2 days ago in their London datacentre (https://status.cloud.google.com/incident/compute/20013), since which time one of my nodes has been acting up. I've had to manually terminate a number of pods running on it and I'm having issues with a couple of sites, I assume due to their liveness checks failing temporarily which might have something to do with the below error in gke-metrics-agent?
Looking at the system pods I can see one instance of gke-metrics-agent is stuck in a terminating state and has been since last night:
kubectl get pods -n kube-system
reports:
...
gke-metrics-agent-k47g8 0/1 Terminating 0 32d
gke-metrics-agent-knr9h 1/1 Running 0 31h
gke-metrics-agent-vqkpw 1/1 Running 0 32d
...
I've looked at the describe output for the pod but can't see anything that helps me understand what it needs done:
kubectl describe pod gke-metrics-agent-k47g8 -n kube-system
Name: gke-metrics-agent-k47g8
Namespace: kube-system
Priority: 2000001000
Priority Class Name: system-node-critical
Node: <node-name>/<IP>
Start Time: Mon, 09 Nov 2020 03:41:14 +0000
Labels: component=gke-metrics-agent
controller-revision-hash=f8c5b8bfb
k8s-app=gke-metrics-agent
pod-template-generation=4
Annotations: components.gke.io/component-name: gke-metrics-agent
components.gke.io/component-version: 0.27.1
configHash: <config-hash>
Status: Terminating (lasts 15h)
Termination Grace Period: 30s
IP: <IP>
IPs:
IP: <IP>
Controlled By: DaemonSet/gke-metrics-agent
Containers:
gke-metrics-agent:
Container ID: docker://<id>
Image: gcr.io/gke-release/gke-metrics-agent:0.1.3-gke.0
Image ID: docker-pullable://gcr.io/gke-release/gke-metrics-agent#sha256:<hash>
Port: <none>
Host Port: <none>
Command:
/otelsvc
--config=/conf/gke-metrics-agent-config.yaml
--metrics-level=NONE
State: Terminated
Reason: Completed
Exit Code: 0
Started: Mon, 09 Nov 2020 03:41:17 +0000
Finished: Thu, 10 Dec 2020 21:16:50 +0000
Ready: False
Restart Count: 0
Limits:
memory: 50Mi
Requests:
cpu: 3m
memory: 50Mi
Environment:
NODE_NAME: (v1:spec.nodeName)
POD_NAME: gke-metrics-agent-k47g8 (v1:metadata.name)
POD_NAMESPACE: kube-system (v1:metadata.namespace)
KUBELET_HOST: 127.0.0.1
ARG1: ${1}
ARG2: ${2}
Mounts:
/conf from gke-metrics-agent-config-vol (rw)
/etc/ssl/certs from ssl-certs (ro)
/var/run/secrets/kubernetes.io/serviceaccount from gke-metrics-agent-token-cn6ss (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
gke-metrics-agent-config-vol:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: gke-metrics-agent-conf
Optional: false
ssl-certs:
Type: HostPath (bare host directory volume)
Path: /etc/ssl/certs
HostPathType:
gke-metrics-agent-token-cn6ss:
Type: Secret (a volume populated by a Secret)
SecretName: gke-metrics-agent-token-cn6ss
Optional: false
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: :NoExecute
:NoSchedule
components.gke.io/gke-managed-components
node.kubernetes.io/disk-pressure:NoSchedule
node.kubernetes.io/memory-pressure:NoSchedule
node.kubernetes.io/network-unavailable:NoSchedule
node.kubernetes.io/not-ready:NoExecute
node.kubernetes.io/pid-pressure:NoSchedule
node.kubernetes.io/unreachable:NoExecute
node.kubernetes.io/unschedulable:NoSchedule
Events: <none>
I'm not used to having to work on the system pods, in the past my experience troubleshooting issues often falls back on force deleting them when all else fails:
kubectl delete pod <pod-name> -n <ns> --grace-period=0 --force
My concern is I don't fully understand what this might do for a system pod and was hoping someone with expertise could advise on a sensible way forward?
I'm also looking at draining this node so Kubernetes can rebuild a new one. Would this potentially be the easiest way to go?
Following up on this I found the pod that was experiencing the issues with gke-metrics-agent became even less stable as the day went on.
I, therefore, had to drain it. The resources it was running are now on new nodes which are working as expected and all system pods are running as expected (including gke-metrics-agent).
Prior to draining this node I ensured, Pod Disruption Budgets were in place as a number of services run on 1 or 2 instances:
https://kubernetes.io/docs/tasks/run-application/configure-pdb/
This meant I could run:
kubectl drain <node-name>
The deployments then ensured they had enough live pods prior to the bad node being taken offline and seems to have avoided any downtime.
I'm working on a local kubernetes installation with three nodes. They are installed via geerlingguy/kubernetes Ansible role (with default settings). I've recreated the whole VMs multiple times. I try to follow the Kubernetes tutorials on https://kubernetes.io/docs/tutorials/kubernetes-basics/explore/explore-interactive/ to get services up and running inside the cluster and try to reach them now.
# kubectl get nodes
NAME STATUS ROLES AGE VERSION
enceladus Ready <none> 162m v1.17.9
mimas Ready <none> 162m v1.17.9
titan Ready master 162m v1.17.9
I tried it with the 1.17.9 or 1.18.6, I tried it with https://github.com/geerlingguy/ansible-role-kubernetes and https://github.com/kubernetes-sigs/kubespray on fresh Debian-Buster VMs. I tried it with Flannel and Calico network plugin. There is no a firewall configured.
I can deploy the kubernetes-bootcamp and exec into it, but when I try to reach the pod via kubectl proxy and curl I'm getting an error.
# kubectl create deployment kubernetes-bootcamp --image=gcr.io/google-samples/kubernetes-bootcamp:v1
# kubectl describe pods
Name: kubernetes-bootcamp-69fbc6f4cf-nq4tj
Namespace: default
Priority: 0
Node: enceladus/192.168.10.12
Start Time: Thu, 06 Aug 2020 10:53:34 +0200
Labels: app=kubernetes-bootcamp
pod-template-hash=69fbc6f4cf
Annotations: <none>
Status: Running
IP: 10.244.1.4
IPs:
IP: 10.244.1.4
Controlled By: ReplicaSet/kubernetes-bootcamp-69fbc6f4cf
Containers:
kubernetes-bootcamp:
Container ID: docker://77eae93ca1e6b574ef7b0623844374a5b2f3054075025492b708b23fc3474a45
Image: gcr.io/google-samples/kubernetes-bootcamp:v1
Image ID: docker-pullable://gcr.io/google-samples/kubernetes-bootcamp#sha256:0d6b8ee63bb57c5f5b6156f446b3bc3b3c143d233037f3a2f00e279c8fcc64af
Port: <none>
Host Port: <none>
State: Running
Started: Thu, 06 Aug 2020 10:53:35 +0200
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-kkcvk (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
default-token-kkcvk:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-kkcvk
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 10s default-scheduler Successfully assigned default/kubernetes-bootcamp-69fbc6f4cf-nq4tj to enceladus
Normal Pulled 9s kubelet, enceladus Container image "gcr.io/google-samples/kubernetes-bootcamp:v1" already present on machine
Normal Created 9s kubelet, enceladus Created container kubernetes-bootcamp
Normal Started 9s kubelet, enceladus Started container kubernetes-bootcamp
Update service list
# kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 4d20h
I can exec curl inside the deployment. It is running.
# kubectl exec -ti kubernetes-bootcamp-69fbc6f4cf-nq4tj curl http://localhost:8080/
Hello Kubernetes bootcamp! | Running on: kubernetes-bootcamp-69fbc6f4cf-nq4tj | v=1
But, when I try to curl from master node the response is not good:
curl http://localhost:8001/api/v1/namespaces/default/pods/kubernetes-bootcamp-69fbc6f4cf-nq4tj/proxy/
Error trying to reach service: 'dial tcp 10.244.1.4:80: i/o timeout'
The curl itself needs ca. 30sec to return. The version etc. is available. The proxy is running fine.
# curl http://localhost:8001/version
{
"major": "1",
"minor": "17",
"gitVersion": "v1.17.9",
"gitCommit": "4fb7ed12476d57b8437ada90b4f93b17ffaeed99",
"gitTreeState": "clean",
"buildDate": "2020-07-15T16:10:45Z",
"goVersion": "go1.13.9",
"compiler": "gc",
"platform": "linux/amd64"
}
The tutorial shows on kubectl describe pods that the container has open ports (in my case it's <none>):
Port: 8080/TCP
Host Port: 0/TCP
Ok, I than created an apply-file bootcamp.yml
apiVersion: apps/v1
kind: Deployment
metadata:
name: kubernetes-bootcamp
spec:
replicas: 1
selector:
matchLabels:
app: kubernetes-bootcamp
template:
metadata:
labels:
app: kubernetes-bootcamp
spec:
containers:
- name: kubernetes-bootcamp
image: gcr.io/google-samples/kubernetes-bootcamp:v1
ports:
- containerPort: 8080
protocol: TCP
I removed the previous deployment
# kubectl delete deployments.apps kubernetes-bootcamp --force
# kubectl apply -f bootcamp.yaml
But after that I'm getting still the same i/o timeout on the new deployment.
So, what is my problem?
So I've been playing around with Minkube.
I've managed to deploy a simple python flask container:
PS C:\Users\Will> kubectl run test-flask-deploy --image
192.168.1.201:5000/test_flask:1
deployment "test-flask-deploy" created
I've also then managed to expose the deployment as a service:
PS C:\Users\Will> kubectl expose deployment/test-flask-deploy --
type="NodePort" --port 8080
service "test-flask-deploy" exposed
In the dashboard I can see that the service has a Cluster IP:
10.0.0.132.
I access the dashboard on a 192.168.xxx.xxx address, so I'm hoping I can expose the service on that external IP.
Any idea how I go about this?
A separate and slightly less important question: I've got minikube talking to a docker registry on my network. If i deploy an image (which has not yet been pulled local to the minikube) the deployment fails, yet when I run the docker pull command on minikube locally, the deployment then succeeds. So minikube is able to pull docker images, but when I deploy an image which is accessible via the registry, yet not pulled locally, it fails. Any thoughts?
EDIT: More detail in response to comment:
PS C:\Users\Will> kubectl describe pod test-flask-deploy
Name: test-flask-deploy-1049547027-rgf7d
Namespace: default
Node: minikube/192.168.99.100
Start Time: Sat, 07 Oct 2017 10:19:58 +0100
Labels: pod-template-hash=1049547027
run=test-flask-deploy
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"default","name":"test-flask-deploy-1049547027","uid":"b06a14b8-ab40-11e7-9714-080...
Status: Running
IP: 172.17.0.4
Created By: ReplicaSet/test-flask-deploy-1049547027
Controlled By: ReplicaSet/test-flask-deploy-1049547027
Containers:
test-flask-deploy:
Container ID: docker://577e339ce680bc5dd9388293f1f1ea62be59a6acc25be22889310761222c760f
Image: 192.168.1.201:5000/test_flask:1
Image ID: docker-pullable://192.168.1.201:5000/test_flask#sha256:d303ed635888394f69223cc0a66c5778444fd3636dfcde42295fd512be948898
Port: <none>
State: Running
Started: Sat, 07 Oct 2017 10:19:59 +0100
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-5rrpm (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
default-token-5rrpm:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-5rrpm
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: <none>
Events: <none>
First, check the nodeport that is assigned to your service:
$ kubectl get svc test-flask-deploy
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
test-flask-deploy 10.0.0.76 <nodes> 8080:30341/TCP 4m
Now you should be able to access it on 192.168.xxxx:30341 or whatever your minikubeIP:nodeport is.