kubernetes pending pod priority - kubernetes

I have the following pods on my kubernetes (1.18.3) cluster:
NAME READY STATUS RESTARTS AGE
pod1 1/1 Running 0 14m
pod2 1/1 Running 0 14m
pod3 0/1 Pending 0 14m
pod4 0/1 Pending 0 14m
pod3 and pod4 cannot start because the node has capacity for 2 pods only. When pod1 finishes and quits, then the scheduler picks either pod3 or pod4 and starts it. So far so good.
However, I also have a high priority pod (hpod) that I'd like to start before pod3 or pod4 when either of the running pods finishes and quits.
So I created a priorityclass can be found in the kubernetes docs:
kind: PriorityClass
metadata:
name: high-priority-no-preemption
value: 1000000
preemptionPolicy: Never
globalDefault: false
description: "This priority class should be used for XYZ service pods only."
I've created the following pod yaml:
apiVersion: v1
kind: Pod
metadata:
name: hpod
labels:
app: hpod
spec:
containers:
- name: hpod
image: ...
resources:
requests:
cpu: "500m"
memory: "500Mi"
limits:
cpu: "500m"
memory: "500Mi"
priorityClassName: high-priority-no-preemption
Now the problem is that when I start the high prio pod with kubectl apply -f hpod.yaml, then the scheduler terminates a running pod to allow the high priority pod to start despite I've set 'preemptionPolicy: Never'.
The expected behaviour would be to postpone starting hpod until a currently running pod finishes. And when it does, then let hpod start before pod3 or pod4.
What am I doing wrong?

Prerequisites:
This solution was tested on Kubernetes v1.18.3, docker 19.03 and Ubuntu 18.
Also text editor is required (i.e. sudo apt-get install vim).
In Kubernetes documentation under How to disable preemption you can find Note:
Note: In Kubernetes 1.15 and later, if the feature NonPreemptingPriority is enabled, PriorityClasses have the option to set preemptionPolicy: Never. This will prevent pods of that PriorityClass from preempting other pods.
Also under Non-preempting PriorityClass you have information:
The use of the PreemptionPolicy field requires the NonPreemptingPriority feature gate to be enabled.
Later if you will check thoses Feature Gates info, you will find that NonPreemptingPriority is false, so as default it's disabled.
Output with your current configuration:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-normal 1/1 Running 0 32s
nginx-normal-2 1/1 Running 0 32s
$ kubectl apply -f prio.yaml
pod/nginx-priority created$ kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-normal-2 1/1 Running 0 48s
nginx-priority 1/1 Running 0 8s
To enable preemptionPolicy: Never you need to apply --feature-gates=NonPreemptingPriority=true to 3 files:
/etc/kubernetes/manifests/kube-apiserver.yaml
/etc/kubernetes/manifests/kube-controller-manager.yaml
/etc/kubernetes/manifests/kube-scheduler.yaml
To check if this feature-gate is enabled you can check by using commands:
ps aux | grep apiserver | grep feature-gates
ps aux | grep scheduler | grep feature-gates
ps aux | grep controller-manager | grep feature-gates
For quite detailed information, why you have to edit thoses files please check this Github thread.
$ sudo su
# cd /etc/kubernetes/manifests/
# ls
etcd.yaml kube-apiserver.yaml kube-controller-manager.yaml kube-scheduler.yaml
Use your text editor to add feature gate to those files
# vi kube-apiserver.yaml
and add - --feature-gates=NonPreemptingPriority=true under spec.containers.command like in example bellow:
spec:
containers:
- command:
- kube-apiserver
- --feature-gates=NonPreemptingPriority=true
- --advertise-address=10.154.0.31
And do the same with 2 other files. After that you can check if this flags were applied.
$ ps aux | grep apiserver | grep feature-gates
root 26713 10.4 5.2 565416 402252 ? Ssl 14:50 0:17 kube-apiserver --feature-gates=NonPreemptingPriority=true --advertise-address=10.154.0.31
Now you have redeploy your PriorityClass.
$ kubectl get priorityclass
NAME VALUE GLOBAL-DEFAULT AGE
high-priority-no-preemption 1000000 false 12m
system-cluster-critical 2000000000 false 23m
system-node-critical 2000001000 false 23m
$ kubectl delete priorityclass high-priority-no-preemption
priorityclass.scheduling.k8s.io "high-priority-no-preemption" deleted
$ kubectl apply -f class.yaml
priorityclass.scheduling.k8s.io/high-priority-no-preemption created
Last step is to deploy pod with this PriorityClass.
TEST
$ kubectl get po
NAME READY STATUS RESTARTS AGE
nginx-normal 1/1 Running 0 4m4s
nginx-normal-2 1/1 Running 0 18m
$ kubectl apply -f prio.yaml
pod/nginx-priority created
$ kubectl get po
NAME READY STATUS RESTARTS AGE
nginx-normal 1/1 Running 0 5m17s
nginx-normal-2 1/1 Running 0 20m
nginx-priority 0/1 Pending 0 67s
$ kubectl delete po nginx-normal-2
pod "nginx-normal-2" deleted
$ kubectl get po
NAME READY STATUS RESTARTS AGE
nginx-normal 1/1 Running 0 5m55s
nginx-priority 1/1 Running 0 105s

Related

Kuberenetes Available schedulars

How would I display available schedulers in my cluster in order to use non default one using the schedulerName field?
Any link to a document describing how to "install" and use a custom scheduler is highly appreciated :)
Thx in advance
Schedulers can be found among your kube-system pods. You can then filter the output to your needs with kube-scheduler as the search key:
➜ ~ kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-6955765f44-9wfkp 0/1 Completed 15 264d
coredns-6955765f44-jmz9j 1/1 Running 16 264d
etcd-acid-fuji 1/1 Running 17 264d
kube-apiserver-acid-fuji 1/1 Running 6 36d
kube-controller-manager-acid-fuji 1/1 Running 21 264d
kube-proxy-hs2qb 1/1 Running 0 177d
kube-scheduler-acid-fuji 1/1 Running 21 264d
You can retrieve the yaml file with:
➜ ~ kubectl get pods -n kube-system <scheduler pod name> -oyaml
If you bootstrapped your cluster with Kubeadm you may also find the yaml files in the /etc/kubernetes/manifests:
➜ manifests sudo cat /etc/kubernetes/manifests/kube-scheduler.yaml
---
apiVersion: v1
kind: Pod
metadata:
creationTimestamp: null
labels:
component: kube-scheduler
tier: control-plane
name: kube-scheduler
namespace: kube-system
spec:
containers:
- command:
- kube-scheduler
- --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
- --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
- --bind-address=127.0.0.1
- --kubeconfig=/etc/kubernetes/scheduler.conf
- --leader-elect=true
image: k8s.gcr.io/kube-scheduler:v1.17.6
imagePullPolicy: IfNotPresent
---------
The location for minikube is similar but you do have to login in the minikube's virtual machine first with minikube ssh.
For more reading please have a look how to configure multiple schedulers and how to write custom schedulers.
You can try this one:
kubectl get pods --all-namespaces | grep scheduler

Debugging kubernetes pods in Completed status forever but not ready

Would there be any reason if I'm trying to run a pod on a k8s cluster which stays in the 'Completed' status forever but is never in ready status 'Ready 0/1' ... although there are core-dns , kube-proxy pods,etc running successfully scheduled under each node in the nodepool assigned to a k8s cluster ... all the worker nodes seems to be in healthy state
This sounds like the pod lifecycle has ended and the reasons is probably because your pod have finished the task it is meant for.
Something like next example will not do anything, it will be created successfully then started, will pull the image and then it will be marked as completed.
apiVersion: v1
kind: Pod
metadata:
name: my-pod
spec:
containers:
- name: test-pod
image: busybox
resources:
Here is how it will look:
NAMESPACE NAME READY STATUS RESTARTS AGE
default my-pod 0/1 Completed 1 3s
default testapp-1 0/1 Pending 0 92m
default web-0 1/1 Running 0 3h2m
kube-system event-exporter-v0.2.5-7df89f4b8f-mc7kt 2/2 Running 0 7d19h
kube-system fluentd-gcp-scaler-54ccb89d5-9pp8z 1/1 Running 0 7d19h
kube-system fluentd-gcp-v3.1.1-gdr9l 2/2 Running 0 123m
kube-system fluentd-gcp-v3.1.1-pt8zp 2/2 Running 0 3h2m
kube-system fluentd-gcp-v3.1.1-xwhkn 2/2 Running 5 172m
kube-system fluentd-gcp-v3.1.1-zh29w 2/2 Running 0 7d19h
For this case, I will recommend you to check your yaml, check what kind of pod you are running? and what is it meant for?
Even, and with test purposes, you can add a command argument to keep it running.
command: ["/bin/sh","-c"]
args: ["command one; command two && command three"]
something like:
args: ["-c", "while true; do echo hello; sleep 10;done"]
The yaml with commands added would look like this:
apiVersion: v1
kind: Pod
metadata:
name: my-pod
spec:
containers:
- name: test-pod
image: busybox
resources:
command: ["/bin/sh","-c"]
args: ["-c", "while true; do echo hello; sleep 10;done"]
Here is how it will look:
NAMESPACE NAME READY STATUS RESTARTS AGE
default my-pod 1/1 Running 0 8m56s
default testapp-1 0/1 Pending 0 90m
default web-0 1/1 Running 0 3h
kube-system event-exporter-v0.2.5-7df89f4b8f-mc7kt 2/2 Running 0 7d19h
kube-system fluentd-gcp-scaler-54ccb89d5-9pp8z 1/1 Running 0 7d19h
kube-system fluentd-gcp-v3.1.1-gdr9l 2/2 Running 0 122m
Another thing that will help is a kubectl describe pod $POD_NAME, to analyze this further.
This issue got nothing to do with k8s cluster or setup. It's mostly with your Dockerfile.
For the pod to be in running state, you need to have the container as an executable, for that usually an Entrypoint that specifies the instructions would solve the problem you are facing.
A better approach to resolve this or debug this particular issue would be, by running the respective docker container on a localmachine before deploying that onto k8s cluster.
The message Completed, most likely implies that the k8s started your container, then the container subsequently exited.
In Dockerfile, the ENTRYPOINT defines the process with pid 1 which the docker container should hold, otherwise it terminates, so whatever the pod is trying to do, you need to check whether the ENTRYPOINT command is looping, and doesn't die.
For side notes, Kubernetes will try to restart the pod, once it terminates but after few tries the pod status may turn into CrashLoopBackOff and you will see message something similar to Back-off restarting failed container.

Kubernetes coredns pods stuck in Pending status. Cannot start the dashboard

I am building a Kubernetes cluster following this tutorial, and I have troubles to access the Kubernetes dashboard. I already created another question about it that you can see here, but while digging up into my cluster, I think that the problem might be somewhere else and that's why I create a new question.
I start my master, by running the following commands:
> kubeadm reset
> kubeadm init --apiserver-advertise-address=[MASTER_IP] > file.txt
> tail -2 file.txt > join.sh # I keep this file for later
> kubectl apply -f https://git.io/weave-kube/
> kubectl -n kube-system get pod
NAME READY STATUS RESTARTS AGE
coredns-fb8b8dccf-kb2zq 0/1 Pending 0 2m46s
coredns-fb8b8dccf-nnc5n 0/1 Pending 0 2m46s
etcd-kubemaster 1/1 Running 0 93s
kube-apiserver-kubemaster 1/1 Running 0 93s
kube-controller-manager-kubemaster 1/1 Running 0 113s
kube-proxy-lxhvs 1/1 Running 0 2m46s
kube-scheduler-kubemaster 1/1 Running 0 93s
Here we can see that I have two coredns pods stuck in Pending state forever, and when I run the command :
> kubectl -n kube-system describe pod coredns-fb8b8dccf-kb2zq
I can see in the Events part the following Warning :
Failed Scheduling : 0/1 nodes are available 1 node(s) had taints that the pod didn't tolerate.
Since it is a Warning and not and Error, and that as a Kubernetes newbie, taints does not mean much to me, I tried to connect a node to the master (using the previously saved command) :
> cat join.sh
kubeadm join [MASTER_IP]:6443 --token [TOKEN] \
--discovery-token-ca-cert-hash sha256:[ANOTHER_TOKEN]
> ssh [USER]#[WORKER_IP] 'bash' < join.sh
This node has joined the cluster.
On the master, I check that the node is connected:
> kubectl get nodes
NAME STATUS ROLES AGE VERSION
kubemaster NotReady master 13m v1.14.1
kubeslave1 NotReady <none> 31s v1.14.1
And I check my pods :
> kubectl -n kube-system get pod
NAME READY STATUS RESTARTS AGE
coredns-fb8b8dccf-kb2zq 0/1 Pending 0 14m
coredns-fb8b8dccf-nnc5n 0/1 Pending 0 14m
etcd-kubemaster 1/1 Running 0 13m
kube-apiserver-kubemaster 1/1 Running 0 13m
kube-controller-manager-kubemaster 1/1 Running 0 13m
kube-proxy-lxhvs 1/1 Running 0 14m
kube-proxy-xllx4 0/1 ContainerCreating 0 2m16s
kube-scheduler-kubemaster 1/1 Running 0 13m
We can see that another kube-proxy pod have been created and is stuck in ContainerCreating status.
And when I am doing a describe again :
kubectl -n kube-system describe pod kube-proxy-xllx4
I can see in the Events part multiple identical Warnings :
Failed create pod sandbox : rpx error: code = Unknown desc = failed pulling image "k8s.gcr.io/pause:3.1": Get https://k8s.gcr.io/v1/_ping: dial tcp: lookup k8s.gcr.io on [::1]:53 read up [::1]43133->[::1]:53: read: connection refused
Here are my repositories :
docker image ls
REPOSITORY TAG
k8s.gcr.io/kube-proxy v1.14.1
k8s.gcr.io/kube-apiserver v1.14.1
k8s.gcr.io/kube-controller-manager v1.14.1
k8s.gcr.io/kube-scheduler v1.14.1
k8s.gcr.io/coredns 1.3.1
k8s.gcr.io/etcd 3.3.10
k8s.gcr.io/pause 3.1
And so, for the dashboard part, I tried to start it with the command
> kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/master/aio/deploy/recommended/kubernetes-dashboard.yaml
But the dashboard pod is stuck in Pending state.
kubectl -n kube-system get pod
NAME READY STATUS RESTARTS AGE
coredns-fb8b8dccf-kb2zq 0/1 Pending 0 40m
coredns-fb8b8dccf-nnc5n 0/1 Pending 0 40m
etcd-kubemaster 1/1 Running 0 38m
kube-apiserver-kubemaster 1/1 Running 0 38m
kube-controller-manager-kubemaster 1/1 Running 0 39m
kube-proxy-lxhvs 1/1 Running 0 40m
kube-proxy-xllx4 0/1 ContainerCreating 0 27m
kube-scheduler-kubemaster 1/1 Running 0 38m
kubernetes-dashboard-5f7b999d65-qn8qn 1/1 Pending 0 8s
So, event though my problem originaly was that I cannot access to my dashboard, I guess that the real problem is deeper thant that.
I know that I just put a lot of information here, but I am a k8s beginner and I am completely lost on this.
There is an issue I experienced with coredns pods stuck in a pending mode when setting up your own cluster; which I resolve by adding pod network.
Looks like because there is no Network Addon installed, the nodes are taint as not-ready. Installing the Addon would remove the taints and the Pods will be able to schedule. In my case adding flannel fixed the issue.
EDIT: There is a note about this in the official k8s documentation - Create cluster with kubeadm:
The network must be deployed before any applications. Also, CoreDNS
will not start up before a network is installed. kubeadm only
supports Container Network Interface (CNI) based networks (and does
not support kubenet).
Actually it is the opposite of a deep or serious issue. This is a trivial issue. Always you see a pod stuck on Pending state, it means the scheduler is having a hard time to schedule the pod; mostly because there are no enough resources on the node.
In your case it is a taint that has the node, and your pod doesn't have the toleration. What you have to do is to describe the node and get the taint:
kubectl describe node | grep -i taints
Note: you might have more then one taint. So you might want to do kubectl describe no NODE since with grep you will only see one taint.
Once you get the taint, that will be something like hello=world:NoSchedule; which means key=value:effect, you will have to add a toleration section in your Deployment. This is an example Deployment so you can see how it should look like:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: nginx
labels:
app: nginx
spec:
replicas: 10
strategy:
type: Recreate
template:
metadata:
labels:
app: nginx
spec:
containers:
- image: nginx
name: nginx
ports:
- containerPort: 80
name: http
tolerations:
- effect: NoExecute #NoSchedule, PreferNoSchedule
key: node
operator: Equal
value: not-ready
tolerationSeconds: 3600
As you can see there is the toleration section in the yaml. So, if I would have a node with node=not-ready:NoExecute taint, no pod would be able to be scheduled on that node, unless would have this toleration.
Also you can remove the taint, if you don need it. To remove a taint you would describe the node, get the key of the taint and do:
kubectl taint node NODE key-
Hope it makes sense. Just add this section to your deployment, and it will work.
Set up the flannel network tool.
Running commands:
$ sysctl net.bridge.bridge-nf-call-iptables=1
$ kubectl apply -f
https://raw.githubusercontent.com/coreos/flannel/62e44c867a2846fefb68bd5f178daf4da3095ccb/Documentation/kube-flannel.yml

Error from server (NotFound): podmetrics.metrics.k8s.io "mem-example/memory-demo" not found

I am following this tutorial: https://kubernetes.io/docs/tasks/configure-pod-container/assign-memory-resource/
I have created the memory pod demo and I am trying to get the metrics from the pod but it is not working.
I installed the metrics server by cloning: https://github.com/kubernetes-incubator/metrics-server
And then running this command from top level:
kubectl create -f deploy/1.8+/
I am using kubernetes version 1.10.11.
The pod is definitely created:
λ kubectl get pod memory-demo --namespace=mem-example
NAME READY STATUS RESTARTS AGE
memory-demo 1/1 Running 0 6m
But the metics command does not work and gives an error:
λ kubectl top pod memory-demo --namespace=mem-example
Error from server (NotFound): podmetrics.metrics.k8s.io "mem-example/memory-demo" not found
What did I do wrong?
There are some patches to be done to metrics server deployment to get the metrics working.
Follow the below steps
kubectl delete -f deploy/1.8+/
wait till the metrics server gets undeployed
run the below command
kubectl create -f https://raw.githubusercontent.com/epasham/docker-repo/master/k8s/metrics-server.yaml
master $ kubectl get po -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-78fcdf6894-6zg78 1/1 Running 0 2h
coredns-78fcdf6894-gk4sb 1/1 Running 0 2h
etcd-master 1/1 Running 0 2h
kube-apiserver-master 1/1 Running 0 2h
kube-controller-manager-master 1/1 Running 0 2h
kube-proxy-f5z9p 1/1 Running 0 2h
kube-proxy-ghbvn 1/1 Running 0 2h
kube-scheduler-master 1/1 Running 0 2h
metrics-server-85c54d44c8-rmvxh 2/2 Running 0 1m
weave-net-4j7cl 2/2 Running 1 2h
weave-net-82fzn 2/2 Running 1 2h
master $ kubectl top pod -n kube-system
NAME CPU(cores) MEMORY(bytes)
coredns-78fcdf6894-6zg78 2m 11Mi
coredns-78fcdf6894-gk4sb 2m 9Mi
etcd-master 14m 90Mi
kube-apiserver-master 24m 425Mi
kube-controller-manager-master 26m 62Mi
kube-proxy-f5z9p 2m 19Mi
kube-proxy-ghbvn 3m 17Mi
kube-scheduler-master 8m 14Mi
metrics-server-85c54d44c8-rmvxh 1m 19Mi
weave-net-4j7cl 2m 59Mi
weave-net-82fzn 1m 60Mi
Check and verify the below lines in metrics server deployment manifest.
command:
- /metrics-server
- --metric-resolution=30s
- --kubelet-preferred-address-types=InternalIP
- --kubelet-insecure-tls
On Minikube, I had to wait for 20-25 minutes after enabling the metrics-server addon. I was getting the same error for 20-25 minutes but later I could see the output without attempting for any solution.
I faced the similar issue of
Error from server (NotFound): podmetrics.metrics.k8s.io "default/apple-app" not found
I followed two steps and I was able to resolve the issue.
Download the latest customized components.yaml, which is their official file used for easy deployment.
Update the change
# - /metrics-server
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP
to the command section of the deployment specification. I have commented the first line because it is the entrypoint of the image used by kubernetes metrics-server.
$ docker image inspect k8s.gcr.io/metrics-server-amd64:v0.3.6 -f {{.ContainerConfig.Entrypoint}}
[/metrics-server]
Even If you use it or not, it doesn't matter.
Note: You have to wait for few seconds for it to properly work.
After this running the top command will work for you.
$ kubectl top pod apple-app
NAME CPU(cores) MEMORY(bytes)
apple-app 1m 3Mi
I know this is an old thread may be someone will find this answer useful.
You have to checkout the following repo:
https://github.com/kubernetes-incubator/metrics-server
Go to the root of the repo and checkout release-0.3.2.
Remove default metrics server by:
kubectl delete -f deploy/1.8+/
Download the container yaml
wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml
Edit the container.yaml by adding the following lines to the argument section. You will see these two lines there
args:
- --kubelet-preferred-address-types=InternalIP
- --kubelet-insecure-tls=true
There is only one args parameter in that file.
Deploy your pod/deployment and you should be able to do:
kubectl top pod <pod-name>

no endpoints available for service \"kubernetes-dashboard\"

I'm trying to follow GitHub - kubernetes/dashboard: General-purpose web UI for Kubernetes clusters.
deploy/access:
# export KUBECONFIG=/etc/kubernetes/admin.conf
# kubectl create -f https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml
secret/kubernetes-dashboard-certs created
serviceaccount/kubernetes-dashboard created
role.rbac.authorization.k8s.io/kubernetes-dashboard-minimal created
rolebinding.rbac.authorization.k8s.io/kubernetes-dashboard-minimal created
deployment.apps/kubernetes-dashboard created
service/kubernetes-dashboard created
# kubectl proxy
Starting to serve on 127.0.0.1:8001
curl:
# curl http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/
{
"kind": "Status",
"apiVersion": "v1",
"metadata": {
},
"status": "Failure",
"message": "no endpoints available for service \"kubernetes-dashboard\"",
"reason": "ServiceUnavailable",
"code": 503
}#
Please advise.
per #VKR
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-576cbf47c7-56vg7 0/1 ContainerCreating 0 57m
kube-system coredns-576cbf47c7-sn2fk 0/1 ContainerCreating 0 57m
kube-system etcd-wcmisdlin02.uftwf.local 1/1 Running 0 56m
kube-system kube-apiserver-wcmisdlin02.uftwf.local 1/1 Running 0 56m
kube-system kube-controller-manager-wcmisdlin02.uftwf.local 1/1 Running 0 56m
kube-system kube-proxy-2hhf7 1/1 Running 0 6m57s
kube-system kube-proxy-lzfcx 1/1 Running 0 7m35s
kube-system kube-proxy-rndhm 1/1 Running 0 57m
kube-system kube-scheduler-wcmisdlin02.uftwf.local 1/1 Running 0 56m
kube-system kubernetes-dashboard-77fd78f978-g2hts 0/1 Pending 0 2m38s
$
logs:
$ kubectl logs kubernetes-dashboard-77fd78f978-g2hts -n kube-system
$
describe:
$ kubectl describe pod kubernetes-dashboard-77fd78f978-g2hts -n kube-system
Name: kubernetes-dashboard-77fd78f978-g2hts
Namespace: kube-system
Priority: 0
PriorityClassName: <none>
Node: <none>
Labels: k8s-app=kubernetes-dashboard
pod-template-hash=77fd78f978
Annotations: <none>
Status: Pending
IP:
Controlled By: ReplicaSet/kubernetes-dashboard-77fd78f978
Containers:
kubernetes-dashboard:
Image: k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.0
Port: 8443/TCP
Host Port: 0/TCP
Args:
--auto-generate-certificates
Liveness: http-get https://:8443/ delay=30s timeout=30s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/certs from kubernetes-dashboard-certs (rw)
/tmp from tmp-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kubernetes-dashboard-token-gp4l7 (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
kubernetes-dashboard-certs:
Type: Secret (a volume populated by a Secret)
SecretName: kubernetes-dashboard-certs
Optional: false
tmp-volume:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
kubernetes-dashboard-token-gp4l7:
Type: Secret (a volume populated by a Secret)
SecretName: kubernetes-dashboard-token-gp4l7
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 4m39s (x21689 over 20h) default-scheduler 0/3 nodes are available: 3 node(s) had taints that the pod didn't tolerate.
$
It would appear that you are attempting to deploy Kubernetes leveraging kubeadm but have skipped the step of Installing a pod network add-on (CNI). Notice the warning:
The network must be deployed before any applications. Also, CoreDNS will not start up before a network is installed. kubeadm only supports Container Network Interface (CNI) based networks (and does not support kubenet).
Once you do this, the CoreDNS pods should come up healthy. This can be verified with:
kubectl -n kube-system -l=k8s-app=kube-dns get pods
Then the kubernetes-dashboard pod should come up healthy as well.
you could refer to https://github.com/kubernetes/dashboard#getting-started
Also, I see "https" in your link
Please try this link instead
http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/
I had the same problem. In the end it turned out as a Calico Network configuration problem. But step by step...
First I checked if the Dashboard Pod was running:
kubectl get pods --all-namespaces
The result for me was:
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-bcc6f659f-j57l9 1/1 Running 2 19h
kube-system calico-node-hdxp6 0/1 CrashLoopBackOff 13 15h
kube-system calico-node-z6l56 0/1 Running 68 19h
kube-system coredns-74ff55c5b-8l6m6 1/1 Running 2 19h
kube-system coredns-74ff55c5b-v7pkc 1/1 Running 2 19h
kube-system etcd-got-virtualbox 1/1 Running 3 19h
kube-system kube-apiserver-got-virtualbox 1/1 Running 3 19h
kube-system kube-controller-manager-got-virtualbox 1/1 Running 3 19h
kube-system kube-proxy-q99s5 1/1 Running 2 19h
kube-system kube-proxy-vrpcd 1/1 Running 1 15h
kube-system kube-scheduler-got-virtualbox 1/1 Running 2 19h
kubernetes-dashboard dashboard-metrics-scraper-7b59f7d4df-qc9ms 1/1 Running 0 28m
kubernetes-dashboard kubernetes-dashboard-74d688b6bc-zrdk4 0/1 CrashLoopBackOff 9 28m
The last line indicates, that the dashboard pod could not have been started (status=CrashLoopBackOff).
And the 2nd line shows that the calico node has problems. Most likely the root cause is Calico.
Next step is to have a look at the pod log (change namespace / name as listed in YOUR pods list):
kubectl logs kubernetes-dashboard-74d688b6bc-zrdk4 -n kubernetes-dashboard
The result for me was:
2021/03/05 13:01:12 Starting overwatch
2021/03/05 13:01:12 Using namespace: kubernetes-dashboard
2021/03/05 13:01:12 Using in-cluster config to connect to apiserver
2021/03/05 13:01:12 Using secret token for csrf signing
2021/03/05 13:01:12 Initializing csrf token from kubernetes-dashboard-csrf secret
panic: Get https://10.96.0.1:443/api/v1/namespaces/kubernetes-dashboard/secrets/kubernetes-dashboard-csrf: dial tcp 10.96.0.1:443: i/o timeout
Hm - not really helpful. After searching for "dial tcp 10.96.0.1:443: i/o timeout" I found this information, where it says ...
If you follow the kubeadm instructions to the letter ... Which means install docker, kubernetes (kubeadm, kubectl, & kubelet), and calico with the Kubeadm hosted instructions ... and your computer nodes have a physical ip address in the range of 192.168.X.X then you will end up with the above mentioned non-working dashboard. This is because the node ip addresses clash with the internal calico ip addresses.
https://github.com/kubernetes/dashboard/issues/1578#issuecomment-329904648
Yes, in deed I do have a physical IP in the range of 192.168.x.x - like many others might have as well. I wish Calico would check this during setup.
So let's move the pod network to a different IP range:
You should use a classless reserved IP range for Private Networks like
10.0.0.0/8 (16.777.216 addresses)
172.16.0.0/12 (1.048.576 addresses)
192.168.0.0/16 (65.536 addresses). Otherwise Calico will terminate with an error saying "Invalid CIDR specified in CALICO_IPV4POOL_CIDR" ...
sudo kubeadm reset
sudo rm /etc/cni/net.d/10-calico.conflist
sudo rm /etc/cni/net.d/calico-kubeconfig
export CALICO_IPV4POOL_CIDR=172.16.0.0
export MASTER_IP=192.168.100.122
sudo kubeadm init --pod-network-cidr=$CALICO_IPV4POOL_CIDR/12 --apiserver-advertise-address=$MASTER_IP --apiserver-cert-extra-sans=$MASTER_IP
mkdir -p $HOME/.kube
sudo rm -f $HOME/.kube/config
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
sudo chown $(id -u):$(id -g) /etc/kubernetes/kubelet.conf
wget https://docs.projectcalico.org/v3.8/manifests/calico.yaml -O calico.yaml
sudo sed -i "s/192.168.0.0\/16/$CALICO_IPV4POOL_CIDR\/12/g" calico.yaml
sudo sed -i "s/192.168.0.0/$CALICO_IPV4POOL_CIDR/g" calico.yaml
kubectl apply -f calico.yaml
Now we test if all calico pods are running:
kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-bcc6f659f-ns7kz 1/1 Running 0 15m
kube-system calico-node-htvdv 1/1 Running 6 15m
kube-system coredns-74ff55c5b-lqwpd 1/1 Running 0 17m
kube-system coredns-74ff55c5b-qzc87 1/1 Running 0 17m
kube-system etcd-got-virtualbox 1/1 Running 0 17m
kube-system kube-apiserver-got-virtualbox 1/1 Running 0 17m
kube-system kube-controller-manager-got-virtualbox 1/1 Running 0 18m
kube-system kube-proxy-6xr5j 1/1 Running 0 17m
kube-system kube-scheduler-got-virtualbox 1/1 Running 0 17m
Looks good. If not check CALICO_IPV4POOL_CIDR by editing the node config: KUBE_EDITOR="nano" kubectl edit -n kube-system ds calico-node
Let's apply the kubernetes-dashboard and start the proxy:
export KUBECONFIG=$HOME/.kube/config
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0/aio/deploy/recommended.yaml
kubectl proxy
Now I can load http://127.0.0.1:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/
if you are using helm,
check if kubectl proxy is running
then goto
http://localhost:8001/api/v1/namespaces/default/services/https:kubernetes-dashboard:https/proxy
two tips in above link:
use helm to install, the namespaces will be /default (not /kubernetes-dashboard
need add https after /https:kubernetes-dashboard:
better way is
helm delete kubernetes-dashboard
kubectl create namespace kubernetes-dashboard
helm install -n kubernetes-dashboard kubernetes-dashboard kubernetes-dashboard/kubernetes-dashboard
then goto
http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:https/proxy
then you can easily follow creating-sample-user to get token to login
i was facing the same issue, so i followed the official docs and then went to https://github.com/kubernetes/dashboard url, there is another way using helm on this link https://artifacthub.io/packages/helm/k8s-dashboard/kubernetes-dashboard
after installing helm and run this 2 commands
helm repo add kubernetes-dashboard https://kubernetes.github.io/dashboard/
helm install kubernetes-dashboard kubernetes-dashboard/kubernetes-dashboard
it worked but on default namespace on this link
http://localhost:8001/api/v1/namespaces/default/services/https:kubernetes-dashboard:https/proxy/#/workloads?namespace=default