Metrics server is currently unable to handle the request - kubernetes

I am new to Kubernetes and was trying to apply horizontal pod autoscaling to my existing application. After following other Stack Overflow answers I learned that I need to install metrics-server, and I was able to, but somehow it's not working and the server is unable to handle requests.
I then followed a few more suggestions but was unable to resolve the issue. I would really appreciate any help here.
Please let me know if you need any further details from me :) Thanks in advance.
Steps followed:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
serviceaccount/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
service/metrics-server created
deployment.apps/metrics-server created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
kubectl get deploy,svc -n kube-system | egrep metrics-server
deployment.apps/metrics-server 1/1 1 1 2m6s
service/metrics-server ClusterIP 10.32.0.32 <none> 443/TCP 2m6s
kubectl get pods -n kube-system | grep metrics-server
metrics-server-64cf6869bd-6gx88 1/1 Running 0 2m39s
vi ana_hpa.yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: ana-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: common-services-auth
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 160
k apply -f ana_hpa.yaml
horizontalpodautoscaler.autoscaling/ana-hpa created
k get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
ana-hpa StatefulSet/common-services-auth <unknown>/160%, <unknown>/80% 1 10 0 4s
k describe hpa ana-hpa
Name: ana-hpa
Namespace: default
Labels: <none>
Annotations: <none>
CreationTimestamp: Tue, 12 Apr 2022 17:01:25 +0530
Reference: StatefulSet/common-services-auth
Metrics: ( current / target )
resource memory on pods (as a percentage of request): <unknown> / 160%
resource cpu on pods (as a percentage of request): <unknown> / 80%
Min replicas: 1
Max replicas: 10
StatefulSet pods: 3 current / 0 desired
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True SucceededGetScale the HPA controller was able to get the target's current scale
ScalingActive False FailedGetResourceMetric the HPA was unable to compute the replica count: failed to get memory utilization: unable to get metrics for resource memory: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedGetResourceMetric 38s (x8 over 2m23s) horizontal-pod-autoscaler failed to get cpu utilization: unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)
Warning FailedComputeMetricsReplicas 38s (x8 over 2m23s) horizontal-pod-autoscaler invalid metrics (2 invalid out of 2), first error is: failed to get memory utilization: unable to get metrics for resource memory: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)
Warning FailedGetResourceMetric 23s (x9 over 2m23s) horizontal-pod-autoscaler failed to get memory utilization: unable to get metrics for resource memory: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)
kubectl get --raw /apis/metrics.k8s.io/v1beta1
Error from server (ServiceUnavailable): the server is currently unable to handle the request
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"
Error from server (ServiceUnavailable): the server is currently unable to handle the request
kubectl edit deployments.apps -n kube-system metrics-server
Add hostNetwork: true
deployment.apps/metrics-server edited
kubectl get pods -n kube-system | grep metrics-server
metrics-server-5dc6dbdb8-42hw9 1/1 Running 0 10m
k describe pod metrics-server-5dc6dbdb8-42hw9 -n kube-system
Name: metrics-server-5dc6dbdb8-42hw9
Namespace: kube-system
Priority: 2000000000
Priority Class Name: system-cluster-critical
Node: pusntyn196.apac.avaya.com/10.133.85.196
Start Time: Tue, 12 Apr 2022 17:08:25 +0530
Labels: k8s-app=metrics-server
pod-template-hash=5dc6dbdb8
Annotations: <none>
Status: Running
IP: 10.133.85.196
IPs:
IP: 10.133.85.196
Controlled By: ReplicaSet/metrics-server-5dc6dbdb8
Containers:
metrics-server:
Container ID: containerd://024afb1998dce4c0bd5f4e58f996068ea37982bd501b54fda2ef8d5c1098b4f4
Image: k8s.gcr.io/metrics-server/metrics-server:v0.6.1
Image ID: k8s.gcr.io/metrics-server/metrics-server@sha256:5ddc6458eb95f5c70bd13fdab90cbd7d6ad1066e5b528ad1dcb28b76c5fb2f00
Port: 4443/TCP
Host Port: 4443/TCP
Args:
--cert-dir=/tmp
--secure-port=4443
--kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
--kubelet-use-node-status-port
--metric-resolution=15s
State: Running
Started: Tue, 12 Apr 2022 17:08:26 +0530
Ready: True
Restart Count: 0
Requests:
cpu: 100m
memory: 200Mi
Liveness: http-get https://:https/livez delay=0s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get https://:https/readyz delay=20s timeout=1s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/tmp from tmp-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-g6p4g (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
tmp-dir:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
kube-api-access-g6p4g:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 2s
node.kubernetes.io/unreachable:NoExecute op=Exists for 2s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m31s default-scheduler Successfully assigned kube-system/metrics-server-5dc6dbdb8-42hw9 to pusntyn196.apac.avaya.com
Normal Pulled 2m32s kubelet Container image "k8s.gcr.io/metrics-server/metrics-server:v0.6.1" already present on machine
Normal Created 2m31s kubelet Created container metrics-server
Normal Started 2m31s kubelet Started container metrics-server
kubectl get --raw /apis/metrics.k8s.io/v1beta1
Error from server (ServiceUnavailable): the server is currently unable to handle the request
kubectl get pods -n kube-system | grep metrics-server
metrics-server-5dc6dbdb8-42hw9 1/1 Running 0 10m
kubectl logs -f metrics-server-5dc6dbdb8-42hw9 -n kube-system
E0412 11:43:54.684784 1 configmap_cafile_content.go:242] kube-system/extension-apiserver-authentication failed with : missing content for CA bundle "client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
E0412 11:44:27.001010 1 configmap_cafile_content.go:242] key failed with : missing content for CA bundle "client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
k logs -f metrics-server-5dc6dbdb8-42hw9 -n kube-system
I0412 11:38:26.447305 1 serving.go:342] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
I0412 11:38:26.899459 1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0412 11:38:26.899477 1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I0412 11:38:26.899518 1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I0412 11:38:26.899545 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0412 11:38:26.899546 1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I0412 11:38:26.899567 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0412 11:38:26.900480 1 dynamic_serving_content.go:131] "Starting controller" name="serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key"
I0412 11:38:26.900811 1 secure_serving.go:266] Serving securely on [::]:4443
I0412 11:38:26.900854 1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
W0412 11:38:26.900965 1 shared_informer.go:372] The sharedIndexInformer has started, run more than once is not allowed
I0412 11:38:26.999960 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0412 11:38:26.999989 1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController
I0412 11:38:26.999970 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
E0412 11:38:27.000087 1 configmap_cafile_content.go:242] kube-system/extension-apiserver-authentication failed with : missing content for CA bundle "client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
E0412 11:38:27.000118 1 configmap_cafile_content.go:242] key failed with : missing content for CA bundle "client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
kubectl top nodes
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)
kubectl top pods
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods.metrics.k8s.io)
Edit the metrics-server deployment YAML and add:
- --kubelet-insecure-tls
k apply -f metric-server-deployment.yaml
serviceaccount/metrics-server unchanged
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader unchanged
clusterrole.rbac.authorization.k8s.io/system:metrics-server unchanged
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader unchanged
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator unchanged
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server unchanged
service/metrics-server unchanged
deployment.apps/metrics-server configured
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io unchanged
kubectl get pods -n kube-system | grep metrics-server
metrics-server-5dc6dbdb8-42hw9 1/1 Running 0 10m
kubectl top pods
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods.metrics.k8s.io)
Also tried adding the below to the metrics-server deployment:
command:
- /metrics-server
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP

This can easily be resolved by editing the deployment YAML and adding hostNetwork: true after dnsPolicy: ClusterFirst:
kubectl edit deployments.apps -n kube-system metrics-server
insert:
hostNetwork: true
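For reference, a rough sketch of where the field ends up in the pod template after the edit (the surrounding keys are just context from the stock manifest):
spec:
  template:
    spec:
      dnsPolicy: ClusterFirst
      hostNetwork: true   # added line
      containers:
      - name: metrics-server
        ...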

I hope this helps somebody running a bare-metal cluster:
$ helm --repo https://kubernetes-sigs.github.io/metrics-server/ --kubeconfig=$HOME/.kube/loc-cluster.config -n kube-system --set args='{--kubelet-insecure-tls}' upgrade --install metrics-server metrics-server
$ helm --kubeconfig=$HOME/.kube/loc-cluster.config -n kube-system uninstall metrics-server

Update: I deployed the metrics-server using the same command. Perhaps you can start fresh by removing existing resources and running:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
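After reinstalling, a quick way to confirm the metrics API is served again (assuming the default install into kube-system):
kubectl -n kube-system rollout status deployment/metrics-server
kubectl get apiservice v1beta1.metrics.k8s.io
kubectl top nodes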
=======================================================================
It appears the --kubelet-insecure-tls flag was not configured correctly for the pod template in the deployment. The following should fix this:
Edit the existing deployment in the cluster with kubectl edit deployment/metrics-server -nkube-system.
Add the flag to the spec.containers[].args list, so that the deployment looks like this:
...
spec:
  containers:
  - args:
    - --cert-dir=/tmp
    - --secure-port=4443
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --kubelet-use-node-status-port
    - --metric-resolution=15s
    - --kubelet-insecure-tls    # <======= ADD IT HERE
    image: k8s.gcr.io/metrics-server/metrics-server:v0.6.1
...
Simply save your changes and let the deployment roll out the updated pods. You can use watch -n1 kubectl get deployment/metrics-server -n kube-system and wait for the UP-TO-DATE column to show 1.
Like this:
NAME READY UP-TO-DATE AVAILABLE AGE
metrics-server 1/1 1 1 16m
Verify with kubectl top nodes. It will show something like
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
docker-desktop 222m 5% 1600Mi 41%
I've just verified this to work on a local setup. Let me know if this helps :)
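If you prefer not to edit interactively, the same arg can be appended with a JSON patch; a sketch, assuming the container is at index 0 as in the stock manifest:
kubectl -n kube-system patch deployment metrics-server --type=json \
  -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'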

Please configure the aggregation layer correctly and carefully; you can use this link for help: https://kubernetes.io/docs/tasks/extend-kubernetes/configure-aggregation-layer/.
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: <name of the registration object>
spec:
  group: <API group name this extension apiserver hosts>
  version: <API version this extension apiserver hosts>
  groupPriorityMinimum: <priority this APIService for this group, see API documentation>
  versionPriority: <prioritizes ordering of this version within a group, see API documentation>
  service:
    namespace: <namespace of the extension apiserver service>
    name: <name of the extension apiserver service>
  caBundle: <pem encoded ca cert that signs the server cert used by the webhook>
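For comparison, the APIService that the stock metrics-server components.yaml registers looks roughly like this (taken from the upstream manifest, so treat it as a reference sketch):
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  labels:
    k8s-app: metrics-server
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
  version: v1beta1
  versionPriority: 100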
It would also be helpful to provide the output of kubectl version.

For me, on EKS with helmfile, I had to set the following in the values.yaml of the metrics-server chart:
containerPort: 10250
The value was forced to 4443 by default, for a reason unknown to me, when I first deployed the chart.
See doc:
https://github.com/kubernetes-sigs/metrics-server/blob/master/charts/metrics-server/values.yaml#L62
https://aws.amazon.com/premiumsupport/knowledge-center/eks-metrics-server/#:~:text=confirm%20that%20your%20security%20groups
Then kubectl top nodes and kubectl describe apiservice v1beta1.metrics.k8s.io were working.
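A minimal sketch for that case, assuming the official metrics-server chart added under the repo name metrics-server:
# values.yaml
containerPort: 10250
Then roll it out with:
helm upgrade --install metrics-server metrics-server/metrics-server -n kube-system -f values.yaml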

First of all, execute the following command:
kubectl get apiservices
And check the availability (status) of the kube-system/metrics-server service.
In case the availability is True:
Add hostNetwork: true to the spec of your metrics-server deployment by executing the following command:
kubectl edit deployment -n kube-system metrics-server
It should look like the following:
...
spec:
  hostNetwork: true
...
Setting hostNetwork to true means that the Pod uses the network namespace of the host node it is running on.
In case the availability is False (MissingEndpoints):
Download metrics-server:
wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.5.0/components.yaml
Remove (legacy) metrics server:
kubectl delete -f components.yaml
Edit the downloaded file and add - --kubelet-insecure-tls to the args list:
...
      labels:
        k8s-app: metrics-server
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=443
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls # add this line
...
Create service once again:
kubectl apply -f components.yaml
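Once the new pod is up, the APIService should flip to available; a quick check (the column layout below is approximate):
kubectl get apiservice v1beta1.metrics.k8s.io
NAME                     SERVICE                      AVAILABLE   AGE
v1beta1.metrics.k8s.io   kube-system/metrics-server   True        5m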

Related

metrics-service in kubernetes not working

I'm running Kubernetes on an EC2 machine on AWS.
The node runs Ubuntu.
My metrics-server version:
wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.7/components.yaml
components.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
  labels:
    k8s-app: metrics-server
spec:
  serviceAccountName: metrics-server
  volumes:
  # mount in tmp so we can safely use from-scratch images and/or read-only containers
  - name: tmp-dir
    emptyDir: {}
  containers:
  - name: metrics-server
    image: k8s.gcr.io/metrics-server/metrics-server:v0.3.7
    imagePullPolicy: IfNotPresent
    args:
    - --cert-dir=/tmp
    - --secure-port=4443
    - --kubelet-preferred-address-type=InternalIP,ExternalIP,Hostname
    - --kubelet-insecure-tls
Even after adding args, the error appears.
error :
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)
or
error: metrics not available yet
No matter how long I wait, that error appears.
My kops version: Version 1.18.0 (git-698bf974d8)
I use Calico networking.
Please help...
++
I tried wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.5.0/components.yaml
and viewed the logs:
kubectl logs -n kube-system deploy/metrics-server
"Failed to scrape node" err="GET "https://172.20.51.226:10250/stats/summary?only_cpu_and_memory=true": bad status code "401 Unauthorized"" node="ip-172-20-51-226.ap-northeast-2.compute.internal"
"Failed probe" probe="metric-storage-ready" err="not metrics to serve"
Download the components.yaml file manually:
wget https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
Then edit the args section under Deployment:
spec:
  containers:
  - args:
    - --cert-dir=/tmp
    - --secure-port=443
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --kubelet-use-node-status-port
    - --metric-resolution=15s
Add two more lines there:
    - --kubelet-insecure-tls=true
    - --kubelet-preferred-address-types=InternalIP
The kubelet's port 10250 uses HTTPS, and the connection is normally verified with a TLS certificate. Adding --kubelet-insecure-tls tells metrics-server not to verify the kubelet's serving certificate.
After this modification just apply the manifest:
kubectl apply -f components.yaml
Wait a minute and you will see the metrics-server pod come up.
The last comment is useful. You can also edit the deployment directly; adding the line --kubelet-insecure-tls=true was enough for me:
Edit the deployment:
$ kubectl edit deployment.apps/metrics-server -n kube-system
Add the line:
- --kubelet-insecure-tls=true
Similar result:
containers:
- args:
  - --cert-dir=/tmp
  - --secure-port=4443
  - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
  - --kubelet-use-node-status-port
  - --metric-resolution=15s
  - --kubelet-insecure-tls=true
And save with ":wq" and enjoy.
~$ kubectl top pods -n kube-system
NAME CPU(cores) MEMORY(bytes)
coredns-6d4b75cb6d-k8dmc 3m 18Mi
coredns-6d4b75cb6d-wxxn6 3m 17Mi
kube-apiserver-k8s-master1 82m 306Mi
kube-apiserver-k8s-master2 65m 247Mi
kube-controller-manager-k8s-master1 32m 47Mi
kube-controller-manager-k8s-master2 4m 19Mi
kube-proxy-9dbgk 1m 9Mi
kube-proxy-bwhdm 1m 14Mi
kube-proxy-fz8v8 1m 15Mi
kube-proxy-vcnrc 1m 9Mi
kube-scheduler-k8s-master1 7m 18Mi
kube-scheduler-k8s-master2 4m 16Mi
metrics-server-79576f7ff-97tpc 6m 15Mi
metrics-server-79576f7ff-qzczp 4m 13Mi
~$ kubectl top nodes
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
k8s-master1 318m 15% 1047Mi 55%
k8s-master2 208m 10% 1002Mi 52%
k8s-worker1 30m 3% 804Mi 42%
k8s-worker2 35m 3% 550Mi 29%

Kubernetes pull from local registry not working, how to fix that?

I followed this link and created a secret as below.
kubectl create secret docker-registry regsec --docker-server=192.168.56.106:5000 --docker-username=osboxes --docker-password=osboxes.org --insecure-skip-tls-verify=true
And the deployment as below.
kubectl create deploy nginx1 --image 192.168.56.106:5000/todoapp:1.0
And edited it using
kubectl edit deploy nginx1
And added imagePullSecrets to it
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "2"
  creationTimestamp: "2021-07-21T10:23:23Z"
  generation: 2
  labels:
    app: nginx1
  name: nginx1
  namespace: default
  resourceVersion: "6872"
  uid: 0b6917f0-10ac-4206-82a8-c49ae8ffa2b3
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: nginx1
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: nginx1
    spec:
      containers:
      - image: 192.168.56.106:5000/todoapp:1.0
        imagePullPolicy: IfNotPresent
        name: todoapp
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      imagePullSecrets:
      - name: regsec
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  conditions:
  - lastTransitionTime: "2021-07-21T10:23:23Z"
    lastUpdateTime: "2021-07-21T10:23:23Z"
    message: Deployment does not have minimum availability.
    reason: MinimumReplicasUnavailable
    status: "False"
    type: Available
  - lastTransitionTime: "2021-07-21T10:23:23Z"
    lastUpdateTime: "2021-07-21T10:28:36Z"
    message: ReplicaSet "nginx1-75df7fd466" is progressing.
    reason: ReplicaSetUpdated
    status: "True"
    type: Progressing
  observedGeneration: 2
  replicas: 2
  unavailableReplicas: 2
  updatedReplicas: 1
But I am still getting the error below.
osboxes@osboxes:~/Desktop$ kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-6799fc88d8-vvzjf 1/1 Running 0 19m
nginx1-65d848d94f-dd4ck 0/1 ImagePullBackOff 0 12m
nginx1-75df7fd466-kn5mf 0/1 ImagePullBackOff 0 6m50s
osboxes@osboxes:~/Desktop$ kubectl describe pod nginx1-75df7fd466-kn5mf
Name: nginx1-75df7fd466-kn5mf
Namespace: default
Priority: 0
Node: samples-control-plane/172.19.0.3
Start Time: Wed, 21 Jul 2021 06:28:36 -0400
Labels: app=nginx1
pod-template-hash=75df7fd466
Annotations: <none>
Status: Pending
IP: 10.244.0.8
IPs:
IP: 10.244.0.8
Controlled By: ReplicaSet/nginx1-75df7fd466
Containers:
todoapp:
Container ID:
Image: 192.168.56.106:5000/todoapp:1.0
Image ID:
Port: <none>
Host Port: <none>
State: Waiting
Reason: ImagePullBackOff
Ready: False
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-88clq (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
kube-api-access-88clq:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 73s default-scheduler Successfully assigned default/nginx1-75df7fd466-kn5mf to samples-control-plane
Normal Pulling 33s (x3 over 73s) kubelet Pulling image "192.168.56.106:5000/todoapp:1.0"
Warning Failed 33s (x3 over 73s) kubelet Failed to pull image "192.168.56.106:5000/todoapp:1.0": rpc error: code = Unknown desc = failed to pull and unpack image "192.168.56.106:5000/todoapp:1.0": failed to resolve reference "192.168.56.106:5000/todoapp:1.0": failed to do request: Head "https://192.168.56.106:5000/v2/todoapp/manifests/1.0": http: server gave HTTP response to HTTPS client
Warning Failed 33s (x3 over 73s) kubelet Error: ErrImagePull
Normal BackOff 6s (x4 over 72s) kubelet Back-off pulling image "192.168.56.106:5000/todoapp:1.0"
Warning Failed 6s (x4 over 72s) kubelet Error: ImagePullBackOff
kubectl version
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.1", GitCommit:"5e58841cce77d4bc13713ad2b91fa0d961e69192", GitTreeState:"clean", BuildDate:"2021-05-13T02:40:46Z", GoVersion:"go1.16.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.1", GitCommit:"5e58841cce77d4bc13713ad2b91fa0d961e69192", GitTreeState:"clean", BuildDate:"2021-05-21T23:01:33Z", GoVersion:"go1.16.4", Compiler:"gc", Platform:"linux/amd64"}
Adding a Docker insecure registry is not useful, as the container runtime is not Docker now; it is containerd.
/etc/default/docker.json
For reference, I have set up the local registry on a VirtualBox VM and connect to Kubernetes on the same network using a host-only network IP address.
The local registry was set up using the steps mentioned in the link.
And Kubernetes was set up using kind.
cri-o
If you are using CRI-O as the container runtime, Docker settings are of no use. You would need to configure CRI-O instead: https://github.com/cri-o/cri-o#configuration
Assuming you have CRI-O installed on the host node, the official documentation recommends having a $HOME/.config/containers/registries.conf file, or a global /etc/containers/registries.conf file, to configure registries in a runtime-engine-agnostic way.
The file structure spec is also documented: https://github.com/containers/image/blob/main/docs/containers-registries.conf.5.md
Example:
[registries.search]
registries = ['registry1.com', 'registry2.com']
[registries.insecure]
registries = ['registry3.com']
[registries.block]
registries = ['registry.untrusted.com', 'registry.unsafe.com']
containerd
containerd does not seem to acknowledge the /etc/containers/registries.conf settings.
As per the documentation, the main config file /etc/containerd/config.toml can specify a registry config path:
[plugins."io.containerd.grpc.v1.cri".registry]
config_path = "/etc/containerd/certs.d"
Inside the path specified above (/etc/containerd/certs.d), create a directory for the host (e.g. docker.io) and, in it, create a hosts.toml file, i.e. /etc/containerd/certs.d/docker.io/hosts.toml:
server = "https://registry-1.docker.io"
[host."http://my-custom-registry:5000"]
capabilities = ["pull", "resolve", "push"]
skip_verify = true
plain-http = true
For my-custom-registry:5000 section you can also provide credentials as well as certificates.
And restart containerd daemon/service on the host.
Configuration file spec is here: https://github.com/containerd/containerd/blob/main/docs/hosts.md
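For the registry in this question, a hosts.toml along these lines could go in /etc/containerd/certs.d/192.168.56.106:5000/hosts.toml on the node (a sketch, assuming the registry is reachable over plain HTTP):
server = "http://192.168.56.106:5000"

[host."http://192.168.56.106:5000"]
  capabilities = ["pull", "resolve"]
  skip_verify = true
Then restart containerd on the node and, optionally, sanity-check the pull from there:
systemctl restart containerd
crictl pull 192.168.56.106:5000/todoapp:1.0
Note that since this cluster was created with kind, the "node" is the kind node container, so the file has to end up inside that container (or be configured via kind's containerdConfigPatches when creating the cluster).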

Dashboard not running

I have set up Kubernetes on an Ubuntu server using this link.
Then I installed kubernetes dashboard using:
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-rc6/aio/deploy/recommended.yaml
Then I changed the kubernetes-dashboard service to type NodePort with node port 32323.
But the container is not running.
uday@dockermaster:~$ kubectl -n kubernetes-dashboard get all
NAME READY STATUS RESTARTS AGE
pod/dashboard-metrics-scraper-779f5454cb-pqfrj 1/1 Running 0 50m
pod/kubernetes-dashboard-64686c4bf9-5jkwq 0/1 CrashLoopBackOff 14 50m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/dashboard-metrics-scraper ClusterIP 10.103.22.252 <none> 8000/TCP 50m
service/kubernetes-dashboard NodePort 10.102.48.80 <none> 443:32323/TCP 50m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/dashboard-metrics-scraper 1/1 1 1 50m
deployment.apps/kubernetes-dashboard 0/1 1 0 50m
NAME DESIRED CURRENT READY AGE
replicaset.apps/dashboard-metrics-scraper-779f5454cb 1 1 1 50m
replicaset.apps/kubernetes-dashboard-64686c4bf9 1 1 0 50m
uday@dockermaster:~$ kubectl -n kubernetes-dashboard describe svc kubernetes-dashboard
Name: kubernetes-dashboard
Namespace: kubernetes-dashboard
Labels: k8s-app=kubernetes-dashboard
Annotations: Selector: k8s-app=kubernetes-dashboard
Type: NodePort
IP: 10.102.48.80
Port: <unset> 443/TCP
TargetPort: 8443/TCP
NodePort: <unset> 32323/TCP
Endpoints:
Session Affinity: None
External Traffic Policy: Cluster
Events: <none>
Other apps (Tomcat, nginx, databases) are working fine with NodePort.
But here, it is failing at container creation.
C:\Users\uday\Desktop>kubectl.exe get pods -n kubernetes-dashboard
NAME READY STATUS RESTARTS AGE
dashboard-metrics-scraper-779f5454cb-pqfrj 1/1 Running 1 20h
kubernetes-dashboard-64686c4bf9-g9z2k 0/1 CrashLoopBackOff 84 18h
C:\Users\uday\Desktop>kubectl.exe describe pod kubernetes-dashboard-64686c4bf9-g9z2k -n kubernetes-dashboard
Name: kubernetes-dashboard-64686c4bf9-g9z2k
Namespace: kubernetes-dashboard
Priority: 0
Node: slave-node/10.0.0.6
Start Time: Sat, 28 Mar 2020 14:16:54 +0000
Labels: k8s-app=kubernetes-dashboard
pod-template-hash=64686c4bf9
Annotations: <none>
Status: Running
IP: 182.244.1.12
IPs:
IP: 182.244.1.12
Controlled By: ReplicaSet/kubernetes-dashboard-64686c4bf9
Containers:
kubernetes-dashboard:
Container ID: docker://470ee8c61998c3c3dda86c58ad17817468f55aa73cd4feecf3b018977ce13ca3
Image: kubernetesui/dashboard:v2.0.0-rc6
Image ID: docker-pullable://kubernetesui/dashboard@sha256:61f9c378c427a3f8a9643f83baa9f96db1ae1357c67a93b533ae7b36d71c69dc
Port: 8443/TCP
Host Port: 0/TCP
Args:
--auto-generate-certificates
--namespace=kubernetes-dashboard
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 2
Started: Sun, 29 Mar 2020 09:01:31 +0000
Finished: Sun, 29 Mar 2020 09:02:01 +0000
Ready: False
Restart Count: 84
Liveness: http-get https://:8443/ delay=30s timeout=30s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/certs from kubernetes-dashboard-certs (rw)
/tmp from tmp-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kubernetes-dashboard-token-pzfbl (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
kubernetes-dashboard-certs:
Type: Secret (a volume populated by a Secret)
SecretName: kubernetes-dashboard-certs
Optional: false
tmp-volume:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
kubernetes-dashboard-token-pzfbl:
Type: Secret (a volume populated by a Secret)
SecretName: kubernetes-dashboard-token-pzfbl
Optional: false
QoS Class: BestEffort
Node-Selectors: beta.kubernetes.io/os=linux
Tolerations: node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning BackOff 4m49s (x501 over 123m) kubelet, slave-node Back-off restarting failed container
kubectl.exe logs kubernetes-dashboard-64686c4bf9-g9z2k -n kubernetes-dashboard
2020/03/29 09:01:31 Starting overwatch
2020/03/29 09:01:31 Using namespace: kubernetes-dashboard
2020/03/29 09:01:31 Using in-cluster config to connect to apiserver
2020/03/29 09:01:31 Using secret token for csrf signing
2020/03/29 09:01:31 Initializing csrf token from kubernetes-dashboard-csrf secret
panic: Get https://10.96.0.1:443/api/v1/namespaces/kubernetes-dashboard/secrets/kubernetes-dashboard-csrf: dial tcp 10.96.0.1:443: i/o timeout
goroutine 1 [running]:
github.com/kubernetes/dashboard/src/app/backend/client/csrf.(*csrfTokenManager).init(0xc0004e2dc0)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/csrf/manager.go:40 +0x3b0
github.com/kubernetes/dashboard/src/app/backend/client/csrf.NewCsrfTokenManager(...)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/csrf/manager.go:65
github.com/kubernetes/dashboard/src/app/backend/client.(*clientManager).initCSRFKey(0xc00043ae80)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/manager.go:499 +0xc6
github.com/kubernetes/dashboard/src/app/backend/client.(*clientManager).init(0xc00043ae80)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/manager.go:467 +0x47
github.com/kubernetes/dashboard/src/app/backend/client.NewClientManager(...)
/home/travis/build/kubernetes/dashboard/src/app/backend/client/manager.go:548
main.main()
/home/travis/build/kubernetes/dashboard/src/app/backend/dashboard.go:105 +0x20d
The Problem
The reason the application is not coming up is that the Dashboard container itself is not running. If you look at the output you provided, you can see this:
pod/kubernetes-dashboard-64686c4bf9-5jkwq 0/1 CrashLoopBackOff 14
So how do we troubleshoot this? Well, there are three principal ways, one of which you'll probably be using more than the other two.
Describe
Describe is a command that allows you to fetch details about a resource in Kubernetes. This could be metadata, the number of replicas you assigned, or even some events depicting why a resource is failing to start, for example because a referenced container image in your Pod manifest cannot be found in the usable container registries. The syntax for using Describe is like so:
kubectl describe pod -n kubernetes-dashboard kubernetes-dashboard-64686c4bf9-5jkwq
Here are some great docs on the tool as well.
Logs
The next troubleshooting step you'll likely take advantage of in Kubernetes is the logging architecture. As you're probably aware, when a Docker container is spawned it is common practice to have the logs produced by the application redirected to STDOUT or STDERR of the process. Kubernetes then captures this log data for you and provides an API abstraction layer with which you can interact. Sometimes your Describe events won't have any indication of why a process isn't running. However, you can then proceed with grabbing logs from the process to determine what is going wrong. An example syntax might look like:
kubectl logs -f -n kubernetes-dashboard kubernetes-dashboard-64686c4bf9-5jkwq
Exec
The last common troubleshooting technique is Exec. Exec effectively allows you to attach to a shell in a running container so that you can interact with the live environment to troubleshoot the application. This allows you to do things like see if configuration files were properly staged on the container's filesystem, determine if environment variables were properly expanded and set, etc. An example syntax for Exec might look like:
kubectl exec -it -n kubernetes-dashboard kubernetes-dashboard-64686c4bf9-5jkwq sh
In this case, however, your pod is in a CrashLoopBackOff state. This means that you will not be able to exec into it, because the container is not running. The kubelet has recognized a pattern of failures and is backing off between restart attempts accordingly.
Here is a good thread on how to troubleshoot pods that enter this state.
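One extra flag worth knowing for CrashLoopBackOff pods: --previous prints the logs of the last terminated container instance instead of the current (restarting) one:
kubectl logs -n kubernetes-dashboard kubernetes-dashboard-64686c4bf9-5jkwq --previous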
Summary
So, now that I've said all of this, how do we answer your question? Well, we can't answer it directly. But I sort of did with my summary above, because the real answer you're looking for is how to properly troubleshoot Linux containers running in Kubernetes. These issues will be a recurring theme in your experience with Kubernetes, so it's essential to develop debugging skills in the ecosystem as soon as possible.
If the Describe, Logs, and Exec commands are unable to help you find out why the Dashboard pod is failing to come up, feel free to add a comment on this answer requesting additional support and I'll be happy to help where I can!

How to scale a daemon set in Kubernetes using kubectl

I only have a terminal to access the Kubernetes cluster right now. I check the ingress controller like this:
$ k get daemonset --all-namespaces
NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
kube-system traefik-ingress-controller 0 0 0 0 0 IngressProxy=true 60d
logging fluentd-es 0 0 0 0 0 beta.kubernetes.io/fluentd-ds-ready=true 28d
I am now using kubectl (v1.15.2) to scale the daemon set like this:
kubectl scale --replicas=1 DaemonSet/traefik-ingress-controller -n kube-system
but it shows:
Error from server (NotFound): the server could not find the requested resource
What should I do to start traefik from the terminal using the command line? This is my daemon set describe output:
~/Library/Mobile Documents/com~apple~CloudDocs/Document/k8s/work/traefik-deployment-yaml/k8s-backup ⌚ 17:49:58
$ k describe daemonset traefik-ingress-controller -n kube-system
Name: traefik-ingress-controller
Selector: app=traefik
Node-Selector: IngressProxy=true
Labels: app=traefik
Annotations: deprecated.daemonset.template.generation: 18
kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"apps/v1","kind":"DaemonSet","metadata":{"annotations":{},"labels":{"app":"traefik"},"name":"traefik-ingress-controller","na...
Desired Number of Nodes Scheduled: 0
Current Number of Nodes Scheduled: 0
Number of Nodes Scheduled with Up-to-date Pods: 0
Number of Nodes Scheduled with Available Pods: 0
Number of Nodes Misscheduled: 0
Pods Status: 0 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
Labels: app=traefik
Service Account: traefik-ingress-controller
Containers:
traefik-ingress-lb:
Image: traefik:v2.1.6
Ports: 80/TCP, 443/TCP, 8080/TCP
Host Ports: 80/TCP, 443/TCP, 0/TCP
Args:
--configfile=/config/traefik.yaml
--logLevel=INFO
--metrics=true
--metrics.prometheus=true
--entryPoints.metrics.address=:8080
--metrics.prometheus.entryPoint=metrics
--metrics.prometheus.addServicesLabels=true
--metrics.prometheus.addEntryPointsLabels=true
--metrics.prometheus.buckets=0.100000, 0.300000, 1.200000, 5.000000
Limits:
cpu: 2
memory: 1Gi
Requests:
cpu: 1
memory: 1Gi
Environment: <none>
Mounts:
/config from config (rw)
Volumes:
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: traefik-config
Optional: false
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedDaemonPod 3h32m daemonset-controller Found failed daemon pod kube-system/traefik-ingress-controller-wdpsq on node azshara-k8s03, will try to kill it
Normal SuccessfulDelete 3h32m daemonset-controller Deleted pod: traefik-ingress-controller-wdpsq
Normal SuccessfulCreate 3h32m daemonset-controller Created pod: traefik-ingress-controller-qmttl
Warning FailedDaemonPod 3h32m daemonset-controller Found failed daemon pod kube-system/traefik-ingress-controller-qmttl on node azshara-k8s03, will try to kill it
Normal SuccessfulDelete 3h32m daemonset-controller Deleted pod: traefik-ingress-controller-qmttl
Normal SuccessfulCreate 3h32m daemonset-controller Created pod: traefik-ingress-controller-nlxwc
You do not need to scale a DaemonSet on Kubernetes.
A DaemonSet ensures that all eligible nodes run a copy of a Pod.
As nodes are added to the cluster, Pods are added to them. So you need to add a new node to the cluster (or make an existing node eligible), and the DaemonSet will be scheduled there, unless a taint or node selector prevents that DaemonSet from running on the node.
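In this particular case the DaemonSet's Node-Selector is IngressProxy=true and no node carries that label, which is why DESIRED is 0. A sketch of how to make a node eligible (and how to undo it), with <node-name> as a placeholder:
kubectl label node <node-name> IngressProxy=true
kubectl label node <node-name> IngressProxy-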

No nodes available to schedule pods, using google container engine

I'm having an issue where a container I'd like to run doesn't appear to be getting started on my cluster.
I've tried searching around for possible solutions, but there's a surprising lack of information out there to assist with this issue or anything of its nature.
Here's the most I could gather:
$ kubectl describe pods/elasticsearch
Name: elasticsearch
Namespace: default
Image(s): my.image.host/my-project/elasticsearch
Node: /
Labels: <none>
Status: Pending
Reason:
Message:
IP:
Replication Controllers: <none>
Containers:
elasticsearch:
Image: my.image.host/my-project/elasticsearch
Limits:
cpu: 100m
State: Waiting
Ready: False
Restart Count: 0
Events:
FirstSeen LastSeen Count From SubobjectPath Reason Message
Mon, 19 Oct 2015 10:28:44 -0500 Mon, 19 Oct 2015 10:34:09 -0500 12 {scheduler } failedScheduling no nodes available to schedule pods
I also see this:
$ kubectl get pod elasticsearch -o wide
NAME READY STATUS RESTARTS AGE NODE
elasticsearch 0/1 Pending 0 5s
I guess I'd like to know: What prerequisites exist so that I can be confident that my container is going to run in container engine? What do I need to do in this scenario to get it running?
Here's my yml file:
apiVersion: v1
kind: Pod
metadata:
  name: elasticsearch
spec:
  containers:
  - name: elasticsearch
    image: my.image.host/my-project/elasticsearch
    ports:
    - containerPort: 9200
    resources:
    volumeMounts:
    - name: elasticsearch-data
      mountPath: /usr/share/elasticsearch
  volumes:
  - name: elasticsearch-data
    gcePersistentDisk:
      pdName: elasticsearch-staging
      fsType: ext4
Here's some more output about my node:
$ kubectl get nodes
NAME LABELS STATUS
gke-elasticsearch-staging-00000000-node-yma3 kubernetes.io/hostname=gke-elasticsearch-staging-00000000-node-yma3 NotReady
You only have one node in your cluster and its status is NotReady, so you won't be able to schedule any pods. You can try to determine why your node isn't ready by looking in /var/log/kubelet.log. You can also add new nodes to your cluster (scale the cluster size up to 2) or delete the node (it will be automatically replaced by the instance group manager) to see if either of those options gets you a working node.
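For example, on Container Engine something like the following resizes the cluster; the cluster name here is inferred from the node name, so treat it as a placeholder:
gcloud container clusters resize elasticsearch-staging --num-nodes=2 --zone=<zone>
kubectl get nodes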
It appears that the scheduler couldn't see any nodes in your cluster. You can run kubectl get nodes and gcloud compute instances list to confirm whether you have any nodes in the cluster. Did you correctly specify the number of nodes (--num-nodes) when creating the cluster?