Kubernetes without pod metrics

I'm trying to deploy the metrics server to Kubernetes, and something really strange is happening. I have one worker and one master, and the following pod list:
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
default php-apache-774ff9d754-d7vp9 1/1 Running 0 2m43s 192.168.77.172 master-node <none> <none>
kube-system calico-kube-controllers-6b9d4c8765-x7pql 1/1 Running 2 4h11m 192.168.77.130 master-node <none> <none>
kube-system calico-node-d4rnh 0/1 Running 1 4h11m 10.221.194.166 master-node <none> <none>
kube-system calico-node-hwkmd 0/1 Running 1 4h11m 10.221.195.58 free5gc-virtual-machine <none> <none>
kube-system coredns-6955765f44-kf4dr 1/1 Running 1 4h20m 192.168.178.65 free5gc-virtual-machine <none> <none>
kube-system coredns-6955765f44-s58rf 1/1 Running 1 4h20m 192.168.178.66 free5gc-virtual-machine <none> <none>
kube-system etcd-free5gc-virtual-machine 1/1 Running 1 4h21m 10.221.195.58 free5gc-virtual-machine <none> <none>
kube-system kube-apiserver-free5gc-virtual-machine 1/1 Running 1 4h21m 10.221.195.58 free5gc-virtual-machine <none> <none>
kube-system kube-controller-manager-free5gc-virtual-machine 1/1 Running 1 4h21m 10.221.195.58 free5gc-virtual-machine <none> <none>
kube-system kube-proxy-brvdg 1/1 Running 1 4h19m 10.221.194.166 master-node <none> <none>
kube-system kube-proxy-lfzjw 1/1 Running 1 4h20m 10.221.195.58 free5gc-virtual-machine <none> <none>
kube-system kube-scheduler-free5gc-virtual-machine 1/1 Running 1 4h21m 10.221.195.58 free5gc-virtual-machine <none> <none>
kube-system metrics-server-86c6d8b9bf-p2hh8 1/1 Running 0 2m43s 192.168.77.171 master-node <none> <none>
When I try to get the metrics I see the following:
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
php-apache Deployment/php-apache <unknown>/50% 1 10 1 3m58s
free5gc@free5gc-virtual-machine:~/Desktop/metrics-server/deploy$ kubectl top nodes
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)
free5gc@free5gc-virtual-machine:~/Desktop/metrics-server/deploy$ kubectl top pods --all-namespaces
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods.metrics.k8s.io)
Lastly, here is the metrics-server log output (v=6):
free5gc@free5gc-virtual-machine:~/Desktop/metrics-server/deploy$ kubectl logs metrics-server-86c6d8b9bf-p2hh8 -n kube-system
I0206 18:16:18.657605 1 serving.go:273] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
I0206 18:16:19.367356 1 round_trippers.go:405] GET https://10.96.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication 200 OK in 7 milliseconds
I0206 18:16:19.370573 1 round_trippers.go:405] GET https://10.96.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication 200 OK in 1 milliseconds
I0206 18:16:19.373245 1 round_trippers.go:405] GET https://10.96.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication 200 OK in 1 milliseconds
I0206 18:16:19.375024 1 round_trippers.go:405] GET https://10.96.0.1:443/api/v1/namespaces/kube-system/configmaps/extension-apiserver-authentication 200 OK in 1 milliseconds
[restful] 2020/02/06 18:16:19 log.go:33: [restful/swagger] listing is available at https://:4443/swaggerapi
[restful] 2020/02/06 18:16:19 log.go:33: [restful/swagger] https://:4443/swaggerui/ is mapped to folder /swagger-ui/
I0206 18:16:19.421207 1 healthz.go:83] Installing healthz checkers:"ping", "poststarthook/generic-apiserver-start-informers", "healthz"
I0206 18:16:19.421641 1 serve.go:96] Serving securely on [::]:4443
I0206 18:16:19.421873 1 reflector.go:202] Starting reflector *v1.Pod (0s) from github.com/kubernetes-incubator/metrics-server/vendor/k8s.io/client-go/informers/factory.go:130
I0206 18:16:19.421891 1 reflector.go:240] Listing and watching *v1.Pod from github.com/kubernetes-incubator/metrics-server/vendor/k8s.io/client-go/informers/factory.go:130
I0206 18:16:19.421914 1 reflector.go:202] Starting reflector *v1.Node (0s) from github.com/kubernetes-incubator/metrics-server/vendor/k8s.io/client-go/informers/factory.go:130
I0206 18:16:19.421929 1 reflector.go:240] Listing and watching *v1.Node from github.com/kubernetes-incubator/metrics-server/vendor/k8s.io/client-go/informers/factory.go:130
I0206 18:16:19.423052 1 round_trippers.go:405] GET https://10.96.0.1:443/api/v1/nodes?limit=500&resourceVersion=0 200 OK in 1 milliseconds
I0206 18:16:19.424261 1 round_trippers.go:405] GET https://10.96.0.1:443/api/v1/pods?limit=500&resourceVersion=0 200 OK in 2 milliseconds
I0206 18:16:19.425586 1 round_trippers.go:405] GET https://10.96.0.1:443/api/v1/nodes?resourceVersion=38924&timeoutSeconds=481&watch=true 200 OK in 0 milliseconds
I0206 18:16:19.433545 1 round_trippers.go:405] GET https://10.96.0.1:443/api/v1/pods?resourceVersion=39246&timeoutSeconds=582&watch=true 200 OK in 0 milliseconds
I0206 18:16:49.388514 1 manager.go:99] Beginning cycle, collecting metrics...
I0206 18:16:49.388598 1 manager.go:95] Scraping metrics from 2 sources
I0206 18:16:49.395742 1 manager.go:120] Querying source: kubelet_summary:free5gc-virtual-machine
I0206 18:16:49.400574 1 manager.go:120] Querying source: kubelet_summary:master-node
I0206 18:16:49.413751 1 round_trippers.go:405] GET https://10.221.194.166:10250/stats/summary/ 200 OK in 13 milliseconds
I0206 18:16:49.414317 1 round_trippers.go:405] GET https://10.221.195.58:10250/stats/summary/ 200 OK in 18 milliseconds
I0206 18:16:49.417044 1 manager.go:150] ScrapeMetrics: time: 28.428677ms, nodes: 2, pods: 13
I0206 18:16:49.417062 1 manager.go:115] ...Storing metrics...
I0206 18:16:49.417083 1 manager.go:126] ...Cycle complete
With v=10 the log output even shows the health details of each pod, yet kubectl get hpa and kubectl top nodes still return nothing. Can someone give me a hint? Furthermore, my metrics-server manifest is:
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metrics-server
  namespace: kube-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metrics-server
  namespace: kube-system
  labels:
    k8s-app: metrics-server
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  template:
    metadata:
      name: metrics-server
      labels:
        k8s-app: metrics-server
    spec:
      serviceAccountName: metrics-server
      volumes:
      # mount in tmp so we can safely use from-scratch images and/or read-only containers
      - name: tmp-dir
        emptyDir: {}
      containers:
      - name: metrics-server
        image: k8s.gcr.io/metrics-server-amd64:v0.3.1
        args:
          - /metrics-server
          - --metric-resolution=30s
          - --requestheader-allowed-names=aggregator
          - --cert-dir=/tmp
          - --secure-port=4443
          - --kubelet-insecure-tls
          - --v=6
          - --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
          #- --kubelet-preferred-address-types=InternalIP
        ports:
        - name: main-port
          containerPort: 4443
          protocol: TCP
        securityContext:
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
        imagePullPolicy: Always
        volumeMounts:
        - name: tmp-dir
          mountPath: /tmp
      nodeSelector:
        beta.kubernetes.io/os: linux
        kubernetes.io/arch: "amd64"
And I can see the following:
free5gc@free5gc-virtual-machine:~/Desktop/metrics-server/deploy$ kubectl get apiservice v1beta1.metrics.k8s.io -o yaml
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  creationTimestamp: "2020-02-06T18:57:28Z"
  name: v1beta1.metrics.k8s.io
  resourceVersion: "45583"
  selfLink: /apis/apiregistration.k8s.io/v1/apiservices/v1beta1.metrics.k8s.io
  uid: ca439221-b987-4c13-b0e0-8d2bb237e612
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
    port: 443
  version: v1beta1
  versionPriority: 100
status:
  conditions:
  - lastTransitionTime: "2020-02-06T18:57:28Z"
    message: 'failing or missing response from https://10.110.144.114:443/apis/metrics.k8s.io/v1beta1:
      Get https://10.110.144.114:443/apis/metrics.k8s.io/v1beta1: dial tcp 10.110.144.114:443:
      connect: no route to host'
    reason: FailedDiscoveryCheck
    status: "False"
    type: Available

I have reproduced your issue (on Google Compute Engine) and tried a few scenarios to find a workaround/solution.
First, I want to mention that you have provided only the ServiceAccount and Deployment YAML. You also need a ClusterRoleBinding, RoleBinding, APIService, etc. All the needed YAMLs can be found in the kubernetes-sigs/metrics-server GitHub repo.
To quickly deploy metrics-server with all the required config, you can use:
$ git clone https://github.com/kubernetes-sigs/metrics-server.git
$ cd metrics-server/deploy/
$ kubectl apply -f kubernetes/
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
serviceaccount/metrics-server created
deployment.apps/metrics-server created
service/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
Second, I would advise you to check your CNI pods (calico-node-d4rnh and calico-node-hwkmd): created 4h11m ago, but Ready is still 0/1.
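To see why they are not Ready, you can inspect them directly (a hedged sketch; the pod name comes from your listing, and calico-node is the usual container name in the Calico manifests):
$ kubectl describe pod -n kube-system calico-node-d4rnh
$ kubectl logs -n kube-system calico-node-d4rnh -c calico-node
The Events section and the readiness probe message usually point at the node-to-node connectivity problem.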
Lastly, regarding gathering CPU and memory data from pods and nodes:
Using Calico
If you are using a one-node kubeadm cluster, it will work correctly; however, with more than one node, this causes some issues. There are many similar threads on GitHub about it. I tried various flags in args:, but with no success. In the metrics-server logs (-v=6) you can see that metrics are being gathered. In this GitHub thread, one of the users posted an answer which is a workaround for this issue. It's also mentioned in the K8s docs about hostNetwork.
Adding hostNetwork: true is what finally got metrics-server working for me. Without it, nada. Without the kubelet-preferred-address-types line, I could query my master node but not my two worker nodes, nor could I query pods, obviously undesirable results. Lack of kubelet-insecure-tls also results in an inoperable metrics-server installation.
spec:
  hostNetwork: true
  containers:
  - args:
    - --kubelet-insecure-tls
    - --cert-dir=/tmp
    - --secure-port=4443
    - --kubelet-preferred-address-types=InternalIP
    - --v=6
    image: k8s.gcr.io/metrics-server-amd64:v0.3.6
    imagePullPolicy: Always
If you deploy with this config, it will work.
$ kubectl describe apiservice v1beta1.metrics.k8s.io
Name: v1beta1.metrics.k8s.io
...
Status:
  Conditions:
    Last Transition Time: 2020-02-20T09:37:59Z
    Message: all checks passed
    Reason: Passed
    Status: True
    Type: Available
Events: <none>
In addition, you can see the difference hostNetwork: true makes by checking iptables: there are many more entries compared to a deployment without this setting.
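For instance, a rough before/after comparison (a sketch; kube-proxy tags its iptables rules with the service name, so the count is only a quick signal):
$ sudo iptables-save | grep -c metrics-server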
After that, you can edit the deployment and remove or comment out hostNetwork: true.
$ kubectl edit deploy metrics-server -n kube-system
deployment.apps/metrics-server edited
$ kubectl top pods
NAME CPU(cores) MEMORY(bytes)
nginx-6db489d4b7-2qhzw 0m 3Mi
nginx-6db489d4b7-9fvrj 0m 2Mi
nginx-6db489d4b7-dgbf9 0m 2Mi
nginx-6db489d4b7-dvcz5 0m 2Mi
Also, you will be able to find metrics using:
$ kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes
For better readability, you can also use jq:
$ kubectl get --raw /apis/metrics.k8s.io/v1beta1/pods | jq .
Using Weave Net
If you use Weave Net instead of Calico, it will work without setting hostNetwork:
$ kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
However, you will then need to deal with certificates. If you don't care about security, you can just use --kubelet-insecure-tls as in the previous example with Calico.

Related

linkerd Top feature only shows /healthz requests

I am doing Lab 7.2, Service Mesh and Ingress Controller, from the Linux Foundation's Kubernetes Developer course, and there is a problem I am facing: the Top feature only shows the /healthz requests.
It is supposed to show / requests too, but it does not. I would really like to troubleshoot it, but I have no idea how to even approach it.
More details
Following the course instructions I have:
A k8s cluster deployed on two GCE VMs
linkerd
nginx ingress controller
A simple LoadBalancer service off the httpd image. In effect, this is a NodePort service, since the LoadBalancer is never provisioned. The name is secondapp
A simple ingress object routing to the secondapp service.
I have no idea what information is useful to troubleshoot the issue. Here is some that I can think of:
Setup
Linkerd version
student@master:~$ linkerd version
Client version: stable-2.11.1
Server version: stable-2.11.1
student@master:~$
nginx ingress controller version
student@master:~$ helm list
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
myingress default 1 2022-09-28 02:09:35.031108611 +0000 UTC deployed ingress-nginx-4.2.5 1.3.1
student@master:~$
The service list
student@master:~$ k get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 7d4h
myingress-ingress-nginx-controller LoadBalancer 10.106.67.139 <pending> 80:32144/TCP,443:32610/TCP 62m
myingress-ingress-nginx-controller-admission ClusterIP 10.107.109.117 <none> 443/TCP 62m
nginx ClusterIP 10.105.88.244 <none> 443/TCP 3h42m
registry ClusterIP 10.110.129.139 <none> 5000/TCP 3h42m
secondapp LoadBalancer 10.105.64.242 <pending> 80:32000/TCP 111m
student@master:~$
Verifying that the ingress controller is known to linkerd
student@master:~$ k get ds myingress-ingress-nginx-controller -o json | jq .spec.template.metadata.annotations
{
  "linkerd.io/inject": "ingress"
}
student@master:~$
The secondapp pod
apiVersion: v1
kind: Pod
metadata:
  name: secondapp
  labels:
    example: second
spec:
  containers:
  - name: webserver
    image: httpd
  - name: busy
    image: busybox
    command:
    - sleep
    - "3600"
The secondapp service
student@master:~$ k get svc secondapp -o yaml
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2022-09-28T01:21:00Z"
  name: secondapp
  namespace: default
  resourceVersion: "433221"
  uid: 9266f000-5582-4796-ba73-02375f56ce2b
spec:
  allocateLoadBalancerNodePorts: true
  clusterIP: 10.105.64.242
  clusterIPs:
  - 10.105.64.242
  externalTrafficPolicy: Cluster
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - nodePort: 32000
    port: 80
    protocol: TCP
    targetPort: 80
  selector:
    example: second
  sessionAffinity: None
  type: LoadBalancer
status:
  loadBalancer: {}
student@master:~$
The ingress object
student@master:~$ k get ingress
NAME CLASS HOSTS ADDRESS PORTS AGE
ingress-test <none> www.example.com 80 65m
student@master:~$ k get ingress ingress-test -o yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
  creationTimestamp: "2022-09-28T02:20:03Z"
  generation: 1
  name: ingress-test
  namespace: default
  resourceVersion: "438934"
  uid: 1952a816-a3f3-42a4-b842-deb56053b168
spec:
  rules:
  - host: www.example.com
    http:
      paths:
      - backend:
          service:
            name: secondapp
            port:
              number: 80
        path: /
        pathType: ImplementationSpecific
status:
  loadBalancer: {}
student@master:~$
Testing
secondapp
student@master:~$ curl "$(curl ifconfig.io):$(k get svc secondapp '--template={{(index .spec.ports 0).nodePort}}')"
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 15 100 15 0 0 340 0 --:--:-- --:--:-- --:--:-- 348
<html><body><h1>It works!</h1></body></html>
student@master:~$
through the ingress controller
student@master:~$ url="$(curl ifconfig.io):$(k get svc myingress-ingress-nginx-controller '--template={{(index .spec.ports 0).nodePort}}')"
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 15 100 15 0 0 319 0 --:--:-- --:--:-- --:--:-- 319
student@master:~$ curl -H "Host: www.example.com" $url
<html><body><h1>It works!</h1></body></html>
student@master:~$
And without the Host header:
student@master:~$ curl $url
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>nginx</center>
</body>
</html>
student@master:~$
And finally the linkerd dashboard Top snapshot:
Where are the GET / requests?
EDIT 1
So on the linkerd slack someone suggested having a look at https://linkerd.io/2.12/tasks/using-ingress/#nginx, and that made me examine my pods more carefully. It turns out one of the nginx-ingress pods could not start, and it is clearly due to the linkerd injection. Please observe:
Before linkerd
student@master:~$ k get pod
NAME READY STATUS RESTARTS AGE
myingress-ingress-nginx-controller-gbmbg 1/1 Running 0 19m
myingress-ingress-nginx-controller-qtdhw 1/1 Running 0 3m6s
secondapp 2/2 Running 4 (13m ago) 12h
student@master:~$
After linkerd
student@master:~$ k get ds myingress-ingress-nginx-controller -o yaml | linkerd inject --ingress - | k apply -f -
daemonset "myingress-ingress-nginx-controller" injected
daemonset.apps/myingress-ingress-nginx-controller configured
student@master:~$
And checking the pods:
student@master:~$ k get pod
NAME READY STATUS RESTARTS AGE
myingress-ingress-nginx-controller-gbmbg 1/1 Running 0 40m
myingress-ingress-nginx-controller-xhj5m 1/2 Running 8 (5m59s ago) 17m
secondapp 2/2 Running 4 (34m ago) 12h
student@master:~$
student@master:~$ k describe pod myingress-ingress-nginx-controller-xhj5m |tail
Normal Created 19m kubelet Created container linkerd-proxy
Normal Started 19m kubelet Started container linkerd-proxy
Normal Pulled 18m (x2 over 19m) kubelet Container image "registry.k8s.io/ingress-nginx/controller:v1.3.1@sha256:54f7fe2c6c5a9db9a0ebf1131797109bb7a4d91f56b9b362bde2abd237dd1974" already present on machine
Normal Created 18m (x2 over 19m) kubelet Created container controller
Normal Started 18m (x2 over 19m) kubelet Started container controller
Warning FailedPreStopHook 18m kubelet Exec lifecycle hook ([/wait-shutdown]) for Container "controller" in Pod "myingress-ingress-nginx-controller-xhj5m_default(93dd0189-091f-4c56-a197-33991932d66d)" failed - error: command '/wait-shutdown' exited with 137: , message: ""
Warning Unhealthy 18m (x6 over 19m) kubelet Readiness probe failed: HTTP probe failed with statuscode: 502
Normal Killing 18m kubelet Container controller failed liveness probe, will be restarted
Warning Unhealthy 14m (x30 over 19m) kubelet Liveness probe failed: HTTP probe failed with statuscode: 502
Warning BackOff 4m29s (x41 over 14m) kubelet Back-off restarting failed container
student@master:~$
I will process the link I was given on the linkerd slack and update this post with any new findings.
The solution was provided by the Axenow user on the linkerd2 slack forum. The problem is that ingress-nginx cannot share the namespace with the services it provides the ingress functionality to. In my case all of them were in the default namespace.
To quote Axenow:
When you deploy nginx, by default it sends traffic to the pod directly.
To fix it you have to apply this configuration:
https://linkerd.io/2.12/tasks/using-ingress/#nginx
To elaborate, one has to update the values.yaml file of the downloaded ingress-nginx helm chart to make sure the following is true:
controller:
  replicaCount: 2
  service:
    externalTrafficPolicy: Cluster
  podAnnotations:
    linkerd.io/inject: enabled
And install the controller in a dedicated namespace:
helm upgrade --install --create-namespace --namespace ingress-nginx -f values.yaml ingress-nginx ingress-nginx/ingress-nginx
(Having uninstalled the previous installation, of course)
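To confirm the injection took effect after the reinstall, each controller pod should now carry the linkerd-proxy sidecar (a hedged check, assuming the ingress-nginx namespace from the command above and the chart's standard app.kubernetes.io/name label):
kubectl get pods -n ingress-nginx
kubectl get pod -n ingress-nginx -l app.kubernetes.io/name=ingress-nginx -o jsonpath='{.items[*].spec.containers[*].name}'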

Metrics server is currently unable to handle the request

I am new to Kubernetes and was trying to apply horizontal pod autoscaling to my existing application. After following other Stack Overflow answers, I learned that I need to install metrics-server, which I did, but somehow it's not working and is unable to handle requests.
I then followed a few more suggestions but was unable to resolve the issue. I would really appreciate any help here.
Please let me know if you need any further details from me :) Thanks in advance.
Steps followed:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
serviceaccount/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
service/metrics-server created
deployment.apps/metrics-server created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
kubectl get deploy,svc -n kube-system | egrep metrics-server
deployment.apps/metrics-server 1/1 1 1 2m6s
service/metrics-server ClusterIP 10.32.0.32 <none> 443/TCP 2m6s
kubectl get pods -n kube-system | grep metrics-server
metrics-server-64cf6869bd-6gx88 1/1 Running 0 2m39s
vi ana_hpa.yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: ana-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: common-services-auth
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 160
k apply -f ana_hpa.yaml
horizontalpodautoscaler.autoscaling/ana-hpa created
k get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
ana-hpa StatefulSet/common-services-auth <unknown>/160%, <unknown>/80% 1 10 0 4s
k describe hpa ana-hpa
Name: ana-hpa
Namespace: default
Labels: <none>
Annotations: <none>
CreationTimestamp: Tue, 12 Apr 2022 17:01:25 +0530
Reference: StatefulSet/common-services-auth
Metrics: ( current / target )
resource memory on pods (as a percentage of request): <unknown> / 160%
resource cpu on pods (as a percentage of request): <unknown> / 80%
Min replicas: 1
Max replicas: 10
StatefulSet pods: 3 current / 0 desired
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True SucceededGetScale the HPA controller was able to get the target's current scale
ScalingActive False FailedGetResourceMetric the HPA was unable to compute the replica count: failed to get memory utilization: unable to get metrics for resource memory: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedGetResourceMetric 38s (x8 over 2m23s) horizontal-pod-autoscaler failed to get cpu utilization: unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)
Warning FailedComputeMetricsReplicas 38s (x8 over 2m23s) horizontal-pod-autoscaler invalid metrics (2 invalid out of 2), first error is: failed to get memory utilization: unable to get metrics for resource memory: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)
Warning FailedGetResourceMetric 23s (x9 over 2m23s) horizontal-pod-autoscaler failed to get memory utilization: unable to get metrics for resource memory: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)
kubectl get --raw /apis/metrics.k8s.io/v1beta1
Error from server (ServiceUnavailable): the server is currently unable to handle the request
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"
Error from server (ServiceUnavailable): the server is currently unable to handle the request
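When the raw calls fail like this, the APIService status usually names the exact cause (FailedDiscoveryCheck, MissingEndpoints, and so on), so it is worth checking before editing anything:
kubectl describe apiservice v1beta1.metrics.k8s.io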
kubectl edit deployments.apps -n kube-system metrics-server
Add hostNetwork: true
deployment.apps/metrics-server edited
kubectl get pods -n kube-system | grep metrics-server
metrics-server-5dc6dbdb8-42hw9 1/1 Running 0 10m
k describe pod metrics-server-5dc6dbdb8-42hw9 -n kube-system
Name: metrics-server-5dc6dbdb8-42hw9
Namespace: kube-system
Priority: 2000000000
Priority Class Name: system-cluster-critical
Node: pusntyn196.apac.avaya.com/10.133.85.196
Start Time: Tue, 12 Apr 2022 17:08:25 +0530
Labels: k8s-app=metrics-server
pod-template-hash=5dc6dbdb8
Annotations: <none>
Status: Running
IP: 10.133.85.196
IPs:
IP: 10.133.85.196
Controlled By: ReplicaSet/metrics-server-5dc6dbdb8
Containers:
metrics-server:
Container ID: containerd://024afb1998dce4c0bd5f4e58f996068ea37982bd501b54fda2ef8d5c1098b4f4
Image: k8s.gcr.io/metrics-server/metrics-server:v0.6.1
Image ID: k8s.gcr.io/metrics-server/metrics-server@sha256:5ddc6458eb95f5c70bd13fdab90cbd7d6ad1066e5b528ad1dcb28b76c5fb2f00
Port: 4443/TCP
Host Port: 4443/TCP
Args:
--cert-dir=/tmp
--secure-port=4443
--kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
--kubelet-use-node-status-port
--metric-resolution=15s
State: Running
Started: Tue, 12 Apr 2022 17:08:26 +0530
Ready: True
Restart Count: 0
Requests:
cpu: 100m
memory: 200Mi
Liveness: http-get https://:https/livez delay=0s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get https://:https/readyz delay=20s timeout=1s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/tmp from tmp-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-g6p4g (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
tmp-dir:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
kube-api-access-g6p4g:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 2s
node.kubernetes.io/unreachable:NoExecute op=Exists for 2s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m31s default-scheduler Successfully assigned kube-system/metrics-server-5dc6dbdb8-42hw9 to pusntyn196.apac.avaya.com
Normal Pulled 2m32s kubelet Container image "k8s.gcr.io/metrics-server/metrics-server:v0.6.1" already present on machine
Normal Created 2m31s kubelet Created container metrics-server
Normal Started 2m31s kubelet Started container metrics-server
kubectl get --raw /apis/metrics.k8s.io/v1beta1
Error from server (ServiceUnavailable): the server is currently unable to handle the request
kubectl get pods -n kube-system | grep metrics-server
metrics-server-5dc6dbdb8-42hw9 1/1 Running 0 10m
kubectl logs -f metrics-server-5dc6dbdb8-42hw9 -n kube-system
E0412 11:43:54.684784 1 configmap_cafile_content.go:242] kube-system/extension-apiserver-authentication failed with : missing content for CA bundle "client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
E0412 11:44:27.001010 1 configmap_cafile_content.go:242] key failed with : missing content for CA bundle "client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
k logs -f metrics-server-5dc6dbdb8-42hw9 -n kube-system
I0412 11:38:26.447305 1 serving.go:342] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
I0412 11:38:26.899459 1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0412 11:38:26.899477 1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I0412 11:38:26.899518 1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I0412 11:38:26.899545 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0412 11:38:26.899546 1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I0412 11:38:26.899567 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0412 11:38:26.900480 1 dynamic_serving_content.go:131] "Starting controller" name="serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key"
I0412 11:38:26.900811 1 secure_serving.go:266] Serving securely on [::]:4443
I0412 11:38:26.900854 1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
W0412 11:38:26.900965 1 shared_informer.go:372] The sharedIndexInformer has started, run more than once is not allowed
I0412 11:38:26.999960 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0412 11:38:26.999989 1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController
I0412 11:38:26.999970 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
E0412 11:38:27.000087 1 configmap_cafile_content.go:242] kube-system/extension-apiserver-authentication failed with : missing content for CA bundle "client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
E0412 11:38:27.000118 1 configmap_cafile_content.go:242] key failed with : missing content for CA bundle "client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
kubectl top nodes
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)
kubectl top pods
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods.metrics.k8s.io)
Edit the metrics-server deployment YAML
Add - --kubelet-insecure-tls
k apply -f metric-server-deployment.yaml
serviceaccount/metrics-server unchanged
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader unchanged
clusterrole.rbac.authorization.k8s.io/system:metrics-server unchanged
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader unchanged
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator unchanged
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server unchanged
service/metrics-server unchanged
deployment.apps/metrics-server configured
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io unchanged
kubectl get pods -n kube-system | grep metrics-server
metrics-server-5dc6dbdb8-42hw9 1/1 Running 0 10m
kubectl top pods
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods.metrics.k8s.io)
I also tried adding the following to the metrics-server deployment:
command:
- /metrics-server
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP
This can easily be resolved by editing the deployment YAML and adding hostNetwork: true after dnsPolicy: ClusterFirst:
kubectl edit deployments.apps -n kube-system metrics-server
insert:
hostNetwork: true
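For orientation, after the edit the pod template should look roughly like this (a sketch; unrelated fields omitted):
spec:
  template:
    spec:
      dnsPolicy: ClusterFirst
      hostNetwork: true
      containers:
      - name: metrics-server
        ...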
I hope this helps somebody with a bare-metal cluster:
$ helm --repo https://kubernetes-sigs.github.io/metrics-server/ --kubeconfig=$HOME/.kube/loc-cluster.config -n kube-system --set args='{--kubelet-insecure-tls}' upgrade --install metrics-server metrics-server
$ helm --kubeconfig=$HOME/.kube/loc-cluster.config -n kube-system uninstall metrics-server
Update: I deployed the metrics-server using the same command. Perhaps you can start fresh by removing existing resources and running:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
=======================================================================
It appears the --kubelet-insecure-tls flag was not configured correctly for the pod template in the deployment. The following should fix this:
Edit the existing deployment in the cluster with kubectl edit deployment/metrics-server -nkube-system.
Add the flag to the spec.containers[].args list, so that the deployment looks like this:
...
spec:
  containers:
  - args:
    - --cert-dir=/tmp
    - --secure-port=4443
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --kubelet-use-node-status-port
    - --metric-resolution=15s
    - --kubelet-insecure-tls    # <======= ADD IT HERE
    image: k8s.gcr.io/metrics-server/metrics-server:v0.6.1
...
Simply save your changes and let the deployment roll out the updated pods. You can use watch -n1 kubectl get deployment/metrics-server -n kube-system and wait for the UP-TO-DATE column to show 1.
Like this:
NAME READY UP-TO-DATE AVAILABLE AGE
metrics-server 1/1 1 1 16m
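An equivalent check that blocks until the rollout finishes:
kubectl rollout status deployment/metrics-server -n kube-system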
Verify with kubectl top nodes. It will show something like
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
docker-desktop 222m 5% 1600Mi 41%
I've just verified this to work on a local setup. Let me know if this helps :)
Please configure the aggregation layer correctly and carefully; you can use this link for help: https://kubernetes.io/docs/tasks/extend-kubernetes/configure-aggregation-layer/.
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: <name of the registration object>
spec:
  group: <API group name this extension apiserver hosts>
  version: <API version this extension apiserver hosts>
  groupPriorityMinimum: <priority this APIService for this group, see API documentation>
  versionPriority: <prioritizes ordering of this version within a group, see API documentation>
  service:
    namespace: <namespace of the extension apiserver service>
    name: <name of the extension apiserver service>
  caBundle: <pem encoded ca cert that signs the server cert used by the webhook>
It would be helpful to provide the kubectl version return value.
For me, on EKS with helmfile, I had to set the following in the values.yaml of the metrics-server chart:
containerPort: 10250
The value was forced to 4443 by default, for a reason unknown to me, when I first deployed the chart.
See doc:
https://github.com/kubernetes-sigs/metrics-server/blob/master/charts/metrics-server/values.yaml#L62
https://aws.amazon.com/premiumsupport/knowledge-center/eks-metrics-server/#:~:text=confirm%20that%20your%20security%20groups
Then kubectl top nodes and kubectl describe apiservice v1beta1.metrics.k8s.io were working.
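For reference, the same override without helmfile would look something like this (a sketch, assuming the upstream chart repo is added first):
helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server/
helm upgrade --install metrics-server metrics-server/metrics-server -n kube-system --set containerPort=10250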
First of all, execute the following command:
kubectl get apiservices
And check the availability (status) of the kube-system/metrics-server service.
In case the availability is True:
Add hostNetwork: true to the spec of your metrics-server deployment by executing the following command:
kubectl edit deployment -n kube-system metrics-server
It should look like the following:
...
spec:
  hostNetwork: true
...
Setting hostNetwork to true means that the Pod will use the network namespace of the host it's running on.
In case the availability is False (MissingEndpoints):
Download metrics-server:
wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.5.0/components.yaml
Remove (legacy) metrics server:
kubectl delete -f components.yaml
Edit the downloaded file and add - --kubelet-insecure-tls to the args list:
...
  labels:
    k8s-app: metrics-server
spec:
  containers:
  - args:
    - --cert-dir=/tmp
    - --secure-port=443
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --kubelet-use-node-status-port
    - --metric-resolution=15s
    - --kubelet-insecure-tls    # add this line
...
Create the resources once again:
kubectl apply -f components.yaml

Enable use of images from the local library on Kubernetes

I'm following the tutorial https://docs.openfaas.com/tutorials/first-python-function/.
Currently, I have the right image:
$ docker images | grep hello-openfaas
wm/hello-openfaas latest bd08d01ce09b 34 minutes ago 65.2MB
$ faas-cli deploy -f ./hello-openfaas.yml
Deploying: hello-openfaas.
WARNING! You are not using an encrypted connection to the gateway, consider using HTTPS.
Deployed. 202 Accepted.
URL: http://IP:8099/function/hello-openfaas
There is a step that forewarns me to do some setup (in my case I'm using Kubernetes and minikube and don't want to push to a remote container registry, so I should enable the use of images from the local library on Kubernetes). I see the hint:
see the helm chart for how to set the ImagePullPolicy
I'm not sure how to configure it correctly; the final result indicates I failed.
Unsurprisingly, I couldn't access the function service. I found some clues in https://docs.openfaas.com/deployment/troubleshooting/#openfaas-didnt-start which might help to diagnose the problem.
$ kubectl logs -n openfaas-fn deploy/hello-openfaas
Error from server (BadRequest): container "hello-openfaas" in pod "hello-openfaas-558f99477f-wd697" is waiting to start: trying and failing to pull image
$ kubectl describe -n openfaas-fn deploy/hello-openfaas
Name: hello-openfaas
Namespace: openfaas-fn
CreationTimestamp: Wed, 16 Mar 2022 14:59:49 +0800
Labels: faas_function=hello-openfaas
Annotations: deployment.kubernetes.io/revision: 1
prometheus.io.scrape: false
Selector: faas_function=hello-openfaas
Replicas: 1 desired | 1 updated | 1 total | 0 available | 1 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 0 max unavailable, 1 max surge
Pod Template:
Labels: faas_function=hello-openfaas
Annotations: prometheus.io.scrape: false
Containers:
hello-openfaas:
Image: wm/hello-openfaas:latest
Port: 8080/TCP
Host Port: 0/TCP
Liveness: http-get http://:8080/_/health delay=2s timeout=1s period=2s #success=1 #failure=3
Readiness: http-get http://:8080/_/health delay=2s timeout=1s period=2s #success=1 #failure=3
Environment:
fprocess: python3 index.py
Mounts: <none>
Volumes: <none>
Conditions:
Type Status Reason
---- ------ ------
Available False MinimumReplicasUnavailable
Progressing False ProgressDeadlineExceeded
OldReplicaSets: <none>
NewReplicaSet: hello-openfaas-558f99477f (1/1 replicas created)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 29m deployment-controller Scaled up replica set hello-openfaas-558f99477f to 1
hello-openfaas.yml
version: 1.0
provider:
  name: openfaas
  gateway: http://IP:8099
functions:
  hello-openfaas:
    lang: python3
    handler: ./hello-openfaas
    image: wm/hello-openfaas:latest
    imagePullPolicy: Never
I created a new project, hello-openfaas2, to reproduce this error:
$ faas-cli new --lang python3 hello-openfaas2 --prefix="wm"
Folder: hello-openfaas2 created.
# I add `imagePullPolicy: Never` to `hello-openfaas2.yml`
$ faas-cli build -f ./hello-openfaas2.yml
$ faas-cli deploy -f ./hello-openfaas2.yml
Deploying: hello-openfaas2.
WARNING! You are not using an encrypted connection to the gateway, consider using HTTPS.
Deployed. 202 Accepted.
URL: http://192.168.1.3:8099/function/hello-openfaas2
$ kubectl logs -n openfaas-fn deploy/hello-openfaas2
Error from server (BadRequest): container "hello-openfaas2" in pod "hello-openfaas2-7c67488865-7d7vm" is waiting to start: image can't be pulled
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-64897985d-kp7vf 1/1 Running 0 47h
...
openfaas-fn env-6c79f7b946-bzbtm 1/1 Running 0 4h28m
openfaas-fn figlet-54db496f88-957xl 1/1 Running 0 18h
openfaas-fn hello-openfaas-547857b9d6-z277c 0/1 ImagePullBackOff 0 127m
openfaas-fn hello-openfaas-7b6946b4f9-hcvq4 0/1 ImagePullBackOff 0 165m
openfaas-fn hello-openfaas2-7c67488865-qmrkl 0/1 ImagePullBackOff 0 13m
openfaas-fn hello-openfaas3-65847b8b67-b94kd 0/1 ImagePullBackOff 0 97m
openfaas-fn hello-python-554b464498-zxcdv 0/1 ErrImagePull 0 3h23m
openfaas-fn hello-python-8698bc68bd-62gh9 0/1 ImagePullBackOff 0 3h25m
From https://docs.openfaas.com/reference/yaml/, I know I put imagePullPolicy in the wrong place; there is no such keyword in its schema.
I also tried eval $(minikube docker-env) and still get the same error.
I have a feeling that faas-cli deploy can be replaced by helm; both mean to run the image (whether from a remote or local registry) in the Kubernetes cluster, so I could use the helm chart to set up the pull policy there. Even though the details are still not clear to me, this discovery inspires me.
So far, after eval $(minikube docker-env)
$ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
wm/hello-openfaas2 0.1 03c21bd96d5e About an hour ago 65.2MB
python 3-alpine 69fba17b9bae 12 days ago 48.6MB
ghcr.io/openfaas/figlet latest ca5eef0de441 2 weeks ago 14.8MB
ghcr.io/openfaas/alpine latest 35f3d4be6bb8 2 weeks ago 14.2MB
ghcr.io/openfaas/faas-netes 0.14.2 524b510505ec 3 weeks ago 77.3MB
k8s.gcr.io/kube-apiserver v1.23.3 f40be0088a83 7 weeks ago 135MB
k8s.gcr.io/kube-controller-manager v1.23.3 b07520cd7ab7 7 weeks ago 125MB
k8s.gcr.io/kube-scheduler v1.23.3 99a3486be4f2 7 weeks ago 53.5MB
k8s.gcr.io/kube-proxy v1.23.3 9b7cc9982109 7 weeks ago 112MB
ghcr.io/openfaas/gateway 0.21.3 ab4851262cd1 7 weeks ago 30.6MB
ghcr.io/openfaas/basic-auth 0.21.3 16e7168a17a3 7 weeks ago 14.3MB
k8s.gcr.io/etcd 3.5.1-0 25f8c7f3da61 4 months ago 293MB
ghcr.io/openfaas/classic-watchdog 0.2.0 6f97aa96da81 4 months ago 8.18MB
k8s.gcr.io/coredns/coredns v1.8.6 a4ca41631cc7 5 months ago 46.8MB
k8s.gcr.io/pause 3.6 6270bb605e12 6 months ago 683kB
ghcr.io/openfaas/queue-worker 0.12.2 56e7216201bc 7 months ago 7.97MB
kubernetesui/dashboard v2.3.1 e1482a24335a 9 months ago 220MB
kubernetesui/metrics-scraper v1.0.7 7801cfc6d5c0 9 months ago 34.4MB
nats-streaming 0.22.0 12f2d32e0c9a 9 months ago 19.8MB
gcr.io/k8s-minikube/storage-provisioner v5 6e38f40d628d 11 months ago 31.5MB
functions/markdown-render latest 93b5da182216 2 years ago 24.6MB
functions/hubstats latest 01affa91e9e4 2 years ago 29.3MB
functions/nodeinfo latest 2fe8a87bf79c 2 years ago 71.4MB
functions/alpine latest 46c6f6d74471 2 years ago 21.5MB
prom/prometheus v2.11.0 b97ed892eb23 2 years ago 126MB
prom/alertmanager v0.18.0 ce3c87f17369 2 years ago 51.9MB
alexellis2/openfaas-colorization 0.4.1 d36b67b1b5c1 2 years ago 1.84GB
rorpage/text-to-speech latest 5dc20810eb54 2 years ago 86.9MB
stefanprodan/faas-grafana 4.6.3 2a4bd9caea50 4 years ago 284MB
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-64897985d-kp7vf 1/1 Running 0 6d
kube-system etcd-minikube 1/1 Running 0 6d
kube-system kube-apiserver-minikube 1/1 Running 0 6d
kube-system kube-controller-manager-minikube 1/1 Running 0 6d
kube-system kube-proxy-5m8lr 1/1 Running 0 6d
kube-system kube-scheduler-minikube 1/1 Running 0 6d
kube-system storage-provisioner 1/1 Running 1 (6d ago) 6d
kubernetes-dashboard dashboard-metrics-scraper-58549894f-97tsv 1/1 Running 0 5d7h
kubernetes-dashboard kubernetes-dashboard-ccd587f44-lkwcx 1/1 Running 0 5d7h
openfaas-fn base64-6bdbcdb64c-djz8f 1/1 Running 0 5d1h
openfaas-fn colorise-85c74c686b-2fz66 1/1 Running 0 4d5h
openfaas-fn echoit-5d7df6684c-k6ljn 1/1 Running 0 5d1h
openfaas-fn env-6c79f7b946-bzbtm 1/1 Running 0 4d5h
openfaas-fn figlet-54db496f88-957xl 1/1 Running 0 4d19h
openfaas-fn hello-openfaas-547857b9d6-z277c 0/1 ImagePullBackOff 0 4d3h
openfaas-fn hello-openfaas-7b6946b4f9-hcvq4 0/1 ImagePullBackOff 0 4d3h
openfaas-fn hello-openfaas2-5c6f6cb5d9-24hkz 0/1 ImagePullBackOff 0 9m22s
openfaas-fn hello-openfaas2-8957bb47b-7cgjg 0/1 ImagePullBackOff 0 2d22h
openfaas-fn hello-openfaas3-65847b8b67-b94kd 0/1 ImagePullBackOff 0 4d2h
openfaas-fn hello-python-6d6976845f-cwsln 0/1 ImagePullBackOff 0 3d19h
openfaas-fn hello-python-b577cb8dc-64wf5 0/1 ImagePullBackOff 0 3d9h
openfaas-fn hubstats-b6cd4dccc-z8tvl 1/1 Running 0 5d1h
openfaas-fn markdown-68f69f47c8-w5m47 1/1 Running 0 5d1h
openfaas-fn nodeinfo-d48cbbfcc-hfj79 1/1 Running 0 5d1h
openfaas-fn openfaas2-fun 1/1 Running 0 15s
openfaas-fn text-to-speech-74ffcdfd7-997t4 0/1 CrashLoopBackOff 2235 (3s ago) 4d5h
openfaas-fn wordcount-6489865566-cvfzr 1/1 Running 0 5d1h
openfaas alertmanager-88449c789-fq2rg 1/1 Running 0 3d1h
openfaas basic-auth-plugin-75fd7d69c5-zw4jh 1/1 Running 0 3d2h
openfaas gateway-5c4bb7c5d7-n8h27 2/2 Running 0 3d2h
openfaas grafana 1/1 Running 0 4d8h
openfaas nats-647b476664-hkr7p 1/1 Running 0 3d2h
openfaas prometheus-687648749f-tl8jp 1/1 Running 0 3d1h
openfaas queue-worker-7777ffd7f6-htx6t 1/1 Running 0 3d2h
$ kubectl get -o yaml -n openfaas-fn deploy/hello-openfaas2
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "6"
    prometheus.io.scrape: "false"
  creationTimestamp: "2022-03-17T12:47:35Z"
  generation: 6
  labels:
    faas_function: hello-openfaas2
  name: hello-openfaas2
  namespace: openfaas-fn
  resourceVersion: "400833"
  uid: 9c4e9d26-23af-4f93-8538-4e2d96f0d7e0
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      faas_function: hello-openfaas2
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      annotations:
        prometheus.io.scrape: "false"
      creationTimestamp: null
      labels:
        faas_function: hello-openfaas2
        uid: "969512830"
      name: hello-openfaas2
    spec:
      containers:
      - env:
        - name: fprocess
          value: python3 index.py
        image: wm/hello-openfaas2:0.1
        imagePullPolicy: Always
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /_/health
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 2
          periodSeconds: 2
          successThreshold: 1
          timeoutSeconds: 1
        name: hello-openfaas2
        ports:
        - containerPort: 8080
          name: http
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /_/health
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 2
          periodSeconds: 2
          successThreshold: 1
          timeoutSeconds: 1
        resources: {}
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: false
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      enableServiceLinks: false
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  conditions:
  - lastTransitionTime: "2022-03-17T12:47:35Z"
    lastUpdateTime: "2022-03-17T12:47:35Z"
    message: Deployment does not have minimum availability.
    reason: MinimumReplicasUnavailable
    status: "False"
    type: Available
  - lastTransitionTime: "2022-03-20T12:16:56Z"
    lastUpdateTime: "2022-03-20T12:16:56Z"
    message: ReplicaSet "hello-openfaas2-5d6c7c7fb4" has timed out progressing.
    reason: ProgressDeadlineExceeded
    status: "False"
    type: Progressing
  observedGeneration: 6
  replicas: 2
  unavailableReplicas: 2
  updatedReplicas: 1
In one shell:
docker@minikube:~$ docker run --name wm -ti wm/hello-openfaas2:0.1
2022/03/20 13:04:52 Version: 0.2.0 SHA: 56bf6aac54deb3863a690f5fc03a2a38e7d9e6ef
2022/03/20 13:04:52 Timeouts: read: 5s write: 5s hard: 0s health: 5s.
2022/03/20 13:04:52 Listening on port: 8080
...
and in another shell:
docker@minikube:~$ docker ps | grep wm
d7796286641c wm/hello-openfaas2:0.1 "fwatchdog" 3 minutes ago Up 3 minutes (healthy) 8080/tcp wm
When you specify an image without a registry URL, the pull defaults to DockerHub. And when you use the :latest tag without an explicit pull policy, Kubernetes defaults imagePullPolicy to Always and will try to pull the image every time.
So to use locally built images, don't use the latest tag.
To make minikube use images from your local machine, you need to do a few things:
Point your docker client to the VM's docker daemon: eval $(minikube docker-env)
Configure the image pull policy: imagePullPolicy: Never
There is a flag to use insecure registries in the minikube VM. This must be specified when you create the machine: minikube start --insecure-registry
Note you have to run eval $(minikube docker-env) in each terminal you want to use, since it only sets the environment variables for the current shell session.
This flow works:
# Start minikube and set docker env
minikube start
eval $(minikube docker-env)
# Build image
docker build -t foo:1.0 .
# Run in minikube
kubectl run hello-foo --image=foo:1.0 --image-pull-policy=Never
You can read more at the minikube docs.
If your image has a latest tag, the Pod's ImagePullPolicy will be automatically set to Always. Each time the pod is created, Kubernetes tries to pull the newest image.
Try not tagging the image as latest or manually setting the Pod's ImagePullPolicy to Never.
If you're using a static manifest to create a Pod, the setting will look like the following:
  containers:
  - name: test-container
    image: testImage:latest
    imagePullPolicy: Never
From the comments in the initial post, I gathered that:
The issue is that the container runtime of your Minikube cluster is distinct from that of your host, where you built your function image (not always the case: minikube can run with the docker driver, which, I think, implies the host docker runtime is shared with the cluster).
The container runtime in use by Minikube is docker (it could have been cri-o; the following steps won't apply to that case. Those using cri-o may switch to docker, as I'm not sure image loading is possible with cri-o).
You can try to build your function image from a shell inside your Minikube instance.
Or you can:
export your image ( docker save -o image.tar my/image )
copy this to your minikube instance ( scp -i ~/.minikube/machines/minikube/id_rsa image.tar docker@$(minikube ip): )
open a shell ( ssh -i ~/.minikube/machines/minikube/id_rsa docker@$(minikube ip) )
load that image ( docker load -i image.tar )
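The same transfer can be done in one pipeline (a sketch, assuming the default minikube SSH key and the docker runtime inside the VM):
docker save my/image | ssh -i ~/.minikube/machines/minikube/id_rsa docker@$(minikube ip) docker load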
Then, make sure your openfaas was deployed with faasnetes.imagePullPolicy=Never or IfNotPresent, as I doubt setting the imagePullPolicy directly in your function would do (I haven't read about this in their docs, which instead mention, as you pointed out, overriding this during the openfaas deployment). Checking your deployment YAML definition ( kubectl get -o yaml -n openfaas-fn deploy/hello-openfaas ) should confirm you're not using Always: if that's already the case, no need to dig further: just make sure your image is imported, with name and tag matching those referenced by your function.
... Answering your last comment: you're not sure how openfaas was deployed. One way to make sure the proper option was set would be to look at the gateway deployment, in openfaas namespace ( kubectl get -o yaml -n openfaas deploy/gateway ).
In there, you should find a container named "operator". That container should include a few environment variables, one of which may be image_pull_policy. (we can see this looking at the Chart sources ). You want that environment variable to be set to IfNotPresent, add it or edit it if needed.
Checking your last edit, we can see the Deployment object created by your function says:
image: wm/hello-openfaas2:0.1
imagePullPolicy: Always
So for sure: you do need to reconfigure openfaas, adding that image_pull_policy environment variable.
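A hedged sketch of that reconfiguration, using the chart value mentioned above and keeping the rest of the release's values intact:
helm upgrade openfaas openfaas/openfaas --namespace openfaas --reuse-values --set faasnetes.imagePullPolicy=IfNotPresent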

kubernetes metrics server doesn't start

I am trying to connect to the Kubernetes dashboard.
I have the latest version of Kubernetes, v1.12, installed with kubeadm on a server.
I downloaded metrics-server from GitHub and ran:
kubectl create -f deploy/1.8+
but I get this:
kube-system metrics-server-5cbbc84f8c-tjfxd 0/1 Pending 0 12m
and there is no log to debug:
error: the server doesn't have a resource type "logs"
I don't want to install Heapster because it is DEPRECATED.
UPDATE
Hello, and thanks.
When I run the taint command I get:
error: at least one taint update is required
and for the command
kubectl describe deployment metrics-server -n kube-system
I get this output:
Name: metrics-server
Namespace: kube-system
CreationTimestamp: Thu, 18 Oct 2018 14:34:42 +0000
Labels: k8s-app=metrics-server
Annotations: deployment.kubernetes.io/revision: 1
kubectl.kubernetes.io/last-applied-configuration: {"apiVersion":"extensions/v1beta1","kind":"Deployment","metadata": {"annotations":{},"labels":{"k8s-app":"metrics-server"},"name":"metrics-...
Selector: k8s-app=metrics-server
Replicas: 1 desired | 1 updated | 1 total | 0 available | 1 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 1 max unavailable, 1 max surge
Pod Template:
Labels: k8s-app=metrics-server
Service Account: metrics-server
Containers:
metrics-server:
Image: k8s.gcr.io/metrics-server-amd64:v0.3.1
Port: <none>
Host Port: <none>
Environment: <none>
Mounts:
/tmp from tmp-dir (rw)
Volumes:
tmp-dir:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
Conditions:
Type Status Reason
---- ------ ------
Available True MinimumReplicasAvailable
OldReplicaSets: <none>
NewReplicaSet: metrics-server-5cbbc84f8c (1/1 replicas created)
Events: <none>
The command
kubectl get nodes
outputs just the IP of the node, and nothing special.
Any ideas on what to do to get the Kubernetes dashboard working?
I suppose you are trying to set up metrics-server on your master node.
If you issue kubectl describe deployment metrics-server -n kube-system, I believe you will see something like this:
Name: metrics-server
Namespace: kube-system
CreationTimestamp: Thu, 18 Oct 2018 15:57:34 +0000
Labels: k8s-app=metrics-server
Annotations: deployment.kubernetes.io/revision: 1
Selector: k8s-app=metrics-server
Replicas: 1 desired | 1 updated | 1 total | 0 available | 1 unavailable
But if you describe your node, you will see the taint that prevents you from scheduling new pods on the master node:
kubectl get nodes
NAME STATUS ROLES AGE VERSION
kube-master-1 Ready master 17m v1.12.1
kubectl describe node kube-master-1
Name: kube-master-1
...
Taints: node-role.kubernetes.io/master:NoSchedule
You have to remove this taint:
kubectl taint node kube-master-1 node-role.kubernetes.io/master:NoSchedule-
node/kube-master-1 untainted
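Should you later join workers and want the taint back, the reverse operation would be (a sketch; the empty value before the effect is intentional):
kubectl taint node kube-master-1 node-role.kubernetes.io/master=:NoSchedule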
Result:
kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-node-xvc77 2/2 Running 0 20m
kube-system coredns-576cbf47c7-rj4wh 1/1 Running 0 21m
kube-system coredns-576cbf47c7-vsjsf 1/1 Running 0 21m
kube-system etcd-kube-master-1 1/1 Running 0 20m
kube-system kube-apiserver-kube-master-1 1/1 Running 0 20m
kube-system kube-controller-manager-kube-master-1 1/1 Running 0 20m
kube-system kube-proxy-xp5zh 1/1 Running 0 21m
kube-system kube-scheduler-kube-master-1 1/1 Running 0 20m
kube-system metrics-server-5cbbc84f8c-l2t76 1/1 Running 0 18m
But this is not the best approach. A better approach is to join a worker node and set up metrics-server there; there won't be any issues, and there is no need to touch the taint on the master node.
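Alternatively, if you must stay on a single node, you could add a toleration for the master taint to the metrics-server Deployment itself instead of removing the taint cluster-wide. A minimal sketch of the pod spec fragment (assuming the stock manifest):
spec:
  template:
    spec:
      tolerations:
      # Allow scheduling onto the kubeadm-tainted master.
      - key: node-role.kubernetes.io/master
        effect: NoSchedule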
Hope it helps.
The above answer by "Vit" is correct: either remove the taint from the existing node group, or create a new node group without any taint.

kubernetes cluster master node not ready

I do not know why my master node is in NotReady status; all pods on the cluster run normally. I am using Kubernetes v1.7.5, the network plugin is Calico, and the OS version is "centos7.2.1511".
# kubectl get nodes
NAME STATUS AGE VERSION
k8s-node1 Ready 1h v1.7.5
k8s-node2 NotReady 1h v1.7.5
# kubectl get all --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system po/calico-node-11kvm 2/2 Running 0 33m
kube-system po/calico-policy-controller-1906845835-1nqjj 1/1 Running 0 33m
kube-system po/calicoctl 1/1 Running 0 33m
kube-system po/etcd-k8s-node2 1/1 Running 1 15m
kube-system po/kube-apiserver-k8s-node2 1/1 Running 1 15m
kube-system po/kube-controller-manager-k8s-node2 1/1 Running 2 15m
kube-system po/kube-dns-2425271678-2mh46 3/3 Running 0 1h
kube-system po/kube-proxy-qlmbx 1/1 Running 1 1h
kube-system po/kube-proxy-vwh6l 1/1 Running 0 1h
kube-system po/kube-scheduler-k8s-node2 1/1 Running 2 15m
NAMESPACE NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default svc/kubernetes 10.96.0.1 <none> 443/TCP 1h
kube-system svc/kube-dns 10.96.0.10 <none> 53/UDP,53/TCP 1h
NAMESPACE NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
kube-system deploy/calico-policy-controller 1 1 1 1 33m
kube-system deploy/kube-dns 1 1 1 1 1h
NAMESPACE NAME DESIRED CURRENT READY AGE
kube-system rs/calico-policy-controller-1906845835 1 1 1 33m
kube-system rs/kube-dns-2425271678 1 1 1 1h
UPDATE
It seems the master node cannot recognize the Calico network plugin. I used kubeadm to install the k8s cluster. Because kubeadm starts etcd on 127.0.0.1:2379 on the master node, Calico on the other nodes cannot talk to etcd, so I modified etcd.yaml as follows, and now all Calico pods run fine. I am not very familiar with Calico; how can I fix this properly?
apiVersion: v1
kind: Pod
metadata:
  annotations:
    scheduler.alpha.kubernetes.io/critical-pod: ""
  creationTimestamp: null
  labels:
    component: etcd
    tier: control-plane
  name: etcd
  namespace: kube-system
spec:
  containers:
  - command:
    - etcd
    - --listen-client-urls=http://127.0.0.1:2379,http://10.161.233.80:2379
    - --advertise-client-urls=http://10.161.233.80:2379
    - --data-dir=/var/lib/etcd
    image: gcr.io/google_containers/etcd-amd64:3.0.17
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /health
        port: 2379
        scheme: HTTP
      initialDelaySeconds: 15
      timeoutSeconds: 15
    name: etcd
    resources: {}
    volumeMounts:
    - mountPath: /etc/ssl/certs
      name: certs
    - mountPath: /var/lib/etcd
      name: etcd
    - mountPath: /etc/kubernetes
      name: k8s
      readOnly: true
  hostNetwork: true
  volumes:
  - hostPath:
      path: /etc/ssl/certs
    name: certs
  - hostPath:
      path: /var/lib/etcd
    name: etcd
  - hostPath:
      path: /etc/kubernetes
    name: k8s
status: {}
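For context, the hosted Calico manifest reads the etcd address from a ConfigMap, which is why etcd must listen on an IP reachable from every node. A sketch of the relevant fragment (names assume the v2.x hosted install):
kind: ConfigMap
apiVersion: v1
metadata:
  name: calico-config
  namespace: kube-system
data:
  # Must be reachable from all nodes, not 127.0.0.1 on the master.
  etcd_endpoints: "http://10.161.233.80:2379"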
[root#k8s-node2 calico]# kubectl describe node k8s-node2
Name: k8s-node2
Role:
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/hostname=k8s-node2
node-role.kubernetes.io/master=
Annotations: node.alpha.kubernetes.io/ttl=0
volumes.kubernetes.io/controller-managed-attach-detach=true
Taints: node-role.kubernetes.io/master:NoSchedule
CreationTimestamp: Tue, 12 Sep 2017 15:20:57 +0800
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
OutOfDisk False Wed, 13 Sep 2017 10:25:58 +0800 Tue, 12 Sep 2017 15:20:57 +0800 KubeletHasSufficientDisk kubelet has sufficient disk space available
MemoryPressure False Wed, 13 Sep 2017 10:25:58 +0800 Tue, 12 Sep 2017 15:20:57 +0800 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Wed, 13 Sep 2017 10:25:58 +0800 Tue, 12 Sep 2017 15:20:57 +0800 KubeletHasNoDiskPressure kubelet has no disk pressure
Ready False Wed, 13 Sep 2017 10:25:58 +0800 Tue, 12 Sep 2017 15:20:57 +0800 KubeletNotReady runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Addresses:
InternalIP: 10.161.233.80
Hostname: k8s-node2
Capacity:
cpu: 2
memory: 3618520Ki
pods: 110
Allocatable:
cpu: 2
memory: 3516120Ki
pods: 110
System Info:
Machine ID: 3c6ff97c6fbe4598b53fd04e08937468
System UUID: C6238BF8-8E60-4331-AEEA-6D0BA9106344
Boot ID: 84397607-908f-4ff8-8bdc-ff86c364dd32
Kernel Version: 3.10.0-514.6.2.el7.x86_64
OS Image: CentOS Linux 7 (Core)
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://1.12.6
Kubelet Version: v1.7.5
Kube-Proxy Version: v1.7.5
PodCIDR: 10.68.0.0/24
ExternalID: k8s-node2
Non-terminated Pods: (5 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits
--------- ---- ------------ ---------- --------------- -------------
kube-system etcd-k8s-node2 0 (0%) 0 (0%) 0 (0%) 0 (0%)
kube-system kube-apiserver-k8s-node2 250m (12%) 0 (0%) 0 (0%) 0 (0%)
kube-system kube-controller-manager-k8s-node2 200m (10%) 0 (0%) 0 (0%) 0 (0%)
kube-system kube-proxy-qlmbx 0 (0%) 0 (0%) 0 (0%) 0 (0%)
kube-system kube-scheduler-k8s-node2 100m (5%) 0 (0%) 0 (0%) 0 (0%)
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
CPU Requests CPU Limits Memory Requests Memory Limits
------------ ---------- --------------- -------------
550m (27%) 0 (0%) 0 (0%) 0 (0%)
Events: <none>
It's good practice to run a describe command in order to see what's wrong with your node:
kubectl describe nodes <NODE_NAME>
e.g.: kubectl describe nodes k8s-node2
You should be able to start your investigations from there and add more info to this question if needed.
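For example, to pull out just the Ready condition's message (a sketch using jsonpath):
kubectl get node k8s-node2 -o jsonpath='{.status.conditions[?(@.type=="Ready")].message}'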
You need to install a network policy provider; this is one of the supported providers:
Weave Net for NetworkPolicy.
Command line to install:
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
After a few seconds, a Weave Net pod should be running on each Node and any further pods you create will be automatically attached to the Weave network.
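To verify, check that a Weave pod is running on each node and that the node becomes Ready (a sketch; name=weave-net is the label the Weave DaemonSet applies):
kubectl get pods -n kube-system -l name=weave-net -o wide
kubectl get nodes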
I think you may need to add tolerations and update the annotations for calico-node in the manifest you are using so that it can run on a master created by kubeadm. Kubeadm taints the master so that pods cannot run on it unless they have a toleration for that taint.
I believe you are using the https://docs.projectcalico.org/v2.5/getting-started/kubernetes/installation/hosted/calico.yaml manifest, which has the annotations (that include tolerations) for K8s v1.5. You should check https://docs.projectcalico.org/v2.5/getting-started/kubernetes/installation/hosted/kubeadm/1.6/calico.yaml instead; it has the toleration syntax for K8s v1.6+.
Here is a snippet from the above with the annotations and tolerations:
metadata:
  labels:
    k8s-app: calico-node
  annotations:
    # Mark this pod as a critical add-on; when enabled, the critical add-on scheduler
    # reserves resources for critical add-on pods so that they can be rescheduled after
    # a failure. This annotation works in tandem with the toleration below.
    scheduler.alpha.kubernetes.io/critical-pod: ''
spec:
  hostNetwork: true
  tolerations:
  - key: node-role.kubernetes.io/master
    effect: NoSchedule
  # Allow this pod to be rescheduled while the node is in "critical add-ons only" mode.
  # This, along with the annotation above marks this pod as a critical add-on.
  - key: CriticalAddonsOnly
    operator: Exists
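Applying the kubeadm-flavoured manifest would then be (a sketch; the URL is the one referenced above):
kubectl apply -f https://docs.projectcalico.org/v2.5/getting-started/kubernetes/installation/hosted/kubeadm/1.6/calico.yaml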