Disabling heapster health checks in Kubernetes 1.10 - kubernetes

While running kubectl cluster-info dump, I see a lot of:
2018/10/18 14:47:47 Metric client health check failed: the server could not find the requested resource (get services heapster). Retrying in 30 seconds.
2018/10/18 14:48:17 Metric client health check failed: the server could not find the requested resource (get services heapster). Retrying in 30 seconds.
2018/10/18 14:48:47 Metric client health check failed: the server could not find the requested resource (get services heapster). Retrying in 30 seconds.
2018/10/18 14:49:17 Metric client health check failed: the server could not find the requested resource (get services heapster). Retrying in 30 seconds.
2018/10/18 14:49:47 Metric client health check failed: the server could not find the requested resource (get services heapster). Retrying in 30 seconds.
2018/10/18 14:50:17 Metric client health check failed: the server could not find the requested resource (get services heapster). Retrying in 30 seconds.
2018/10/18 14:50:47 Metric client health check failed: the server could not find the requested resource (get services heapster). Retrying in 30 seconds.
Maybe this is a bug that will be fixed in a new version (heapster is deprecated in newer versions anyway), but is there any way to disable these checks to avoid these noisy messages?

You can find the Heapster deprecation timeline here.
I found that in a Kubernetes 1.10 cluster the kubernetes-dashboard Pod produces this kind of error message:
kubectl --namespace=kube-system logs <kubernetes-dashboard-Pod>
2018/10/22 13:04:36 Metric client health check failed: the server could not find the requested resource (get services heapster). Retrying in 30 seconds.
It seems that kubernetes-dashboard still requires the Heapster service for metrics and graphs.

Related

k3s - Metrics server doesn't work for worker nodes

I deployed a k3s cluster on two Raspberry Pi 4 boards, one as the master and the second as a worker, using the install script k3s provides with the following options:
For the master node (192.168.1.113 is the master node IP):
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC='server --bind-address 192.168.1.113' sh -
For the agent node:
curl -sfL https://get.k3s.io | \
K3S_URL=https://192.168.1.113:6443 \
K3S_TOKEN=<master-token> \
INSTALL_K3S_EXEC='agent' sh -
Everything seems to work, but kubectl top nodes returns the following:
NAME          CPU(cores)   CPU%        MEMORY(bytes)   MEMORY%
k3s-master    137m         3%          1285Mi          33%
k3s-node-01   <unknown>    <unknown>   <unknown>       <unknown>
I also tried to deploy the k8s dashboard, according to what is written in the docs but it fails to work because it can't reach the metrics server and gets a timeout error:
"error trying to reach service: dial tcp 10.42.1.11:8443: i/o timeout"
and I see a lot of errors in the pod logs:
2021/09/17 09:24:06 Metric client health check failed: the server is currently unable to handle the request (get services dashboard-metrics-scraper). Retrying in 30 seconds.
2021/09/17 09:25:06 Metric client health check failed: the server is currently unable to handle the request (get services dashboard-metrics-scraper). Retrying in 30 seconds.
2021/09/17 09:26:06 Metric client health check failed: the server is currently unable to handle the request (get services dashboard-metrics-scraper). Retrying in 30 seconds.
2021/09/17 09:27:06 Metric client health check failed: the server is currently unable to handle the request (get services dashboard-metrics-scraper). Retrying in 30 seconds.
logs from the metrics-server pod:
elet_summary:k3s-node-01: unable to fetch metrics from Kubelet k3s-node-01 (k3s-node-01): Get https://k3s-node-01:10250/stats/summary?only_cpu_and_memory=true: dial tcp 192.168.1.106:10250: connect: no route to host
E0917 14:03:24.767949 1 manager.go:111] unable to fully collect metrics: unable to fully scrape metrics from source kubelet_summary:k3s-node-01: unable to fetch metrics from Kubelet k3s-node-01 (k3s-node-01): Get https://k3s-node-01:10250/stats/summary?only_cpu_and_memory=true: dial tcp 192.168.1.106:10250: connect: no route to host
E0917 14:04:24.767960 1 manager.go:111] unable to fully collect metrics: unable to fully scrape metrics from source kubelet_summary:k3s-node-01: unable to fetch metrics from Kubelet k3s-node-01 (k3s-node-01): Get https://k3s-node-01:10250/stats/summary?only_cpu_and_memory=true: dial tcp 192.168.1.106:10250: connect: no route to host
Moving this out of comments for better visibility.
After creating a small cluster, I wasn't able to reproduce this behaviour: metrics-server worked fine for both nodes, and kubectl top nodes showed metrics for both of them (though it took some time to start collecting the metrics).
That leaves troubleshooting why it doesn't work in your case. Checking the metrics-server logs is the most efficient way to figure this out:
$ kubectl logs metrics-server-58b44df574-2n9dn -n kube-system
The next steps depend on what the logs show. For instance, in the comments above:
First it was no route to host, which points to a network problem: the hostname resolved to 192.168.1.106, but that address could not be reached.
Then it was i/o timeout, which means a route exists but the service did not respond. This can happen when a firewall blocks certain ports or sources, when the kubelet (which listens on port 10250) is not running, or, as it turned out for the OP, when an NTP issue affects certificates and connections.
Errors may be different in other cases. It's important to find the actual error and troubleshoot from there.
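As a rough illustration of the triage described above, here is a minimal Python sketch that maps substrings from metrics-server log lines to a likely cause. The function name classify and the hint wording are hypothetical, and the mapping only covers the errors quoted in these threads; it is not an official diagnosis.

```python
# Substrings seen in the metrics-server logs quoted in these threads,
# paired with a likely cause. The hint text is illustrative only.
HINTS = [
    ("no such host", "DNS: the node hostname could not be resolved"),
    ("no route to host", "network: the node address is unreachable"),
    ("i/o timeout", "timeout: a route exists but nothing answered "
                    "(firewall, kubelet not running, clock or certificate issue)"),
]


def classify(log_line: str) -> str:
    """Return a troubleshooting hint for one log line."""
    for needle, hint in HINTS:
        if needle in log_line:
            return hint
    return "unknown: read the full error and troubleshoot from there"


sample = "dial tcp 192.168.1.106:10250: connect: no route to host"
print(classify(sample))
```

Anything the table does not recognise falls through to the generic hint, which matches the advice to read the actual error first.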

k8s dashboard: Metric client health check failed

I installed the k8s dashboard using the following command:
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.4/aio/deploy/recommended.yaml
Then I watched the logs of the dashboard pod:
$ kubectl -n kubernetes-dashboard logs -f kubernetes-dashboard-665f4c5ff-wcrj9
2020/09/12 04:19:10 Metric client health check failed: an error on the server ("unknown") has prevented the request from succeeding (get services dashboard-metrics-scraper). Retrying in 30 seconds.
2020/09/12 04:19:43 Metric client health check failed: an error on the server ("unknown") has prevented the request from succeeding (get services dashboard-metrics-scraper). Retrying in 30 seconds.
2020/09/12 04:20:17 Metric client health check failed: an error on the server ("unknown") has prevented the request from succeeding (get services dashboard-metrics-scraper). Retrying in 30 seconds.
2020/09/12 04:20:50 Metric client health check failed: an error on the server ("unknown") has prevented the request from succeeding (get services dashboard-metrics-scraper). Retrying in 30 seconds.
2020/09/12 04:21:23 Metric client health check failed: an error on the server ("unknown") has prevented the request from succeeding (get services dashboard-metrics-scraper). Retrying in 30 seconds.
2020/09/12 04:21:56 Metric client health check failed: an error on the server ("unknown") has prevented the request from succeeding (get services dashboard-metrics-scraper). Retrying in 30 seconds.
2020/09/12 04:22:29 Metric client health check failed: an error on the server ("unknown") has prevented the request from succeeding (get services dashboard-metrics-scraper). Retrying in 30 seconds.
kubeadm version: 1.19
kubectl version: 1.19
Can anyone help me?
To give a bit of background information: once you install the Kubernetes Dashboard you install a Pod that provides the Dashboard as well as a Pod that is in charge of scraping Metrics from the Kubernetes Metrics API, the Dashboard Metrics Scraper. The dashboard delegates to the scraper, expecting to address it via its K8s Service: "dashboard-metrics-scraper".
In your case, the dashboard cannot reach this service. Run "kubectl get service -n kubernetes-dashboard" to see whether the scraper service was deleted or renamed. If it was deleted, reapply the Dashboard installation YAML to recreate it.
I was unable to replicate your issue, but here are some steps you can try to debug the problem:
The Metric client health check failed: ... Retrying in 30 seconds error appears only once in the dashboard's source code, when the health check fails.
The health check itself is a proxy request to the API server.
Use the following command to test whether the proxy is working correctly:
$ kubectl get --raw "/api/v1/namespaces/kubernetes-dashboard/services/dashboard-metrics-scraper/proxy/healthz"
It should return: URL: /healthz. If it doesn't, there is most probably something wrong with the dashboard-metrics-scraper service or pod. Make sure that the service exists and the pod is running and ready.
If it works for you from the CLI but still not for kubernetes-dashboard, check kubernetes-dashboard's RBAC permissions. Make sure that kubernetes-dashboard has permission to proxy.
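The proxy check can also be scripted. Below is a minimal Python sketch that builds the same API path and looks for the expected URL: /healthz body. The helper names (healthz_proxy_path, looks_healthy, check_scraper) are hypothetical, and the defaults assume the namespace and service name used by the recommended.yaml install.

```python
import subprocess


def healthz_proxy_path(namespace: str, service: str) -> str:
    """Build the API server proxy path for a service's /healthz endpoint."""
    return f"/api/v1/namespaces/{namespace}/services/{service}/proxy/healthz"


def looks_healthy(body: str) -> bool:
    """The scraper is expected to answer with 'URL: /healthz'."""
    return "URL: /healthz" in body


def check_scraper(namespace: str = "kubernetes-dashboard",
                  service: str = "dashboard-metrics-scraper") -> bool:
    """Run 'kubectl get --raw <path>' and inspect the response body."""
    result = subprocess.run(
        ["kubectl", "get", "--raw", healthz_proxy_path(namespace, service)],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0 and looks_healthy(result.stdout)
```

Calling check_scraper() needs kubectl on the PATH and a working kubeconfig. A False result only says the proxy request failed and does not distinguish a missing service from a pod that is not ready.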
The second error you are seeing:
{"level":"error","msg":"Error scraping node metrics: the server could not find the requested resource (get nodes.metrics.k8s.io)","time":"2020-09-13T02:52:38Z"}
indicates that you don't have a metrics server deployed in your cluster. Check the metrics-server GitHub repository for more information.
I'm on Kubernetes 1.20.1-00 on Ubuntu 20.04. I got the
{"level":"error","msg":"Error scraping node metrics: the server could not find the requested resource (get nodes.metrics.k8s.io)","time":"2020-09-13T02:52:38Z"}
error because I deployed the Kubernetes dashboard with the metrics scraper before deploying the metrics server. After a day of running in that configuration I was still getting the "Error scraping node..." message in my metrics scraper pod logs.
I resolved it by scaling the metrics scraper deployment to 0 (zero) and then scaling it back to the desired number of pods (in my case 3).
The error message in the logs went away immediately once the metrics scraper pods had spun up.
I'm not implying that this is the correct fix; it is just an observation from seeing an identical error. It could be caused simply by deploying the metrics server and the Kubernetes dashboard in the wrong order, as I did.
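For reference, the scale-down and scale-up workaround could look like the following Python sketch. The deployment name dashboard-metrics-scraper and the namespace kubernetes-dashboard are assumptions based on the recommended.yaml install, and restart_by_scaling and run are hypothetical helper names.

```python
import subprocess


def restart_by_scaling(deployment: str, namespace: str, replicas: int):
    """Build two kubectl commands: scale to zero, then back to 'replicas'."""
    base = ["kubectl", "scale", "deployment", deployment, "-n", namespace]
    return [base + ["--replicas=0"], base + [f"--replicas={replicas}"]]


def run(commands) -> None:
    """Execute the commands in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


# Assumed names; adjust them to match your cluster before calling run().
commands = restart_by_scaling("dashboard-metrics-scraper", "kubernetes-dashboard", 3)
for cmd in commands:
    print(" ".join(cmd))
```

The sketch only prints the commands; pass the list to run() to execute them against the current kubectl context.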

'kubectl top pods' Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)

When I try to run kubectl top nodes I get this output:
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)
The metrics server is able to scrape the metrics; its logs show:
ScrapeMetrics: time: 49.754499ms, nodes: 4, pods: 82
...Storing metrics...
...Cycle complete...
But the endpoints for the metrics service are missing. How can I resolve this issue?
kubectl get apiservices |egrep metrics
v1beta1.metrics.k8s.io kube-system/metrics-server False (MissingEndpoints)
Any help will be appreciated!

Kubernetes metrics-server unable to fully scrape metrics

When I try to use kubectl top nodes I get this error:
Error from server (NotFound): the server could not find the requested resource (get services http:heapster:)
But heapster is deprecated and I'm using Kubernetes 1.11. I installed metrics-server and I still get the same error. When I check the metrics-server logs I see this error:
E1019 12:33:55.621691 1 manager.go:102] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:elegant-ardinghelli-ei3: unable to fetch metrics from Kubelet elegant-ardinghelli-ei3 (elegant-ardinghelli-ei3):
Get https://elegant-ardinghelli-ei3:10250/stats/summary/: dial tcp: lookup elegant-ardinghelli-ei3 on 10.245.0.10:53: no such host, unable to fully scrape metrics from source kubelet_summary:elegant-ardinghelli-aab: unable to fetch metrics from Kubelet elegant-ardinghelli-aab (elegant-ardinghelli-aab):
Get https://elegant-ardinghelli-aab:10250/stats/summary/: dial tcp: lookup elegant-ardinghelli-aab on 10.245.0.10:53: no such host, unable to fully scrape metrics from source kubelet_summary:elegant-ardinghelli-e4z: unable to fetch metrics from Kubelet elegant-ardinghelli-e4z (elegant-ardinghelli-e4z):
Get https://elegant-ardinghelli-e4z:10250/stats/summary/: dial tcp: lookup elegant-ardinghelli-e4z on 10.245.0.10:53: no such host, unable to fully scrape metrics from source kubelet_summary:elegant-ardinghelli-e41: unable to fetch metrics from Kubelet elegant-ardinghelli-e41 (elegant-ardinghelli-e41):
Get https://elegant-ardinghelli-e41:10250/stats/summary/: dial tcp: lookup elegant-ardinghelli-e41 on 10.245.0.10:53: no such host, unable to fully scrape metrics from source kubelet_summary:elegant-ardinghelli-ein: unable to fetch metrics from Kubelet elegant-ardinghelli-ein (elegant-ardinghelli-ein):
Get https://elegant-ardinghelli-ein:10250/stats/summary/: dial tcp: lookup elegant-ardinghelli-ein on 10.245.0.10:53: no such host, unable to fully scrape metrics from source kubelet_summary:elegant-ardinghelli-aar: unable to fetch metrics from Kubelet elegant-ardinghelli-aar (elegant-ardinghelli-aar):
Get https://elegant-ardinghelli-aar:10250/stats/summary/: dial tcp: lookup elegant-ardinghelli-aar on 10.245.0.10:53: no such host, unable to fully scrape metrics from source kubelet_summary:elegant-ardinghelli-aaj: unable to fetch metrics from Kubelet elegant-ardinghelli-aaj (elegant-ardinghelli-aaj):
Get https://elegant-ardinghelli-aaj:10250/stats/summary/: dial tcp: lookup elegant-ardinghelli-aaj on 10.245.0.10:53: no such host, unable to fully scrape metrics from source kubelet_summary:elegant-ardinghelli-e49: unable to fetch metrics from Kubelet elegant-ardinghelli-e49 (elegant-ardinghelli-e49):
Get https://elegant-ardinghelli-e49:10250/stats/summary/: dial tcp: lookup elegant-ardinghelli-e49 on 10.245.0.10:53: no such host]
It is reported here.
Github Issues:
This PR implements support for the kubectl top commands to use the
metrics-server as an aggregated API, instead of requesting the metrics
from heapster directly. If the metrics.k8s.io API is not served by the
apiserver, then this still falls back to the previous behavior.
Merged in https://github.com/kubernetes/kubernetes/pull/56206
It may be fixed in 1.12 or scheduled for the next version.
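The fallback described in that PR text can be pictured with a small Python sketch. This is a simplified model of the decision only, not kubectl's actual code, and pick_metrics_source is a hypothetical name.

```python
def pick_metrics_source(served_api_groups) -> str:
    """Prefer the aggregated metrics API; otherwise fall back to heapster."""
    if "metrics.k8s.io" in served_api_groups:
        return "metrics-server"
    return "heapster"


# A cluster whose API server does not serve metrics.k8s.io falls back to
# heapster, which is consistent with the "get services http:heapster:" error.
print(pick_metrics_source(["apps", "batch"]))
print(pick_metrics_source(["apps", "metrics.k8s.io"]))
```

Under this model, a kubectl client that predates the change, or an API server that is not serving metrics.k8s.io, ends up asking for the heapster service even when metrics-server is installed.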

Configure kubernetes-dashboard to use metrics-server service instead of heapster.

I have installed Kubernetes v1.11. Since heapster is deprecated, I am using metrics-server, and the kubectl top node command works.
The Kubernetes dashboard is still looking for the heapster service. What are the steps to configure the dashboard to use the metrics-server service?
2018/08/09 21:13:43 Metric client health check failed: the server could not find the requested resource (get services heapster). Retrying in 30 seconds.
Thanks
SR
This must be the week for asking that question; it seems that whatever is deploying heapster is omitting the Service, which one can fix as described here -- or the tl;dr is just: create the Service named heapster and point it at your heapster pods.
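As a sketch of what such a Service might look like, the Python snippet below prints a manifest as JSON, which kubectl apply -f - also accepts. The kube-system namespace, the k8s-app: heapster selector label, and target port 8082 are assumptions about a typical heapster deployment; check your own heapster pods' labels and container port before using it. heapster_service is a hypothetical helper name.

```python
import json


def heapster_service(namespace: str = "kube-system",
                     selector=None,
                     target_port: int = 8082) -> dict:
    """Build a Service manifest named 'heapster' (all defaults are assumed)."""
    selector = selector or {"k8s-app": "heapster"}  # assumed pod label
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": "heapster", "namespace": namespace},
        "spec": {
            "selector": selector,
            "ports": [{"port": 80, "targetPort": target_port}],
        },
    }


print(json.dumps(heapster_service(), indent=2))
```

The only part the dashboard depends on is the Service name heapster; the selector and ports must match whatever your heapster pods actually expose.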
As of today, Kubernetes dashboard doesn't support metrics-server; support is expected very soon in a new release of Kubernetes dashboard.
You can follow https://github.com/kubernetes/dashboard/issues/2986