Unable to retrieve custom metrics from prometheus-adapter - kubernetes

I am trying to experiment with scaling one of my application pods running on my Raspberry Pi Kubernetes cluster using HPA + custom metrics, but I have run into several issues. Despite reading the documentation at https://github.com/DirectXMan12/k8s-prometheus-adapter and troubleshooting for the past two days, I still have difficulty grasping why some of these problems are happening.
First, I built an ARM-compatible image of k8s-prometheus-adapter and installed it using Helm. I can confirm it is running properly by checking the pod logs.
I have also set up a script that pushes each Raspberry Pi's temperature to Pushgateway, and I can query it with the Prometheus query node_temp, which returns the following series:
node_temp{job="kube4"} 42
node_temp{job="kube1"} 44
node_temp{job="kube2"} 39
node_temp{job="kube3"} 40
Now I want to scale one of my application pods using the above temperature values, as an experiment to better understand how this works.
Below is my k8s-prometheus-adapter Helm values.yml file:
image:
  repository: jaanhio/k8s-prometheus-adapter-arm
  tag: latest
logLevel: 7
prometheus:
  url: http://10.17.0.12
rules:
  default: false
  custom:
  - seriesQuery: 'etcd_object_counts'
    resources:
      template: <<.Resource>>
    name:
      as: "etcd_object"
    metricsQuery: count(etcd_object_counts)
  - seriesQuery: 'node_temp'
    resources:
      template: <<.Resource>>
    name:
      as: "node_temp"
    metricsQuery: count(node_temp)
After installing via Helm, I ran kubectl get apiservices and can see v1beta1.custom.metrics.k8s.io listed.
I then ran kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 | jq and got the following:
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "custom.metrics.k8s.io/v1beta1",
  "resources": [
    {
      "name": "jobs.batch/node_temp",
      "singularName": "",
      "namespaced": true,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    },
    {
      "name": "jobs.batch/etcd_object",
      "singularName": "",
      "namespaced": true,
      "kind": "MetricValueList",
      "verbs": [
        "get"
      ]
    }
  ]
}
I then tried to query the value of the registered node_temp metric using kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1/jobs/*/node_temp but got the following response:
Error from server (InternalError): Internal error occurred: unable to list matching resources
Questions:
Why is the node_temp metric associated with the jobs.batch resource type?
Why am I not able to retrieve the value of the metric via kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1/jobs/*/node_temp?
What is a definitive way of figuring out the query path, e.g. /apis/custom.metrics.k8s.io/v1beta1/jobs/*/node_temp? I more or less trial-and-errored until I got something resembling a response. I also see other paths with namespaces in them, e.g. /apis/custom.metrics.k8s.io/v1beta1/namespaces/*/metrics/foo_metrics.
Any help and advice will be greatly appreciated!

Why is the node_temp metric associated with the jobs.batch resource type?
The adapter picks up the labels attached to the Prometheus series and tries to map them to Kubernetes resources; in this case you clearly have job="kube4", and the job label gets mapped to the jobs.batch resource.
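If you want the metric associated with a different resource, you can declare the mapping explicitly with resources.overrides instead of resources.template. A minimal sketch (hypothetical: it assumes your node_temp series carries a label, here called node, whose value is an actual Kubernetes node name):
rules:
  custom:
  - seriesQuery: 'node_temp'
    resources:
      overrides:
        node: {resource: "node"}    # hypothetical label; must exist on the series
    name:
      as: "node_temp"
    metricsQuery: 'avg(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'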
Why am I not able to retrieve the value of the metric via kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1/jobs/*/node_temp?
Metrics are namespaced (note the "namespaced": true in the output), so you'll need /apis/custom.metrics.k8s.io/v1beta1/namespaces/<namespace>/jobs/*/node_temp.
What is a definitive way of figuring out the query path, e.g. /apis/custom.metrics.k8s.io/v1beta1/jobs/*/node_temp? I more or less trial-and-errored until I got something resembling a response. I also see other paths with namespaces in them, e.g. /apis/custom.metrics.k8s.io/v1beta1/namespaces/*/metrics/foo_metrics.
Check https://github.com/kubernetes/community/blob/master/contributors/design-proposals/instrumentation/custom-metrics-api.md#api-paths
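The general pattern for object metrics is roughly:
/apis/custom.metrics.k8s.io/v1beta1/namespaces/<namespace>/<resource-plural>/<object-name-or-*>/<metric-name>
and for metrics describing a namespace itself:
/apis/custom.metrics.k8s.io/v1beta1/namespaces/<namespace>/metrics/<metric-name>
So for this case (assuming the default namespace) something like:
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/jobs/*/node_temp" | jq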

Related

KEDA Error: Got empty response for: external.metrics.k8s.io/v1beta1

I am getting the below error after installing KEDA in my k8s cluster and creating some scaled objects...
Whatever command I run, e.g. kubectl get pods, I get a response with the below error message.
How do I get rid of the below error message?
E0125 11:45:32.766448 316 memcache.go:255] couldn't get resource list for external.metrics.k8s.io/v1beta1: Got empty response for: external.metrics.k8s.io/v1beta1
This error comes from client-go when there are no resources available in external.metrics.k8s.io/v1beta1; here in client-go it fetches all ServerGroups.
When KEDA is not installed, external.metrics.k8s.io/v1beta1 is not part of the ServerGroups, so it is never queried and there is no issue.
But when KEDA is installed, it creates an APIService:
$ kubectl get apiservice | grep keda-metrics
v1beta1.external.metrics.k8s.io keda/keda-metrics-apiserver True 20m
But it doesn't create any external.metrics.k8s.io resources
$ kubectl get --raw /apis/external.metrics.k8s.io/v1beta1 | jq .
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "external.metrics.k8s.io/v1beta1",
  "resources": []
}
Since there are no resources, client-go throws an error.
The workaround is registering a dummy resource in the empty resource group.
Refer to this Github link for more detailed information.
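One way to end up with a registered external metric (a sketch; not necessarily the exact approach described in the linked issue) is to create a trivial ScaledObject, e.g. using the cron scaler, so KEDA has at least one metric to expose:
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: dummy-scaledobject        # hypothetical name
  namespace: default
spec:
  scaleTargetRef:
    name: some-deployment         # hypothetical; must reference an existing Deployment
  minReplicaCount: 1
  maxReplicaCount: 1
  triggers:
  - type: cron
    metadata:
      timezone: Etc/UTC
      start: 0 0 * * *
      end: 5 0 * * *
      desiredReplicas: "1"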

How to display node information with a JSON request?

I know how to use the API to perform simple requests, such as displaying node information while selecting nodes by label value.
For example : curl http://localhost:8080/api/v1/nodes?labelSelector=kubernetes.io/role%3Dworker3
This displays information about the node whose role is worker3.
Is there a way to perform the same request using a JSON query?
I looked on the web for such an example but did not find one.
You can query with kubectl by label.
The roles of a node are just labels.
To return it in YAML format:
kubectl get nodes -l node-role.kubernetes.io/worker -o yaml
To return it in JSON format:
kubectl get nodes -l node-role.kubernetes.io/worker -o json
Update
Querying the API with JSON, you can do it like so:
curl http://localhost:8080/api/v1/nodes?{"node.kubernetes.io/worker01":"worker01"}
In my case this returns:
{
  "kind": "NodeList",
  "apiVersion": "v1",
  "metadata": {
    "resourceVersion": "317238"
  },
  "items": [
    {
      "metadata": {
        "name": "worker01",
        "uid": "a2bec224-361f-49e9-8bba-b3b172816d6e",
        "resourceVersion": "316653",
        "creationTimestamp": "2022-12-24T11:04:43Z",
        "labels": {
          "beta.kubernetes.io/arch": "amd64",
          "beta.kubernetes.io/os": "linux",
          "kubernetes.io/arch": "amd64",
          "kubernetes.io/hostname": "worker01",
          "kubernetes.io/os": "linux",
          "microk8s.io/cluster": "true",
          "node.kubernetes.io/microk8s-worker": "microk8s-worker"
        },
............
As you can see it works, but generally you must check two things:
the API version (it can differ from v1, depending on the Kubernetes version)
the labels and property names.
The example above comes from microk8s; here I don't even have roles defined.
kubectl get node
NAME STATUS ROLES AGE VERSION
master Ready <none> 17d v1.25.4
worker01 Ready <none> 17d v1.25.4
So I looked for a label that could extract the required data.
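For reference, the equivalent of the original labelSelector-style request, filtering on the microk8s worker label shown above, would be roughly:
curl "http://localhost:8080/api/v1/nodes?labelSelector=node.kubernetes.io/microk8s-worker%3Dmicrok8s-worker"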

How to pass a flag to klog for structured logging

As part of Kubernetes 1.19, structured logging has been implemented.
I've read that Kubernetes' logging engine is klog and that structured logs follow this format:
<klog header> "<message>" <key1>="<value1>" <key2>="<value2>" ...
Cool! Even better, you can apparently pass a --logging-format=json flag to klog so logs are generated directly in JSON:
{
  "ts": 1580306777.04728,
  "v": 4,
  "msg": "Pod status updated",
  "pod": {
    "name": "nginx-1",
    "namespace": "default"
  },
  "status": "ready"
}
Unfortunately, I haven't been able to find out how and where I should specify that --logging-format=json flag.
Is it a kubectl command? I'm using Azure's AKS.
--logging-format=json is a flag which needs to be set on the Kubernetes system components (kubelet, API server, controller manager & scheduler). You can check all the flags here.
Unfortunately you can't do it right now with AKS, as the control plane is managed by Microsoft.
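For reference, on a self-managed cluster bootstrapped with kubeadm you could pass the flag to the control-plane components through the kubeadm configuration; a minimal sketch (not applicable to AKS):
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
apiServer:
  extraArgs:
    logging-format: "json"      # becomes --logging-format=json on kube-apiserver
controllerManager:
  extraArgs:
    logging-format: "json"
scheduler:
  extraArgs:
    logging-format: "json"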

Prometheus Adapter Goroutine issue when fetch the query from external.api

I have implemented and registered the external metrics API in my k8s environment, but I have some issues when fetching the metrics.
This is the rule for the external API metrics:
- seriesQuery: 'http_requests_total{namespace!="",pod!=""}'
  resources:
    overrides:
      kubernetes_namespace: {resource: "namespace"}
      kubernetes_pod_name: {resource: "pod"}
  name:
    matches: "^(.*)_total"
    as: "${1}"
When I check the Kubernetes external metrics API, the resources value is not null, which looks fine:
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "external.metrics.k8s.io/v1beta1",
  "resources": [
    {
      "name": "http_requests",
      "singularName": "",
      "namespaced": true,
      "kind": "ExternalMetricValueList",
      "verbs": [
        "get"
      ]
    }
  ]
}
After all that, when I want to check the value of the http_requests metric, Kubernetes returns this error:
Error from server (InternalError): Internal error occurred: unable to fetch metrics
I checked the prometheus-adapter pod; these are the error logs related to this query:
E1005 01:21:26.387384 1 provider.go:47] unable to generate a query for the metric: empty query produced by metrics query template
I1005 01:21:26.387594 1 wrap.go:42] GET /apis/external.metrics.k8s.io/v1beta1/namespaces/default/http_requests: (504.956µs) 500
goroutine 1909 [running]:
github.com/directxman12/k8s-prometheus-adapter/vendor/k8s.io/apiserver/pkg/server/httplog.(*respLogger).recordStatus(0xc420374930, 0x1f4)
/go/src/github.com/directxman12/k8s-prometheus-adapter/vendor/k8s.io/apiserver/pkg/server/httplog/httplog.go:207 +0xd2
github.com/directxman12/k8s-prometheus-adapter/vendor/k8s.io/apiserver/pkg/server/httplog.(*respLogger).WriteHeader(0xc420374930, 0x1f4)
/go/src/github.com/directxman12/k8s-prometheus-adapter/vendor/k8s.io/apiserver/pkg/server/httplog/httplog.go:186 +0x35
github.com/directxman12/k8s-prometheus-adapter/vendor/k8s.io/apiserver/pkg/server/filters.(*baseTimeoutWriter).WriteHeader(0xc42105b2a0, 0x1f4)
/go/src/github.com/directxman12/k8s-prometheus-adapter/vendor/k8s.io/apiserver/pkg/server/filters/timeout.go:192 +0xac
github.com/directxman12/k8s-prometheus-adapter/vendor/k8s.io/apiserver/pkg/endpoints/metrics.(*ResponseWriterDelegator).WriteHeader(0xc42107f920, 0x1f4)
/go/src/github.com/directxman12/k8s-prometheus-adapter/vendor/k8s.io/apiserver/pkg/endpoints/metrics/metrics.go:307 +0x45
github.com/directxman12/k8s-prometheus-adapter/vendor/k8s.io/apiserver/pkg/endpoints/handlers/responsewriters.SerializeObject(0x1791e20, 0x10, 0x7f340078e6c8, 0xc4204ec580, 0x190fd20, 0xc420e8db88, 0xc421081600, 0x1f4, 0x18f9920, 0xc420039320)
/go/src/github.com/directxman12/k8s-prometheus-adapter/vendor/k8s.io/apiserver/pkg/endpoints/handlers/responsewriters/writers.go:95 +0x8d
....
....
....
I have extracted this error message: unable to generate a query for the metric: empty query produced by metrics query template
What is the reason for this issue?
Help please
I think that you are receiving this error:
unable to generate a query for the metric: empty query produced by
metrics query template
because your metric rule does not contain a metricsQuery value.
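For example, the rule above could be extended with a metricsQuery along these lines (a sketch using the adapter's default template placeholders; adjust the aggregation to your needs):
- seriesQuery: 'http_requests_total{namespace!="",pod!=""}'
  resources:
    overrides:
      kubernetes_namespace: {resource: "namespace"}
      kubernetes_pod_name: {resource: "pod"}
  name:
    matches: "^(.*)_total"
    as: "${1}"
  metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'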
The second thing is that I am not sure why you are using external.metrics for http_requests. In that scenario you can use custom.metrics.
This metric is one of the most popular, so you can find many step-by-step tutorials online on how to achieve this.
For example this or this Github tutorial.

Unable to fully collect metrics, when installing metric-server

I have installed metrics-server on Kubernetes, but it's not working, and it logs:
unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:xxx: unable to fetch metrics from Kubelet ... (X.X): Get https:....: x509: cannot validate certificate for 1x.x.
x509: certificate signed by unknown authority
I was able to get metrics after modifying the deployment YAML and adding:
command:
- /metrics-server
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP
This now collects metrics, and kubectl top node returns results...
but the logs still show:
E1120 11:58:45.624974 1 reststorage.go:144] unable to fetch pod metrics for pod dev/pod-6bffbb9769-6z6qz: no metrics known for pod
E1120 11:58:45.625289 1 reststorage.go:144] unable to fetch pod metrics for pod dev/pod-6bffbb9769-rzvfj: no metrics known for pod
E1120 12:00:06.462505 1 manager.go:102] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:ip-1x.x.x.eu-west-1.compute.internal: unable to get CPU for container ...discarding data: missing cpu usage metric, unable to fully scrape metrics from source
So, questions:
1) All this works on minikube but not on my dev cluster. Why would that be?
2) In production I don't want to use insecure TLS, so can someone please explain why this issue is arising, or point me to some resource?
Kubeadm generates the kubelet certificates at /var/lib/kubelet/pki, and those certificates (kubelet.crt and kubelet.key) are signed by a different CA from the one used to generate all the other certificates at /etc/kubernetes/pki.
You need to regenerate the kubelet certificates so that they are signed by your root CA (/etc/kubernetes/pki/ca.crt).
You can use openssl or cfssl to generate the new certificates (I am using cfssl).
$ mkdir certs; cd certs
$ cp /etc/kubernetes/pki/ca.crt ca.pem
$ cp /etc/kubernetes/pki/ca.key ca-key.pem
Create a file kubelet-csr.json:
{
  "CN": "kubernetes",
  "hosts": [
    "127.0.0.1",
    "<node_name>",
    "kubernetes",
    "kubernetes.default",
    "kubernetes.default.svc",
    "kubernetes.default.svc.cluster",
    "kubernetes.default.svc.cluster.local"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [{
    "C": "US",
    "ST": "NY",
    "L": "City",
    "O": "Org",
    "OU": "Unit"
  }]
}
Create a ca-config.json file:
{
  "signing": {
    "default": {
      "expiry": "8760h"
    },
    "profiles": {
      "kubernetes": {
        "usages": [
          "signing",
          "key encipherment",
          "server auth",
          "client auth"
        ],
        "expiry": "8760h"
      }
    }
  }
}
Now generate the new certificates using the above files:
$ cfssl gencert -ca=ca.pem -ca-key=ca-key.pem \
--config=ca-config.json -profile=kubernetes \
kubelet-csr.json | cfssljson -bare kubelet
Replace the old certificates with the newly generated ones:
$ scp kubelet.pem <nodeip>:/var/lib/kubelet/pki/kubelet.crt
$ scp kubelet-key.pem <nodeip>:/var/lib/kubelet/pki/kubelet.key
Now restart the kubelet so that the new certificates take effect on your node.
$ systemctl restart kubelet
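To confirm the node is now using a certificate signed by the cluster CA (a quick sanity check; paths as above):
$ openssl x509 -in /var/lib/kubelet/pki/kubelet.crt -noout -issuer
$ openssl verify -CAfile /etc/kubernetes/pki/ca.crt /var/lib/kubelet/pki/kubelet.crt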
Look at the following ticket to get the context of the issue:
https://github.com/kubernetes-incubator/metrics-server/issues/146
Hope this helps.