HPA labelSelector not filtering external metrics - kubernetes

I'm trying to set up a custom-metrics-based autoscaler on an EKS cluster (v1.13.10-eks-5ac0f1), but the labelSelector filter for external metric labels does not appear to filter anything.
Using k8s-prometheus-adapter and metrics-server (v0.3.6), I've managed to export metrics from Prometheus as Kubernetes external metrics.
The metric is correctly exported and visible on the kubernetes api:
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/*/sqs_queue_messages"
{
"kind": "ExternalMetricValueList",
"apiVersion": "external.metrics.k8s.io/v1beta1",
"metadata": {
"selfLink": "/apis/external.metrics.k8s.io/v1beta1/namespaces/%2A/sqs_queue_messages"
},
"items": [
{
"metricName": "sqs_queue_messages",
"metricLabels": {
"__name__": "sqs_queue_messages",
...
"queue_name": "temp-queue"
},
"timestamp": "2019-11-07T21:14:44Z",
"value": "0"
},
{
"metricName": "sqs_queue_messages",
"metricLabels": {
"__name__": "sqs_queue_messages",
...
"queue_name": "random-queue"
},
"timestamp": "2019-11-07T21:14:44Z",
"value": "0"
}
]
}
horizontal-pod-autoscaler.yml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: api
  namespace: api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 2
  maxReplicas: 5
  metrics:
  - external:
      metricName: sqs_queue_messages
      metricSelector:
        matchLabels:
          queue_name: temp-queue
      targetAverageValue: "100"
    type: External
The problem is that the HPA does not select only the metric with the matching label. In fact, the logs show that the following call is performed:
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/*/sqs_queue_messages?labelSelector=queue_name%3Dtemp-queue"
The expected result is just 1 item (the one matching queue_name: temp-queue label) but instead the filter is ignored and all the results are returned.
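To make the expected behaviour concrete, here is a minimal Python sketch. It does not call the Kubernetes API; it only rebuilds the request path with a URL-encoded labelSelector and applies the same equality filter client-side to a trimmed copy of the two items shown above. The function names are hypothetical and chosen for illustration.

```python
from urllib.parse import quote


def external_metric_url(namespace: str, metric: str, selector: dict) -> str:
    """Build the external metrics API path with a URL-encoded labelSelector."""
    raw = ",".join(f"{k}={v}" for k, v in sorted(selector.items()))
    base = f"/apis/external.metrics.k8s.io/v1beta1/namespaces/{namespace}/{metric}"
    return f"{base}?labelSelector={quote(raw, safe='')}"


def filter_items(items: list, selector: dict) -> list:
    """Keep only items whose metricLabels contain every selector pair."""
    return [
        item for item in items
        if all(item["metricLabels"].get(k) == v for k, v in selector.items())
    ]


# Trimmed copies of the two items returned by the API in the question.
items = [
    {"metricName": "sqs_queue_messages", "metricLabels": {"queue_name": "temp-queue"}, "value": "0"},
    {"metricName": "sqs_queue_messages", "metricLabels": {"queue_name": "random-queue"}, "value": "0"},
]
selector = {"queue_name": "temp-queue"}

print(external_metric_url("*", "sqs_queue_messages", selector))
print(len(filter_items(items, selector)))  # the single item the selector should leave
```

If the adapter honoured the selector, the API response would contain only what filter_items keeps here.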

Related

the server could not find the metric nginx_vts_server_requests_per_second for pods

I installed the kube-prometheus-0.9.0, and want to deploy a sample application on which to test the Prometheus metrics autoscaling, with the following resource manifest file: (hpa-prome-demo.yaml)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hpa-prom-demo
spec:
  selector:
    matchLabels:
      app: nginx-server
  template:
    metadata:
      labels:
        app: nginx-server
    spec:
      containers:
      - name: nginx-demo
        image: cnych/nginx-vts:v1.0
        resources:
          limits:
            cpu: 50m
          requests:
            cpu: 50m
        ports:
        - containerPort: 80
          name: http
---
apiVersion: v1
kind: Service
metadata:
  name: hpa-prom-demo
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "80"
    prometheus.io/path: "/status/format/prometheus"
spec:
  ports:
  - port: 80
    targetPort: 80
    name: http
  selector:
    app: nginx-server
  type: NodePort
For testing purposes I used a NodePort Service, and luckily I get an HTTP response after applying the deployment. Then I installed
Prometheus Adapter via its Helm chart, creating a new hpa-prome-adapter-values.yaml file to override the default values, as follows.
rules:
  default: false
  custom:
  - seriesQuery: 'nginx_vts_server_requests_total'
    resources:
      overrides:
        kubernetes_namespace:
          resource: namespace
        kubernetes_pod_name:
          resource: pod
    name:
      matches: "^(.*)_total"
      as: "${1}_per_second"
    metricsQuery: (sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>))
prometheus:
  url: http://prometheus-k8s.monitoring.svc
  port: 9090
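As a rough illustration of what the name section of this rule does, the following Python sketch applies the same matches/as pair to a series name. It is only an analogue: the adapter itself is written in Go and uses Go regular expressions, and rename_series is a hypothetical helper, not part of the adapter.

```python
import re


def rename_series(series: str, matches: str, as_template: str):
    """Return the exposed metric name, or None if the series does not match."""
    if re.search(matches, series) is None:
        return None
    # The rule's "${1}" refers to the first capture group; Python spells it "\g<1>".
    python_template = re.sub(
        r"\$\{(\d+)\}",
        lambda m: "\\g<" + m.group(1) + ">",
        as_template,
    )
    return re.sub(matches, python_template, series)


print(rename_series("nginx_vts_server_requests_total", "^(.*)_total", "${1}_per_second"))
```

With the values from the rule, the counter nginx_vts_server_requests_total is exposed under the name nginx_vts_server_requests_per_second.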
I added a rule under rules and specified the address of Prometheus. I then installed Prometheus Adapter with the following command.
$ helm install prometheus-adapter prometheus-community/prometheus-adapter -n monitoring -f hpa-prome-adapter-values.yaml
NAME: prometheus-adapter
LAST DEPLOYED: Fri Jan 28 09:16:06 2022
NAMESPACE: monitoring
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
prometheus-adapter has been deployed.
In a few minutes you should be able to list metrics using the following command(s):
kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1
Finally, the adapter was installed successfully, and I can get an HTTP response, as follows.
$ kubectl get po -nmonitoring |grep adapter
prometheus-adapter-665dc5f76c-k2lnl 1/1 Running 0 133m
$ kubectl get --raw="/apis/custom.metrics.k8s.io/v1beta1" | jq
{
"kind": "APIResourceList",
"apiVersion": "v1",
"groupVersion": "custom.metrics.k8s.io/v1beta1",
"resources": [
{
"name": "namespaces/nginx_vts_server_requests_per_second",
"singularName": "",
"namespaced": false,
"kind": "MetricValueList",
"verbs": [
"get"
]
}
]
}
But it was supposed to be like this,
$ kubectl get --raw="/apis/custom.metrics.k8s.io/v1beta1" | jq
{
"kind": "APIResourceList",
"apiVersion": "v1",
"groupVersion": "custom.metrics.k8s.io/v1beta1",
"resources": [
{
"name": "namespaces/nginx_vts_server_requests_per_second",
"singularName": "",
"namespaced": false,
"kind": "MetricValueList",
"verbs": [
"get"
]
},
{
"name": "pods/nginx_vts_server_requests_per_second",
"singularName": "",
"namespaced": true,
"kind": "MetricValueList",
"verbs": [
"get"
]
}
]
}
Why can't I get the pods/nginx_vts_server_requests_per_second metric? As a result, the query below also fails.
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/nginx_vts_server_requests_per_second" | jq .
Error from server (NotFound): the server could not find the metric nginx_vts_server_requests_per_second for pods
Can anybody please help? Many thanks.
ENV:
All Prometheus charts installed with helm from prometheus-community: https://prometheus-community.github.io/helm-charts
k8s cluster enabled by docker for mac
Solution:
I ran into the same problem. In the Prometheus UI, I found that the metric had a namespace label but no pod label, as shown below.
nginx_vts_server_requests_total{code="1xx", host="*", instance="10.1.0.19:80", job="kubernetes-service-endpoints", namespace="default", node="docker-desktop", service="hpa-prom-demo"}
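A simplified way to picture why the pods/ variant of the metric is missing is sketched below in Python. It is a model for illustration, not the adapter's real discovery code: it assumes a series can only be attached to a resource if it carries the label that the rule's overrides map to that resource. The overrides used here (namespace and pod) are an assumption for the example.

```python
def associated_resources(series_labels: dict, overrides: dict) -> list:
    """List the Kubernetes resources a series can be attached to.

    overrides maps a Prometheus label name to a resource name, mirroring the
    resources.overrides section of an adapter rule.
    """
    return sorted(
        resource for label, resource in overrides.items()
        if label in series_labels
    )


# Assumed overrides for illustration: label "namespace" -> namespace, "pod" -> pod.
overrides = {"namespace": "namespace", "pod": "pod"}

# Labels of the series shown above.
series = {
    "code": "1xx", "host": "*", "instance": "10.1.0.19:80",
    "job": "kubernetes-service-endpoints", "namespace": "default",
    "node": "docker-desktop", "service": "hpa-prom-demo",
}

print(associated_resources(series, overrides))
# "some-pod" is a placeholder value, only there to show the effect of a pod label.
print(associated_resources({**series, "pod": "some-pod"}, overrides))
```

Without a pod label only the namespace association is possible, which matches the single namespaces/ entry in the API listing.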
I suspected that Prometheus was not attaching pod as a label, so I checked the Prometheus config and found:
- action: replace
  source_labels:
  - __meta_kubernetes_pod_node_name
  target_label: node
I then consulted
https://prometheus.io/docs/prometheus/latest/configuration/configuration/ and added a similar rule, shown below, after every occurrence of __meta_kubernetes_pod_node_name that I found (2 places):
- action: replace
  source_labels:
  - __meta_kubernetes_pod_name
  target_label: pod
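The effect of these two relabel rules can be sketched in Python as follows. This is a deliberately reduced model of the replace action with a single source label; real Prometheus relabeling also supports regex, separator and replacement fields. The pod name is a hypothetical placeholder.

```python
def relabel_replace(labels: dict, source_label: str, target_label: str) -> dict:
    """Mimic a basic `action: replace` rule with a single source label.

    Copies the source label's value into target_label and leaves the input
    untouched. An empty or missing source value sets nothing.
    """
    out = dict(labels)
    value = labels.get(source_label, "")
    if value:
        out[target_label] = value
    return out


discovered = {
    "__meta_kubernetes_pod_node_name": "docker-desktop",
    "__meta_kubernetes_pod_name": "hpa-prom-demo-example",  # hypothetical pod name
}
labels = relabel_replace(discovered, "__meta_kubernetes_pod_node_name", "node")
labels = relabel_replace(labels, "__meta_kubernetes_pod_name", "pod")
print(labels["node"], labels["pod"])
```

After the second rule the target carries a pod label, which is the label the adapter rule needs in order to expose a per-pod metric.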
After a while, the ConfigMap was reloaded, and both the UI and the API showed the pod label:
$ kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 | jq
{
"kind": "APIResourceList",
"apiVersion": "v1",
"groupVersion": "custom.metrics.k8s.io/v1beta1",
"resources": [
{
"name": "pods/nginx_vts_server_requests_per_second",
"singularName": "",
"namespaced": true,
"kind": "MetricValueList",
"verbs": [
"get"
]
},
{
"name": "namespaces/nginx_vts_server_requests_per_second",
"singularName": "",
"namespaced": false,
"kind": "MetricValueList",
"verbs": [
"get"
]
}
]
}
It is worth knowing that using the kube-prometheus repository, you can also install components such as Prometheus Adapter for Kubernetes Metrics APIs, so there is no need to install it separately with Helm.
I will use your hpa-prome-demo.yaml manifest file to demonstrate how to monitor nginx_vts_server_requests_total metrics.
First of all, we need to install Prometheus and Prometheus Adapter with appropriate configuration as described step by step below.
Copy the kube-prometheus repository and refer to the Kubernetes compatibility matrix in order to choose a compatible branch:
$ git clone https://github.com/prometheus-operator/kube-prometheus.git
$ cd kube-prometheus
$ git checkout release-0.9
Install the jb, jsonnet and gojsontoyaml tools:
$ go install -a github.com/jsonnet-bundler/jsonnet-bundler/cmd/jb@latest
$ go install github.com/google/go-jsonnet/cmd/jsonnet@latest
$ go install github.com/brancz/gojsontoyaml@latest
Uncomment the (import 'kube-prometheus/addons/custom-metrics.libsonnet') + line from the example.jsonnet file:
$ cat example.jsonnet
local kp =
(import 'kube-prometheus/main.libsonnet') +
// Uncomment the following imports to enable its patches
// (import 'kube-prometheus/addons/anti-affinity.libsonnet') +
// (import 'kube-prometheus/addons/managed-cluster.libsonnet') +
// (import 'kube-prometheus/addons/node-ports.libsonnet') +
// (import 'kube-prometheus/addons/static-etcd.libsonnet') +
(import 'kube-prometheus/addons/custom-metrics.libsonnet') + // <--- This line
// (import 'kube-prometheus/addons/external-metrics.libsonnet') +
...
Add the following rule to the ./jsonnet/kube-prometheus/addons/custom-metrics.libsonnet file in the rules+ section:
{
seriesQuery: "nginx_vts_server_requests_total",
resources: {
overrides: {
namespace: { resource: 'namespace' },
pod: { resource: 'pod' },
},
},
name: { "matches": "^(.*)_total", "as": "${1}_per_second" },
metricsQuery: "(sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>))",
},
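To show what the metricsQuery template turns into, here is a small Python sketch that substitutes the three placeholders by plain string replacement. The adapter itself uses Go templates and may render label matchers differently (for example as regex matchers when several pods are queried), so treat this as an approximation. The function name and the pod name are illustrative assumptions.

```python
def expand_metrics_query(template: str, series: str, matchers: dict, group_by: list) -> str:
    """Fill in the adapter-style placeholders of a metricsQuery template."""
    label_matchers = ",".join(f'{k}="{v}"' for k, v in matchers.items())
    return (
        template
        .replace("<<.Series>>", series)
        .replace("<<.LabelMatchers>>", label_matchers)
        .replace("<<.GroupBy>>", ",".join(group_by))
    )


template = "(sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>))"
query = expand_metrics_query(
    template,
    "nginx_vts_server_requests_total",
    {"namespace": "default", "pod": "hpa-prom-demo-bbb6c65bb-49jsh"},  # example values
    ["pod"],
)
print(query)
```

The resulting PromQL groups the per-second rate by pod, which is what lets each result be attributed to one pod.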
After this update, the ./jsonnet/kube-prometheus/addons/custom-metrics.libsonnet file should look like this:
NOTE: This is not the entire file, just an important part of it.
$ cat custom-metrics.libsonnet
// Custom metrics API allows the HPA v2 to scale based on arbirary metrics.
// For more details on usage visit https://github.com/DirectXMan12/k8s-prometheus-adapter#quick-links
{
values+:: {
prometheusAdapter+: {
namespace: $.values.common.namespace,
// Rules for custom-metrics
config+:: {
rules+: [
{
seriesQuery: "nginx_vts_server_requests_total",
resources: {
overrides: {
namespace: { resource: 'namespace' },
pod: { resource: 'pod' },
},
},
name: { "matches": "^(.*)_total", "as": "${1}_per_second" },
metricsQuery: "(sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>))",
},
...
Use the jsonnet-bundler update functionality to update the kube-prometheus dependency:
$ jb update
Compile the manifests:
$ ./build.sh example.jsonnet
Now simply use kubectl to install Prometheus and other components as per your configuration:
$ kubectl apply --server-side -f manifests/setup
$ kubectl apply -f manifests/
After configuring Prometheus, we can deploy a sample hpa-prom-demo Deployment:
NOTE: I've deleted the annotations because I'm going to use a ServiceMonitor to describe the set of targets to be monitored by Prometheus.
$ cat hpa-prome-demo.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hpa-prom-demo
spec:
  selector:
    matchLabels:
      app: nginx-server
  template:
    metadata:
      labels:
        app: nginx-server
    spec:
      containers:
      - name: nginx-demo
        image: cnych/nginx-vts:v1.0
        resources:
          limits:
            cpu: 50m
          requests:
            cpu: 50m
        ports:
        - containerPort: 80
          name: http
---
apiVersion: v1
kind: Service
metadata:
  name: hpa-prom-demo
  labels:
    app: nginx-server
spec:
  ports:
  - port: 80
    targetPort: 80
    name: http
  selector:
    app: nginx-server
  type: LoadBalancer
Next, create a ServiceMonitor that describes how to monitor our NGINX:
$ cat servicemonitor.yaml
kind: ServiceMonitor
apiVersion: monitoring.coreos.com/v1
metadata:
  name: hpa-prom-demo
  labels:
    app: nginx-server
spec:
  selector:
    matchLabels:
      app: nginx-server
  endpoints:
  - interval: 15s
    path: "/status/format/prometheus"
    port: http
After waiting some time, let's check the hpa-prom-demo logs to make sure that it is being scraped correctly:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
hpa-prom-demo-bbb6c65bb-49jsh 1/1 Running 0 35m
$ kubectl logs -f hpa-prom-demo-bbb6c65bb-49jsh
...
10.4.0.9 - - [04/Feb/2022:09:29:17 +0000] "GET /status/format/prometheus HTTP/1.1" 200 3771 "-" "Prometheus/2.29.1" "-"
10.4.0.9 - - [04/Feb/2022:09:29:32 +0000] "GET /status/format/prometheus HTTP/1.1" 200 3771 "-" "Prometheus/2.29.1" "-"
10.4.0.9 - - [04/Feb/2022:09:29:47 +0000] "GET /status/format/prometheus HTTP/1.1" 200 3773 "-" "Prometheus/2.29.1" "-"
10.4.0.9 - - [04/Feb/2022:09:30:02 +0000] "GET /status/format/prometheus HTTP/1.1" 200 3773 "-" "Prometheus/2.29.1" "-"
10.4.0.9 - - [04/Feb/2022:09:30:17 +0000] "GET /status/format/prometheus HTTP/1.1" 200 3773 "-" "Prometheus/2.29.1" "-"
10.4.2.12 - - [04/Feb/2022:09:30:23 +0000] "GET /status/format/prometheus HTTP/1.1" 200 3773 "-" "Prometheus/2.29.1" "-"
...
Finally, we can check if our metrics work as expected:
$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/" | jq . | grep -A 7 "nginx_vts_server_requests_per_second"
"name": "pods/nginx_vts_server_requests_per_second",
"singularName": "",
"namespaced": true,
"kind": "MetricValueList",
"verbs": [
"get"
]
},
--
"name": "namespaces/nginx_vts_server_requests_per_second",
"singularName": "",
"namespaced": false,
"kind": "MetricValueList",
"verbs": [
"get"
]
},
$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/nginx_vts_server_requests_per_second" | jq .
{
"kind": "MetricValueList",
"apiVersion": "custom.metrics.k8s.io/v1beta1",
"metadata": {
"selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/%2A/nginx_vts_server_requests_per_second"
},
"items": [
{
"describedObject": {
"kind": "Pod",
"namespace": "default",
"name": "hpa-prom-demo-bbb6c65bb-49jsh",
"apiVersion": "/v1"
},
"metricName": "nginx_vts_server_requests_per_second",
"timestamp": "2022-02-04T09:32:59Z",
"value": "533m",
"selector": null
}
]
}
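The value "533m" above uses the Kubernetes quantity notation, where the m suffix means milli-units, so it stands for 0.533 requests per second. A minimal Python sketch of that conversion follows; it handles only plain numbers and the m suffix, whereas real Kubernetes quantities support more suffixes.

```python
from decimal import Decimal


def parse_quantity(value: str) -> Decimal:
    """Parse a plain number or a milli-suffixed quantity such as "533m".

    Only the plain and "m" forms are handled here; other suffixes
    (k, M, Ki, ...) are outside the scope of this sketch.
    """
    if value.endswith("m"):
        return Decimal(value[:-1]) / Decimal(1000)
    return Decimal(value)


print(parse_quantity("533m"))
```

Decimal is used instead of float so that the converted value stays exact.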

I have an RBAC problem, but everything I test seems ok?

This is a continuation of the problem described here (How do I fix a role-based problem when my role appears to have the correct permissions?)
I have done much more testing and still do not understand the error
Error from server (Forbidden): pods is forbidden: User "dma" cannot list resource "pods" in API group "" at the cluster scope
UPDATE: Here is another hint from the API server
watch chan error: etcdserver: mvcc: required revision has been compacted
I found this thread, but I am running a current version of Kubernetes:
How fix this error "watch chan error: etcdserver: mvcc: required revision has been compacted"?
My user exists
NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION
dma 77m kubernetes.io/kube-apiserver-client kubernetes-admin <none> Approved,Issued
The clusterrole exists
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"rbac.authorization.k8s.io/v1","kind":"ClusterRole","metadata":{"annotations":{},"name":"kubelet-runtime"},"rules":[{"apiGroups":["","extensions","apps","argoproj.io","workflows.argoproj.io","events.argoproj.io","coordination.k8s.io"],"resources":["*"],"verbs":["*"]},{"apiGroups":["batch"],"resources":["jobs","cronjobs"],"verbs":["*"]}]}
  creationTimestamp: "2021-12-16T00:24:56Z"
  name: kubelet-runtime
  resourceVersion: "296716"
  uid: a4697d6e-c786-4ec9-bf3e-88e3dbfdb6d9
rules:
- apiGroups:
  - ""
  - extensions
  - apps
  - argoproj.io
  - workflows.argoproj.io
  - events.argoproj.io
  - coordination.k8s.io
  resources:
  - '*'
  verbs:
  - '*'
- apiGroups:
  - batch
  resources:
  - jobs
  - cronjobs
  verbs:
  - '*'
The sandbox namespace exists
NAME STATUS AGE
sandbox Active 6d6h
My user has authority to operate in the kubelet cluster and the namespace "sandbox"
{
"apiVersion": "rbac.authorization.k8s.io/v1",
"kind": "ClusterRoleBinding",
"metadata": {
"annotations": {
"kubectl.kubernetes.io/last-applied-configuration": "{\"apiVersion\":\"rbac.authorization.k8s.io/v1\",\"kind\":\"ClusterRoleBinding\",\"metadata\":{\"annotations\":{},\"name\":\"dma-kubelet-binding\"},\"roleRef\":{\"apiGroup\":\"rbac.authorization.k8s.io\",\"kind\":\"ClusterRole\",\"name\":\"kubelet-runtime\"},\"subjects\":[{\"kind\":\"ServiceAccount\",\"name\":\"dma\",\"namespace\":\"argo\"},{\"kind\":\"ServiceAccount\",\"name\":\"dma\",\"namespace\":\"argo-events\"},{\"kind\":\"ServiceAccount\",\"name\":\"dma\",\"namespace\":\"sandbox\"}]}\n"
},
"creationTimestamp": "2021-12-16T00:25:42Z",
"name": "dma-kubelet-binding",
"resourceVersion": "371397",
"uid": "a2fb6d5b-8dba-4320-af74-71caac7bdc39"
},
"roleRef": {
"apiGroup": "rbac.authorization.k8s.io",
"kind": "ClusterRole",
"name": "kubelet-runtime"
},
"subjects": [
{
"kind": "ServiceAccount",
"name": "dma",
"namespace": "argo"
},
{
"kind": "ServiceAccount",
"name": "dma",
"namespace": "argo-events"
},
{
"kind": "ServiceAccount",
"name": "dma",
"namespace": "sandbox"
}
]
}
My user has the correct permissions
{
"apiVersion": "rbac.authorization.k8s.io/v1",
"kind": "Role",
"metadata": {
"annotations": {
"kubectl.kubernetes.io/last-applied-configuration": "{\"apiVersion\":\"rbac.authorization.k8s.io/v1\",\"kind\":\"Role\",\"metadata\":{\"annotations\":{},\"name\":\"dma\",\"namespace\":\"sandbox\"},\"rules\":[{\"apiGroups\":[\"\",\"apps\",\"autoscaling\",\"batch\",\"extensions\",\"policy\",\"rbac.authorization.k8s.io\",\"argoproj.io\",\"workflows.argoproj.io\"],\"resources\":[\"pods\",\"configmaps\",\"deployments\",\"events\",\"pods\",\"persistentvolumes\",\"persistentvolumeclaims\",\"services\",\"workflows\"],\"verbs\":[\"get\",\"list\",\"watch\",\"create\",\"update\",\"patch\",\"delete\"]}]}\n"
},
"creationTimestamp": "2021-12-21T19:41:38Z",
"name": "dma",
"namespace": "sandbox",
"resourceVersion": "1058387",
"uid": "94191881-895d-4457-9764-5db9b54cdb3f"
},
"rules": [
{
"apiGroups": [
"",
"apps",
"autoscaling",
"batch",
"extensions",
"policy",
"rbac.authorization.k8s.io",
"argoproj.io",
"workflows.argoproj.io"
],
"resources": [
"pods",
"configmaps",
"deployments",
"events",
"pods",
"persistentvolumes",
"persistentvolumeclaims",
"services",
"workflows"
],
"verbs": [
"get",
"list",
"watch",
"create",
"update",
"patch",
"delete"
]
}
]
}
My user is configured correctly on all nodes
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: DATA+OMITTED
    server: https://206.81.25.186:6443
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: dma
  name: dma@kubernetes
- context:
    cluster: kubernetes
    user: kubernetes-admin
  name: kubernetes-admin@kubernetes
current-context: kubernetes-admin@kubernetes
kind: Config
preferences: {}
users:
- name: dma
  user:
    client-certificate-data: REDACTED
    client-key-data: REDACTED
- name: kubernetes-admin
  user:
    client-certificate-data: REDACTED
    client-key-data: REDACTED
Based on this website, I have been searching for a watch event.
I think I have rebuilt everything above the control plane, but the problem persists.
The next step would be to rebuild the entire cluster, but it would be so much more satisfying to find the actual problem.
Please help.
FIX:
So the policy for the sandbox namespace was wrong. I fixed that and the problem is gone!
I think I finally understand RBAC (policies and all). Thank you very much to the members of the Kubernetes Slack channel. These policies have passed the first set of tests for a development environment ("sandbox") for Argo workflows. Still testing.
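One difference visible between the ClusterRoleBinding in the question and the policies that follow is the subject kind: the earlier binding listed ServiceAccount subjects, while the error names User "dma". The Python sketch below models how such subjects relate to an authenticated username. It is a simplified model for reasoning, not the API server's authorizer, and it may not be the only thing that was wrong in the original setup.

```python
def subject_matches(subject: dict, username: str) -> bool:
    """Check whether an RBAC subject refers to the given authenticated username."""
    if subject["kind"] == "User":
        return subject["name"] == username
    if subject["kind"] == "ServiceAccount":
        # Service accounts authenticate as system:serviceaccount:<namespace>:<name>.
        return username == f"system:serviceaccount:{subject['namespace']}:{subject['name']}"
    return False  # Group subjects are not modelled in this sketch.


old_subject = {"kind": "ServiceAccount", "name": "dma", "namespace": "sandbox"}
new_subject = {"kind": "User", "name": "dma"}

print(subject_matches(old_subject, "dma"))  # certificate user "dma" is not the service account
print(subject_matches(new_subject, "dma"))
```

A client certificate issued for dma authenticates as the user dma, so only a subject of kind User with that name applies to it.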
policies.yaml file:
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: dev
  namespace: sandbox
rules:
- apiGroups:
  - "*"
  attributeRestrictions: null
  resources: ["*"]
  verbs:
  - get
  - watch
  - list
- apiGroups: ["argoproj.io", "workflows.argoproj.io", "events.argoproj.io"]
  attributeRestrictions: null
  resources:
  - pods
  - configmaps
  - deployments
  - events
  - pods
  - persistentvolumes
  - persistentvolumeclaims
  - services
  - workflows
  - eventbus
  - eventsource
  - sensor
  verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: dma-dev
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: dev
subjects:
- kind: User
  name: dma
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: dma-admin
subjects:
- kind: User
  name: dma
  namespace: sandbox
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
---
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: access-nginx
  namespace: sandbox
spec:
  podSelector:
    matchLabels:
      app: nginx
  ingress:
  - from:
    - podSelector:
        matchLabels:
          run: access
...

What configuration can be done on prometheus adapter in order to get the sum of cpu_usage_seconds_total across all replicas of a container?

I have a Kubernetes cluster and Prometheus/Prometheus adapter installed.
This is the prometheus adapter configuration rules:
rules:
  custom:
  - seriesQuery: '{__name__=~"container_cpu_usage_seconds_total"}'
    resources:
      overrides:
      template: "<<.Resource>>"
      # namespace:
      #   resource: namespace
      # pod:
      #   resource: pod
    name:
      matches: "container_cpu_usage_seconds_total"
      as: "my_custom_metric"
    metricsQuery: sum(<<.Series>>{container="php-apache"}) by (<<.GroupBy>>)
And this is my hpa configuration:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 6
  metrics:
  - type: Pods
    pods:
      metric:
        name: my_custom_metric
      target:
        type: Value
        averageValue: 250 # limit
---
apiVersion: v1
kind: Service
metadata:
  name: php-apache
  labels:
    run: php-apache
spec:
  ports:
  - port: 80
  selector:
    run: php-apache
The problem here is that I want to scale based on the sum of the values across all replicas of the container php-apache, not on their average.
This is the value that is returned from the Prometheus Adapter:
{
"kind": "MetricValueList",
"apiVersion": "custom.metrics.k8s.io/v1beta1",
"metadata": {
"selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/%2A/malakas"
},
"items": [
{
"describedObject": {
"kind": "Pod",
"namespace": "default",
"name": "php-apache-d4cf67d68-8ddbx",
"apiVersion": "/v1"
},
"metricName": "my_custom_metric",
"timestamp": "2021-04-16T10:52:02Z",
"value": "331827m",
"selector": null
},
{
"describedObject": {
"kind": "Pod",
"namespace": "default",
"name": "php-apache-d4cf67d68-zxkrd",
"apiVersion": "/v1"
},
"metricName": "my_custom_metric",
"timestamp": "2021-04-16T10:52:02Z",
"value": "44478m",
"selector": null
}
]
}
In this example, there are 2 replicas.
I want to get one result (the sum of these two) and not two results just like above in order to pass the result to hpa and scale accordingly.
How can I achieve that?
You should use metrics from the service not from the pod:
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/services/*/my_custom_metric"
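For reference, the arithmetic difference between the two per-pod values from the question and the single summed value can be sketched in Python as below. This only illustrates sum versus average on the numbers shown earlier; it makes no claim about what the service-level query actually returns in a given cluster.

```python
def to_milli(value: str) -> int:
    """Convert a plain or milli-suffixed quantity string to integer milli-units."""
    if value.endswith("m"):
        return int(value[:-1])
    return int(value) * 1000


# Per-pod values as returned by the adapter in the question.
per_pod = {
    "php-apache-d4cf67d68-8ddbx": "331827m",
    "php-apache-d4cf67d68-zxkrd": "44478m",
}

total_milli = sum(to_milli(v) for v in per_pod.values())
average_milli = total_milli / len(per_pod)

print(f"sum: {total_milli}m, average: {average_milli}m")
```

Working in integer milli-units avoids floating-point rounding when adding the two values.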

How to check external metrics data in Kubernetes?

I am using DirectXMan12/k8s-prometheus-adapter to push an external metric from Prometheus to Kubernetes.
After pushing the external metric, how can I verify that the data is in k8s?
When I run kubectl get --raw /apis/external.metrics.k8s.io/v1beta1 | jq I get the following result, but I have no idea how to fetch the actual metric value:
{
"kind": "APIResourceList",
"apiVersion": "v1",
"groupVersion": "external.metrics.k8s.io/v1beta1",
"resources": [
{
"name": "subscription_back_log",
"singularName": "",
"namespaced": true,
"kind": "ExternalMetricValueList",
"verbs": [
"get"
]
}]
}
The actual metric value is fetched per instance. For example, the metric you attached has namespaced: true; assuming the metric is for pods, you can access the actual data at
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/wanted_namespace/pods/*/subscription_back_log" | jq '.'
(or specify the pod name instead of *)
If you want the HPA to read your metric, the configuration is (for example)
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: your-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: your-pod
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - pods:
      metricName: subscription_back_log
      targetAverageValue: 10000
    type: Pods
The metric is namespaced, so you will need to add the namespace to the URL. Contrary to what the other answer suggests, I believe you don't need to include pods in the URL. This is an external metric. External metrics are not associated with any Kubernetes object, so the namespace alone should suffice:
/apis/external.metrics.k8s.io/v1beta1/namespaces/<namespace>/<metric_name>
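The difference between the two URL shapes can be written down as a pair of small Python helpers. The function names are hypothetical; the path layouts are the ones used by the kubectl commands in this thread, with the custom metrics path carrying a resource type and object name that the external metrics path lacks.

```python
def external_metric_path(namespace: str, metric: str) -> str:
    """Path for an external metric: namespace and metric name only."""
    return f"/apis/external.metrics.k8s.io/v1beta1/namespaces/{namespace}/{metric}"


def custom_metric_path(namespace: str, resource: str, name: str, metric: str) -> str:
    """Path for a custom metric: tied to a resource type and an object name (or *)."""
    return (
        f"/apis/custom.metrics.k8s.io/v1beta1/namespaces/{namespace}"
        f"/{resource}/{name}/{metric}"
    )


print(external_metric_path("default", "redis_key_size"))
print(custom_metric_path("default", "pods", "*", "nginx_vts_server_requests_per_second"))
```

Either path can be passed to kubectl get --raw; quote it in the shell when it contains *.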
Here's an example that works for me, using an external metric in my setup:
$ kubectl get --raw /apis/external.metrics.k8s.io/v1beta1 | jq
{
"kind": "APIResourceList",
"apiVersion": "v1",
"groupVersion": "external.metrics.k8s.io/v1beta1",
"resources": [
{
"name": "redis_key_size",
"singularName": "",
"namespaced": true,
"kind": "ExternalMetricValueList",
"verbs": [
"get"
]
}
]
}
$ kubectl get --raw /apis/external.metrics.k8s.io/v1beta1/namespaces/default/redis_key_size
{
"kind": "ExternalMetricValueList",
"apiVersion": "external.metrics.k8s.io/v1beta1",
"metadata": {},
"items": [
{
"metricName": "redis_key_size",
"metricLabels": {
"key": "..."
},
"timestamp": "2021-10-07T09:00:01Z",
"value": "0"
},
...
]
}

Prometheus Adapter empty custom metric items

I'm attempting to auto-scale a Kubernetes deployment with an HPA using Prometheus custom metrics with the Prometheus Adapter. These custom metrics are published to Prometheus by another deployment in another namespace, which queries a REST API for a particular metric every minute and then publishes the value of that metric to Prometheus. From there the adapter should be able to query Prometheus for said metric, with some additional labels as query criteria, and publish that metric with a new name. The HPA should then be able to pick up this metric and scale based on its value.
Here are the labels for my deployment, which the adapter supposedly bases its matching on:
Labels: app.kubernetes.io/instance=event-subscription-dev-dev
app.kubernetes.io/managed-by=Tiller-dev
app.kubernetes.io/name=event-subscription-dev
deployment-name=event-subscription-webhook-worker-dev
helm.sh/chart=event-subscription-0.1.0-dev
Here are the Prometheus Adapter Helm chart values/adapter rules:
logLevel: 1
metricsRelistInterval: 5s
prometheus:
  url: 'http://<prometheus-url>'
rules:
  custom:
  - seriesQuery: '{__name__="event_subscription_current_message_lag"}'
    name:
      matches: "(.*)"
      as: '${1}_webhooks'
    resources:
      overrides:
        namespace: {resource: "namespace"}
        pod: {resource: "pod"}
    metricsQuery: 'sum(event_subscription_current_message_lag{queue="webhooks", container_name!="POD"})'
  - seriesQuery: '{__name__="event_subscription_current_message_lag"}'
    name:
      matches: "(.*)"
      as: '${1}_webhook_retries'
    resources:
      overrides:
        namespace: {resource: "namespace"}
        pod: {resource: "pod"}
    metricsQuery: 'sum(event_subscription_current_message_lag{queue="webhook_retries", container_name!="POD"})'
And here is the metrics piece of my HPA spec:
metrics:
- type: Pods
  pods:
    metric:
      name: event_subscription_current_message_lag_webhooks
    target:
      type: AverageValue
      averageValue: 10
- type: Resource
  resource:
    name: cpu
    target:
      type: Utilization
      averageUtilization: 85
The problem I'm having here is not with the adapter querying for the metric and then publishing a new metric, but rather that the new metric has no value associated with it as the original metric does.
For example if I run kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 I do see my event_subscription_current_message_lag_webhooks and event_subscription_current_message_lag_webhook_retries metrics, but they don't have any value like the original event_subscription_current_message_lag metric does.
Here's output from kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/event-subscription/pods/*/event_subscription_current_message_lag"
{
"kind": "MetricValueList",
"apiVersion": "custom.metrics.k8s.io/v1beta1",
"metadata": {
"selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/event-subscription/pods/%2A/event_subscription_current_message_lag"
},
"items": [
{
"describedObject": {
"kind": "Pod",
"namespace": "event-subscription",
"name": "activemq-message-lag-retrieval-7bfc46b948-jr8kp",
"apiVersion": "/v1"
},
"metricName": "event_subscription_current_message_lag",
"timestamp": "2019-11-08T22:09:53Z",
"value": "1"
}
]
}
And here's the output for event_subscription_current_message_lag_webhooks and event_subscription_current_message_lag_webhook_retries:
{
"kind": "MetricValueList",
"apiVersion": "custom.metrics.k8s.io/v1beta1",
"metadata": {
"selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/event-subscription/pods/%2A/event_subscription_current_message_lag_webhooks"
},
"items": []
}
...
{
"kind": "MetricValueList",
"apiVersion": "custom.metrics.k8s.io/v1beta1",
"metadata": {
"selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/event-subscription/pods/%2A/event_subscription_current_message_lag_webhook_retries"
},
"items": []
}
I'm confused as to how the adapter is seemingly able to find my original metric, query for it, and publish the new metric, but without the value I would expect, which in this case is 1.
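One possible explanation, not confirmed anywhere in this thread, is that the metricsQuery in both rules is a bare sum(...) with no by clause, so the aggregated result carries no pod label that could tie it back to a pod. The Python sketch below is a simplified model of that idea, not the adapter's code; the queue label on the sample is an assumption made for the example.

```python
from collections import defaultdict


def sum_by(samples: list, by: list) -> list:
    """Sum sample values, keeping only the labels listed in `by` (like PromQL sum ... by)."""
    groups = defaultdict(float)
    for labels, value in samples:
        key = tuple((name, labels[name]) for name in by if name in labels)
        groups[key] += value
    return [(dict(key), total) for key, total in groups.items()]


def pod_items(results: list) -> list:
    """Keep only results that still carry a pod label and can be tied to a pod."""
    return [(labels["pod"], value) for labels, value in results if "pod" in labels]


samples = [
    ({"namespace": "event-subscription",
      "pod": "activemq-message-lag-retrieval-7bfc46b948-jr8kp",
      "queue": "webhooks"}, 1.0),  # the queue label is assumed for illustration
]

print(pod_items(sum_by(samples, by=[])))       # the aggregate has no labels left
print(pod_items(sum_by(samples, by=["pod"])))  # grouping by pod keeps the association
```

If this is the cause, a metricsQuery that uses the adapter's label matcher and group-by placeholders, as in the other rules quoted in this document, would be the thing to try.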