Kubernetes HPA not scaling with custom metric using Prometheus adapter on Istio

I have two deployments running v1 and v2 of the same service in Istio. I have set up a custom metric, istio_requests_total, to be collected through the Prometheus adapter.
I have set up an HPA to scale the v2 deployment and can see the target value increasing when I send requests, but the HPA is not scaling the number of pods/replicas.
I am running Kubernetes v1.19 on minikube v1.13.1 and cannot understand why it is not scaling.
prometheus:
  url: http://prometheus.istio-system.svc.cluster.local
rules:
  default: false
  custom:
  # this rule matches cumulative cAdvisor metrics measured in seconds
  - seriesQuery: 'istio_requests_total{kubernetes_namespace!="",kubernetes_pod_name!=""}'
    seriesFilters: []
    resources:
      # template: <<.Resource>>
      # skip specifying generic resource<->label mappings, and just
      # attach only pod and namespace resources by mapping label names to group-resources
      overrides:
        kubernetes_namespace: {resource: "namespace"}
        kubernetes_pod_name: {resource: "pod"}
    # specify that the `container_` and `_seconds_total` suffixes should be removed.
    # this also introduces an implicit filter on metric family names
    name:
      # we use the value of the capture group implicitly as the API name
      # we could also explicitly write `as: "$1"`
      matches: "^(.*)_total"
      as: "${1}_per_second"
      # matches: ""
      # as: ""
    # specify how to construct a query to fetch samples for a given series
    # This is a Go template where the `.Series` and `.LabelMatchers` string values
    # are available, and the delimiters are `<<` and `>>` to avoid conflicts with
    # the prometheus query language
    metricsQuery: "sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)"
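To confirm that the adapter is actually exposing the renamed metric through the custom metrics API, a quick check along these lines can help (a sketch, assuming the workload runs in the default namespace and jq is installed):

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq '.resources[].name'
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/istio_requests_per_second" | jq .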
HPA YAML
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: translate-deployment-v2-hpa
spec:
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: istio_requests_per_second
        # selector: {matchLabels: {destination_version: 0.0.2}}
      target:
        type: AverageValue
        averageValue: 10
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: translate-deployment-v2
Below you can see the HPA pulling and measuring the metric but not scaling; the window below it shows the prometheus-adapter being queried for the metric successfully.
HPA Description
One final item I'm not clear on: what is the purpose of the selector in the HPA definition above? Is it to select specific values from the data that Prometheus queries?
For example, I know the series I'm querying is output by default by Envoy as follows:
istio_requests_total{app="istio-ingressgateway",chart="gateways",connection_security_policy="unknown",destination_app="translate-pod",destination_canonical_revision="0.0.1",destination_canonical_service="translate-pod",destination_principal="spiffe://cluster.local/ns/default/sa/default",destination_service="translate-service.default.svc.cluster.local",destination_service_name="translate-service",destination_service_namespace="default",destination_version="0.0.1",destination_workload="translate-deployment",destination_workload_namespace="default",heritage="Tiller",install_operator_istio_io_owning_resource="unknown",instance="172.17.0.5:15020",istio="ingressgateway",istio_io_rev="default",job="kubernetes-pods",kubernetes_namespace="istio-system",kubernetes_pod_name="istio-ingressgateway-6cfd75fc57-flmhp",operator_istio_io_component="IngressGateways",pod_template_hash="6cfd75fc57",release="istio",reporter="source",request_protocol="http",response_code="200",response_flags="-",service_istio_io_canonical_name="istio-ingressgateway",service_istio_io_canonical_revision="latest",source_app="istio-ingressgateway",source_canonical_revision="latest",source_canonical_service="istio-ingressgateway",source_principal="spiffe://cluster.local/ns/istio-system/sa/istio-ingressgateway-service-account",source_version="unknown",source_workload="istio-ingressgateway",source_workload_namespace="istio-system"}
Does the selector allow you to further filter the series data? If not, what is its purpose and how do you use it?

According to your screenshot, the HPA is working as expected: your metric value is lower than the threshold. If the value does not go over the threshold, the HPA will not trigger a scale-up; instead, it may trigger a scale-down in your case.
The metric you are using right now is istio_requests_per_second, which is calculated as the total requests per second. The first screenshot shows an average value of 200m, which is 0.2. Your threshold is 10, so the HPA definitely will not scale up in this case.
As for the selector, it gives you the ability to filter the metric by its labels. For example, if you want to scale the instances against GET requests only, you can define a selector matching the label that carries the request method.
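For instance, uncommenting the selector that is already sketched in the HPA above and quoting the value would restrict the average to series labelled destination_version="0.0.2" (assuming the adapter forwards that label on the istio_requests_per_second series):

metrics:
- type: Pods
  pods:
    metric:
      name: istio_requests_per_second
      # only count series from the v2 workload; the label must exist on the adapter's series
      selector:
        matchLabels:
          destination_version: "0.0.2"
    target:
      type: AverageValue
      averageValue: "10"

The selector is passed through to the custom metrics API as an extra label filter, so only matching series contribute to the averaged value.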

Related

How does the Kubernetes HPA (HorizontalPodAutoscaler) determine which pod's metrics should be used if multiple pods have the same metric

Suppose we have the below HPA (HorizontalPodAutoscaler) deployed in the demo namespace, and multiple pods (POD-A, POD-B) in this namespace expose the same metric "istio_requests_per_second". How does the HPA determine which pod's "istio_requests_per_second" should be used? Or is every pod with this metric evaluated against the HPA target?
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: httpbin
spec:
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Pods
    pods:
      metric:
        name: istio_requests_per_second
      target:
        type: AverageValue
        averageValue: "10"
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: httpbin
If you're using Prometheus, then it's the adapter that correlates the Kubernetes pod name with the metric value to return. Basically, the HPA asks the prometheus-adapter for the metric istio_requests_per_second by calling /apis/custom.metrics.k8s.io/v1beta1/namespaces/myNamespace/pods/mypod; the adapter takes that request and looks at its configured rules to decide what to query.
https://github.com/kubernetes-sigs/prometheus-adapter/blob/master/docs/config-walkthrough.md
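To see exactly what the adapter returns for that request, you can replay it by hand (a sketch, assuming the demo namespace from the question and that jq is installed; POD-A is a placeholder pod name):

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/demo/pods/*/istio_requests_per_second" | jq .
# or for a single pod:
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/demo/pods/POD-A/istio_requests_per_second" | jq .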
Based on my test, I think the HPA uses the scaleTargetRef to determine which pods' metrics should be used; it pulls these metrics from the metrics API and evaluates them against the target config.
As per Kubernetes documentation:
For object metrics and external metrics, a single metric is fetched, which describes the object in question. This metric is compared to the target value, to produce a ratio as above. In the autoscaling/v2 API version, this value can optionally be divided by the number of Pods before the comparison is made.
It will calculate the ratio based on the mean across the target pods.
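As a worked example of that calculation (the formula is from the HPA documentation referenced below; the numbers are illustrative): with 2 target pods reporting 20 and 30 requests per second and a target averageValue of 10,

currentAverageValue = (20 + 30) / 2 = 25
desiredReplicas     = ceil(currentReplicas * currentAverageValue / targetAverageValue)
                    = ceil(2 * 25 / 10) = 5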
References:
1. https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#how-does-a-horizontalpodautoscaler-work

HorizontalPodAutoscaler can't get external metric with resource instance_id label selector

I am working on an HPA that should scale pods based on external metrics received from redis.googleapis.com. I am also filtering metrics based on labels like metric.labels.direction, resource.labels.region, etc. (see matchLabels in the HPA YAML below):
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: redis-consumer
spec:
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - external:
      metric:
        name: redis.googleapis.com|stats|network_traffic
        selector:
          matchLabels:
            metric.labels.direction: out
            metric.labels.role: primary
            resource.labels.region: REGION_ID
            resource.labels.project_id: PROJECT_ID
            resource.labels.instance_id: INSTANCE_ID  # Problematic filter
      target:
        type: AverageValue
        averageValue: 300
    type: External
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: redis-consumer
This works well until I add the filter for resource.labels.instance_id; after that my HPA stops working with the status message "unable to get external metric":
- lastTransitionTime: "2022-10-26T10:57:13Z"
  message: 'the HPA was unable to compute the replica count: unable to get external
    metric default/redis.googleapis.com|stats|network_traffic/&LabelSelector{MatchLabels:map[string]string{metric.labels.direction:
    out,metric.labels.role: primary,resource.labels.instance_id: INSTANCE_ID,resource.labels.project_id:
    PROJECT_ID,resource.labels.region: REGION_ID,},MatchExpressions:[]LabelSelectorRequirement{},}:
    no metrics returned from external metrics API'
  reason: FailedGetExternalMetric
  status: "False"
  type: ScalingActive
When I go to Monitoring > Metrics explorer in the Google Cloud console and use the same metric, I get a nice graph of the values that I need.
There is one key difference: in Metrics explorer the instance_id value is in the format:
projects/PROJECT_ID/locations/REGIONID/instances/INSTANCE_ID
but in the HPA YAML file the instance_id value is just INSTANCE_ID, because in the HPA configuration I can only use an alphanumeric string with _, - and . characters that is less than 63 characters long, so using the full instance_id path like the one in Metrics explorer is impossible in the YAML file.
How can I use the instance_id label filter with the HPA? Should I use the full path or just the instance name as the value of instance_id, and if the full value must be used, how can that be done if I can't use slashes or a string longer than 63 characters?
I have tried using the instance_id value in the format projects/PROJECT_ID/locations/REGIONID/instances/INSTANCE_ID
and replacing the slashes with dots, but that also didn't work because the string must be shorter than 63 characters.
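One way to see which label values the external metrics API actually returns for this metric (the same technique shown in the load-balancer question further down) is to query it directly and inspect the metricLabels on each item; the default namespace here is an assumption:

kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/redis.googleapis.com|stats|network_traffic" | jq '.items[].metricLabels'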

Horizontal Pod Autoscaler (HPA) custom metrics with Prometheus Adapter (How are units defined?)

I'm testing out HPA with custom metrics from an application, exposed to Kubernetes using prometheus-adapter.
My app exposes a "jobs_executing" custom metric, a numerical gauge (via the Go prometheus client) reporting the number of jobs being executed by the app (pod).
To cater for this in the HPA, here is what my HPA configuration looks like:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: jobs_executing
      target:
        type: AverageValue
        averageValue: 5
I want the autoscaler to scale my pods when the average number of jobs executed across all pods equals 5. This works, but sometimes the HPA shows values like this:
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
my-autoscaler Deployment/my-scaling-sample-app 7700m/5 1 10 10 38m
Here the target shows up as "7700m/5" even though the average number of jobs executed overall was 7.7. This makes the HPA scale out aggressively. I don't understand why it is putting "7700m" in the current target value.
My question is whether there is a way to define a floating-point value here in the HPA so it doesn't render a normal decimal as 7700m (a CPU-style unit?),
or what am I missing? Thank you.
From the docs:
https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/#appendix-quantities
All metrics in the HorizontalPodAutoscaler and metrics APIs are specified using a special whole-number notation known in Kubernetes as a quantity. For example, the quantity 10500m would be written as 10.5 in decimal notation. The metrics APIs will return whole numbers without a suffix when possible, and will generally return quantities in milli-units otherwise. This means you might see your metric value fluctuate between 1 and 1500m, or 1 and 1.5 when written in decimal notation.
So it does not seem like you are able to adjust the unit of measurement that the HPA uses; it always reports values as the generic Kubernetes quantity.
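In other words, 7700m is just 7.7 written in milli-units rather than a CPU unit. If you want a fractional target, you can express it in the same notation; a small sketch (the value itself is illustrative):

target:
  type: AverageValue
  averageValue: 4500m   # interpreted as an average of 4.5 jobs per pod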

kubernetes Autoscaler - Cannot obtain loadbalancing.googleapis.com|https|request_count

I'm trying to define a Horizontal Pod Autoscaler for two Kubernetes services.
The autoscaling strategy relies on 3 metrics:
cpu
pubsub.googleapis.com|subscription|num_undelivered_messages
loadbalancing.googleapis.com|https|request_count
CPU and num_undelivered_messages are obtained correctly, but no matter what I do, I cannot get the request_count metric.
The first service is a backend service (Service A), and the other (Service B) is an API that uses an Ingress to manage external access to the service.
The Autoscaling strategy is based on Google documentation: Autoscaling Deployments with External Metrics.
For service A, the following defines the metrics used for Autoscaling:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: ServiceA
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: ServiceA
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 80
  - external:
      metricName: pubsub.googleapis.com|subscription|num_undelivered_messages
      metricSelector:
        matchLabels:
          resource.labels.subscription_id: subscription_id
      targetAverageValue: 100
    type: External
For service B, the following defines the metrics used for Autoscaling:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: ServiceB
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: ServiceB
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 80
  - external:
      metricName: loadbalancing.googleapis.com|https|request_count
      metricSelector:
        matchLabels:
          resource.labels.forwarding_rule_name: k8s-fws-default-serviceb--3a908157de956ba7
      targetAverageValue: 100
    type: External
As described in the above article, the metrics server is running and the Stackdriver custom metrics adapter is deployed:
$ kubectl get apiservices |egrep metrics
v1beta1.custom.metrics.k8s.io custom-metrics/custom-metrics-stackdriver-adapter True 2h
v1beta1.external.metrics.k8s.io custom-metrics/custom-metrics-stackdriver-adapter True 2h
v1beta1.metrics.k8s.io kube-system/metrics-server True 2h
v1beta2.custom.metrics.k8s.io custom-metrics/custom-metrics-stackdriver-adapter True 2h
For service A, all metrics, CPU and num_undelivered_messages, are correctly obtained:
$ kubectl get hpa ServiceA
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
ServiceA Deployment/ServiceA 0/100 (avg), 1%/80% 1 3 1 127m
For service B, HPA cannot obtain the Request Count:
$ kubectl get hpa ServiceB
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
ServiceB Deployment/ServiceB <unknown>/100 (avg), <unknown>/80% 1 3 1 129m
When accessing the Ingress, I get this warning:
unable to get external metric default/loadbalancing.googleapis.com|https|request_count/&LabelSelector{MatchLabels:map[string]string{resource.labels.forwarding_rule_name: k8s-fws-default-serviceb--3a908157de956ba7,},MatchExpressions:[],}: no metrics returned from external metrics API
The metricSelector for the forwarding rule is correct, as confirmed when describing the Ingress (only the relevant information is shown):
$ kubectl describe ingress serviceb
Annotations:
  ingress.kubernetes.io/https-forwarding-rule: k8s-fws-default-serviceb--3a908157de956ba7
I've tried to use a different metric selector, for example url_map_name, to no avail; I got a similar error.
I've followed the exact guidelines in the Google documentation and checked a few online tutorials that describe the same process, but I haven't been able to understand what I'm missing.
I'm probably missing some configuration or some specific detail, but I cannot find it documented anywhere.
What am I missing that explains why I'm not able to obtain the loadbalancing.googleapis.com|https|request_count metric?
It seems the metric that you're defining isn't available in the External Metrics API. To find out what's going on, you can inspect the External Metrics API directly:
kubectl get --raw="/apis/external.metrics.k8s.io/v1beta1" | jq
Is the loadbalancing.googleapis.com|https|request_count metric reported in the output?
You can then dig deeper by making requests of the following form:
kubectl get --raw="/apis/external.metrics.k8s.io/v1beta1/namespaces/<namespace_name>/<metric_name>?labelSelector=<selector>" | jq
And see what's returned given your metric name and a specific metric selector.
These are precisely the requests that the Horizontal Pod Autoscaler also makes at runtime. By replicating them manually, you should be able to pinpoint the source of the problem.
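Filled in with the metric and selector from this question, such a request might look like the following (a sketch; the forwarding-rule name is the one from the question, and depending on your shell the "=" in the selector may need to be URL-encoded as shown):

kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/loadbalancing.googleapis.com|https|request_count?labelSelector=resource.labels.forwarding_rule_name%3Dk8s-fws-default-serviceb--3a908157de956ba7" | jq .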
Comments about additional information:
1) 83m is the Kubernetes way of writing 0.083 (read as 83 "milli-units").
2) In your HorizontalPodAutoscaler definition, you use targetAverageValue, so if multiple targets report this metric, the HPA calculates their average; 83m might therefore be an average of multiple targets. To make sure you use only the metric of a single target, you can use the targetValue field (see the API reference).
3) Not sure why the items: [] array in the API response is empty. The documentation mentions that after sampling, the data is not visible for 210 seconds... You could try making the API request when the HPA is not running.
Thank you very much for your detailed response.
When using the metricSelector to select the specific forwarding_rule_name, we need to use the exact forwarding_rule_name as defined by the ingress:
metricSelector:
  matchLabels:
    resource.labels.forwarding_rule_name: k8s-fws-default-serviceb--3a908157de956ba7
$ kubectl describe ingress
Name: serviceb
...
Annotations:
  ingress.kubernetes.io/https-forwarding-rule: k8s-fws-default-serviceb--9bfb478c0886702d
  ...
  kubernetes.io/ingress.allow-http: false
  kubernetes.io/ingress.global-static-ip-name: static-ip
The problem is that the suffix of the forwarding_rule_name (3a908157de956ba7) changes for every deployment and is created dynamically on Ingress creation:
k8s-fws-default-serviceb--3a908157de956ba7
We have a fully automated deployment using Helm, and, as such, when the HPA is created, we don't know what the forwarding_rule_name will be.
And it seems that matchLabels does not accept regular expressions, or else we would simply do something like:
metricSelector:
  matchLabels:
    resource.labels.forwarding_rule_name: k8s-fws-default-serviceb--*
I've tried several approaches, all without success:
Use Annotations to force the forwarding_rule_name
Use a different matchLabel, such as backend_target_name
Obtain the forwarding_rule_name using a command, so I can insert it later in the YAML file.
Use Annotations to force the forwarding_rule_name:
When creating the Ingress, I can use specific annotations to change the default behavior or define specific values, for example in Ingress.yaml:
annotations:
  kubernetes.io/ingress.global-static-ip-name: static-ip
I tried to use the https-forwarding-rule annotation to force a specific "static" name, but this didn't work:
annotations:
  ingress.kubernetes.io/https-forwarding-rule: some_name

annotations:
  kubernetes.io/https-forwarding-rule: some_name
Use a different matchLabel, such as backend_target_name:
metricSelector:
  matchLabels:
    resource.labels.backend_target_name: serviceb
Also failed.
Obtain the forwarding_rule_name using a command
When executing the following command, I get the list of forwarding rules, but for all clusters, and according to the documentation it is not possible to filter by cluster:
gcloud compute forwarding-rules list
NAME IP_ADDRESS IP_PROTOCOL TARGET
k8s-fws-default-serviceb--4e1c268b39df8462 xx TCP k8s-tps-default-serviceb--4e1c268b39df8462
k8s-fws-default-serviceb--9bfb478c0886702d xx TCP k8s-tps-default-serviceb--9bfb478c0886702d
Is there any way for me to select the resource I need, in order to get the request count metric?
It seems everything was OK with my code, but there is a time delay (approx. 10 minutes) before the request_count metric becomes available. After this period, the metric is computed and available:
$ kubectl get hpa ServiceB
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
ServiceB Deployment/ServiceB 83m/100 (avg), 1%/80% 1 3 1 18m
Now, regarding the loadbalancing.googleapis.com|https|request_count metric, I don't understand how it is being presented. What does 83m mean?
According to Google documentation for Load balancing metrics:
https/request_count Request count
DELTA, INT64, 1
GA
The number of requests served by HTTP/S load balancer. Sampled every 60
seconds. After sampling, data is not visible for up to 210 seconds.
According to Metric Details:
In a DELTA metric, each data point represents the change in a value
over the time interval. For example, the number of service requests
received since the previous measurement would be a delta metric.
I made one single request to the service, so I was expecting a value of 1, and I can't understand what 83m means.
Another possibility could be that I'm not using the correct metric.
I selected the loadbalancing.googleapis.com|https|request_count metric, assuming it would provide the number of requests served by the service via the load balancer.
Isn't that exactly the information that the loadbalancing.googleapis.com|https|request_count metric provides?
Regarding the above comment, when executing:
kubectl get --raw="/apis/external.metrics.k8s.io/v1beta1/namespaces/default/pubsub.googleapis.com|subscription|num_undelivered_messages" | jq
I get the correct data:
...
{
  "metricName": "pubsub.googleapis.com|subscription|num_undelivered_messages",
  "metricLabels": {
    "resource.labels.project_id": "project-id",
    "resource.labels.subscription_id": "subscription_id",
    "resource.type": "pubsub_subscription"
  },
  "timestamp": "2019-10-22T15:39:58Z",
  "value": "4"
}
...
but, when executing:
kubectl get --raw="/apis/external.metrics.k8s.io/v1beta1/namespaces/default/loadbalancing.googleapis.com|https|request_count" | jq
I get nothing back:
{ "kind": "ExternalMetricValueList", "apiVersion":
"external.metrics.k8s.io/v1beta1", "metadata": {
"selfLink": >"/apis/external.metrics.k8s.io/v1beta1/namespaces/default/loadbalancing.googleapis.com%7Chttps%7Crequest_count"
}, "items": [] }

How to use custom metric with specific filter in Horizontal Pod Autoscaling

I am trying to set up HPA for the ingress controller based on the custom metric nginx_ingress_controller_nginx_process_connections_total.
But while fetching the metrics from localhost:10254/metrics, I can see three such series, distinguished by the state label, as follows:
# HELP nginx_ingress_controller_nginx_process_connections_total total number of connections with state {active, accepted, handled}
# TYPE nginx_ingress_controller_nginx_process_connections_total counter
nginx_ingress_controller_nginx_process_connections_total{controller_class="nginx",controller_namespace="ingress-nginx",controller_pod="nginx-ingress-controller-7dddd-mssssf",state="accepted"} 479707
nginx_ingress_controller_nginx_process_connections_total{controller_class="nginx",controller_namespace="ingress-nginx",controller_pod="nginx-ingress-controller-7dddd-mssssf",state="active"} 3
nginx_ingress_controller_nginx_process_connections_total{controller_class="nginx",controller_namespace="ingress-nginx",controller_pod="nginx-ingress-controller-7dddd-mssssf",state="handled"} 479707
Out of these, I want to use the metric below for the HPA:
nginx_ingress_controller_nginx_process_connections_total{controller_class="nginx",controller_namespace="ingress-nginx",controller_pod="nginx-ingress-controller-7dddd-mssssf",state="active"}
How can I select the specified series from these different values? My HPA YAML file is given below.
kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2beta1
metadata:
  name: ingress-hpa
spec:
  scaleTargetRef:
    kind: Deployment
    name: nginx-ingress-controller
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metricName: <I need to set the custom metric here>
      targetAverageValue: 10000
You can use HPA with custom metrics. You need to expose a metrics endpoint in the pod so the metric can be scraped, and also set up Prometheus and a custom metrics API server (such as prometheus-adapter).
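To address the filtering part of the question: the selection of the state="active" series is usually done in the prometheus-adapter rule rather than in the HPA itself. A minimal sketch, assuming a prometheus-adapter deployment configured like the one in the first question (the rule and the resulting metric name are illustrative, not an existing configuration):

rules:
  custom:
  # expose only the "active" connections series as a per-pod metric
  - seriesQuery: 'nginx_ingress_controller_nginx_process_connections_total{state="active"}'
    resources:
      overrides:
        controller_namespace: {resource: "namespace"}
        controller_pod: {resource: "pod"}
    name:
      matches: "^(.*)_connections_total"
      as: "${1}_active_connections"
    # keep the state filter in the query so only the active series is summed
    metricsQuery: 'sum(<<.Series>>{<<.LabelMatchers>>,state="active"}) by (<<.GroupBy>>)'

The HPA from the question would then reference the renamed metric:

metrics:
- type: Pods
  pods:
    metricName: nginx_ingress_controller_nginx_process_active_connections
    targetAverageValue: 10000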