Kubernetes HPA fails to detect a successfully published custom metric from Stackdriver - kubernetes

I'm trying to scale a Kubernetes Deployment using a HorizontalPodAutoscaler, which listens to a custom metrics through Stackdriver.
I'm having a GKE cluster, with a Stackdriver adapter enabled.
I'm able to publish the custom metric type to Stackdriver, and following is the way it's being displayed in Stackdriver's Metric Explorer.
This is how I have defined my HPA:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
name: example-hpa
spec:
minReplicas: 1
maxReplicas: 10
metrics:
- type: External
external:
metricName: custom.googleapis.com|worker_pod_metrics|baz
targetValue: 400
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: test-app-group-1-1
After successfully creating example-hpa, executing kubectl get hpa example-hpa, always shows TARGETS as <unknown>, and never detects the value from custom metrics.
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
example-hpa Deployment/test-app-group-1-1 <unknown>/400 1 10 1 18m
I'm using a Java client which runs locally to publish my custom metrics.
I have given the appropriate resource labels as mentioned here (hard coded - so that it can run without a problem in local environment). I have followed this document to create the Java client.
private static MonitoredResource prepareMonitoredResourceDescriptor() {
Map<String, String> resourceLabels = new HashMap<>();
resourceLabels.put("project_id", "<<<my-project-id>>>);
resourceLabels.put("pod_id", "<my pod UID>");
resourceLabels.put("container_name", "");
resourceLabels.put("zone", "asia-southeast1-b");
resourceLabels.put("cluster_name", "my-cluster");
resourceLabels.put("namespace_id", "mynamespace");
resourceLabels.put("instance_id", "");
return MonitoredResource.newBuilder()
.setType("gke_container")
.putAllLabels(resourceLabels)
.build();
}
What am I doing wrong in the above-mentioned steps please? Thank you in advance for any answers provided!
EDIT [RESOLVED]:
I think I have had some misconfigurations, since kubectl describe hpa [NAME] --v=9 showed me some 403 status code, as well as I was using type: External instead of type: Pods (Thanks MWZ for your answer, pointing out this mistake).
I managed to fix it by creating a new project, a new service account, and a new GKE cluster (basically everything from the beginning again). Then I changed my yaml file as follows, exactly as this document explains.
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
name: test-app-group-1-1
namespace: default
spec:
scaleTargetRef:
apiVersion: apps/v1beta1
kind: Deployment
name: test-app-group-1-1
minReplicas: 1
maxReplicas: 5
metrics:
- type: Pods # Earlier this was type: External
pods: # Earlier this was external:
metricName: baz # metricName: custom.googleapis.com|worker_pod_metrics|baz
targetAverageValue: 20
I'm now exporting as custom.googleapis.com/baz, and NOT as custom.googleapis.com/worker_pod_metrics/baz. Also, now I'm explicitly specifying the namespace for my HPA in the yaml.

Since you can see your custom metric in Stackdriver GUI I'm guessing metrics are correctly exported. Based on Autoscaling Deployments with Custom Metrics I believe you wrongly defined metric to be used by HPA to scale the deployment.
Please try using this YAML:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
name: example-hpa
spec:
minReplicas: 1
maxReplicas: 10
metrics:
- type: Pods
pods:
metricName: baz
targetAverageValue: 400
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: test-app-group-1-1
Please have in mind that:
The HPA uses the metrics to compute an average and compare it to the
target average value. In the application-to-Stackdriver export
example, a Deployment contains Pods that export metric. The following
manifest file describes a HorizontalPodAutoscaler object that scales a
Deployment based on the target average value for the metric.
Troubleshooting steps described on the page above can also be useful.
Side-note
Since above HPA is using beta API autoscaling/v2beta1 I got error when running kubectl describe hpa [DEPLOYMENT_NAME]. I ran kubectl describe hpa [DEPLOYMENT_NAME] --v=9 and got response in JSON.

It is a good practice to put some unique labels to target your metrics. Right now, based on metrics labelled in your java client, only pod_id looks unique which can't be used due to its stateless nature.
So, I would suggest you try introducing a deployment/metrics wide unqiue identifier.
resourceLabels.put("<identifier>", "<could-be-deployment-name>");
After this, you can try modifying your HPA with something similar to following:
kind: HorizontalPodAutoscaler
metadata:
name: example-hpa
spec:
minReplicas: 1
maxReplicas: 10
metrics:
- type: External
external:
metricName: custom.googleapis.com|worker_pod_metrics|baz
metricSelector:
matchLabels:
# define labels to target
metric.labels.identifier: <deployment-name>
# scale +1 whenever it crosses multiples of mentioned value
targetAverageValue: "400"
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: test-app-group-1-1
Apart from this, this setup has no issues and should work smooth.
Helper command to see what metrics are exposed to HPA :
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/custom.googleapis.com|worker_pod_metrics|baz" | jq

Related

FailedGetPodsMetric: for HPA autoscaling

I am trying to autoscale using custom metrics, with metric type "http_request". My following command is showing correct output:
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/" | jq
Below is my hpa.yaml file:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
name: podinfo
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: podinfo
minReplicas: 2
maxReplicas: 10
metrics:
- type: Pods
pods:
metricName: http_requests
targetAverageValue: 1
but my scaling is failing due to
the HPA was unable to compute the replica count:
unable to get metric http_requests: unable to fetch metrics from custom metrics API: an error on the server`
("Internal Server Error: \"/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/%!A(MISSING)/http_requests?labelSelector=app%!D(MISSING)podinfo\": the server could not find the requested resource")
has prevented the request from succeeding (get pods.custom.metrics.k8s.io *)
Please help me out in this :)
Seems like you are missing pods in your cluster that match the provided deployment specification. Can you check if your podinfo deployment is running? And that it has healthy pods in it?
The command works because you're only checking the availability of the metrics endpoint. This simply implies that the endpoint is live to start providing metrics, doesn't guarantee that you will receive metrics (without any resources).

What is the unit of targetAverageValue for external metrics kubernetes.io|container|accelerator|duty_cycle for horizontal pod autoscaling?

I referred this stackoverflow question to set up my HPA(Horizontal Pod Autoscaler) for google kubernetes engine(gke) workload. According to the details of that question and the details specified here I mentioned my targetAverageValue to be 50 which should be considered 50% but when I run the command kubectl describe hpa this is the line I notice in the logs
Metrics: ( current / target ) "kubernetes.io|container|accelerator|duty_cycle" (target average value): 33500m / 50
This is my hpa yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
name: gpu-metric
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: parabole-dj-u1
minReplicas: 1
maxReplicas: 5
metrics:
- type: External
external:
metricName: kubernetes.io|container|accelerator|duty_cycle
targetAverageValue: 50
It seems to be measuring using some other unit. What then should be my targetAverageValue if I want it to autoscale at 50% duty_cycle?
Adding the screenshot of the duty cycle metric from the portal like #Alberto Pau asked duty_cycle image
Your configuration is correct, HPA always shows in the mili units.
The current utilization is probably 33.5%, just divide the number with the "m" by 1000 and you get the percentages.

Is it possible to set the name of an external metric for a HorizontalPodAutoscaler from a configmap? GKE

I am modifying a deployment which autoscales using a HorizontalPodAutoscaler (HPA). This deployment is part of a pipeline in which workers read messages from pubsub subscriptions, do some work and publish to the next topic. Right now I use a configmap to define the pipeline for the deployments (the configmap contains input subscription and output topics). The HPA autoscales based on the number of messages on the input subscription. I would like to be able to pull the subscription name for the HPA from a configmap if possible? Is there a way to do this?
example HPA:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
name: my-deployment-hpa
namespace: default
labels:
name: my-deployment-hpa
spec:
minReplicas: 1
maxReplicas: 10
metrics:
- external:
metricName: pubsub.googleapis.com|subscription|num_undelivered_messages
metricSelector:
matchLabels:
resource.labels.subscription_id: "$INPUT_SUBSCRIPTION"
targetAverageValue: "2"
type: External
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: my-deployment
The value from the HPA currently $INPUT_SUBSCRIPTION could ideally come from a configmap.
Posting this answer as a community wiki for a better visibility as well as the answer was provided in the comments.
Answering the question from the post:
I would like to be able to pull the subscription name for the HPA from a configmap if possible? Is there a way to do this?
As pointed by user #Abdennour TOUMI there is no possibility to set the metric used by HPA with a ConfigMap:
Unfortunately, you cannot.. but you can using prometheus-adapter + HPA . Check this tuto: itnext.io/...
As for a manual workaround you could use a script that will extract needed metric name from the configMap and use a template to replace and apply new HPA.
With a configMap like:
apiVersion: v1
kind: ConfigMap
metadata:
name: example
data:
metric_name: "new_awesome_metric" # <-
not_needed: "only for example"
And following script:
#!/bin/bash
# variables
hpa_file_name="hpa.yaml"
configmap_name="example"
string_to_replace="PLACEHOLDER"
# extract the metric name used in a configmap
new_metric=$(kubectl get configmap $configmap_name -o json | jq '.data.metric_name')
# use the template to replace the $string_to_replace with your $new_metric and apply it
sed "s/$string_to_replace/$new_metric/g" $hpa_file_name | kubectl apply -f -
This script will need to have a hpa.yaml with the template to apply it as resource (example from question could be used with a change:
resource.labels.subscription_id: PLACEHOLDER
For more reference this HPA definition could be based on this guide:
Cloud.google.com: Kubernetes Engine: Tutorials: Autoscaling-metrics: PubSub

How to make HPA scale a deployment based on metrics produced by another deployment

What I am trying to achieve is creating a Horizontal Pod Autoscaler able to scale worker pods according to a custom metric produced by a controller pod.
I already have Prometheus scraping, Prometheus Adapater, Custom Metric Server fully operational and scaling the worker deployment with a custom metric my_controller_metric produced by the worker pods already works.
Now my workerpods don't produce this metric anymore, but the controller does.
It seems that the API autoscaling/v1 does not support this feature. I am able to specify the HPA with the autoscaling/v2beta1 API if necessary though.
Here is my spec for this HPA:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
name: my-worker-hpa
namespace: work
spec:
maxReplicas: 10
minReplicas: 1
scaleTargetRef:
apiVersion: extensions/v1beta1
kind: Deployment
name: my-worker-deployment
metrics:
- type: Object
object:
target:
kind: Deployment
name: my-controller-deployment
metricName: my_controller_metric
targetValue: 1
When the configuration is applied with kubectl apply -f my-worker-hpa.yml I get the message:
horizontalpodautoscaler "my-worker-hpa" configured
Though this message seems to be OK, the HPA does not work. Is this spec malformed?
As I said, the metric is available in the Custom Metric Server with a kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq . | grep my_controller_metric.
This is the error message from the HPA:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True SucceededGetScale the HPA controller was able to get the target's current scale
ScalingActive False FailedGetObjectMetric the HPA was unable to compute the replica count: unable to get metric my_controller_metric: Deployment on work my-controller-deployment/unable to fetch metrics from custom metrics API: the server could not find the metric my_controller_metric for deployments
Thanks!
In your case problem is HPA configuration: spec.metrics.object.target should also specify API version.
Putting apiVersion: extensions/v1beta1 under spec.metrics.object.target should fix it.
In addition, there is an open issue about better config validation in HPA: https://github.com/kubernetes/kubernetes/issues/60511

How to use custom metric with specific filter in Horizontal Pod Autoscaling

I am trying to setup HPA for Ingress Controller based on custom metric nginx_ingress_controller_nginx_process_connections_total.
But while fetching the metrics from localhost:10254/metrics, I could see three such metrics with filter as follows:
# HELP nginx_ingress_controller_nginx_process_connections_total total number of connections with state {active, accepted, handled}
# TYPE nginx_ingress_controller_nginx_process_connections_total counter
nginx_ingress_controller_nginx_process_connections_total{controller_class="nginx",controller_namespace="ingress-nginx",controller_pod="nginx-ingress-controller-7dddd-mssssf",state="accepted"} 479707
nginx_ingress_controller_nginx_process_connections_total{controller_class="nginx",controller_namespace="ingress-nginx",controller_pod="nginx-ingress-controller-7dddd-mssssf",state="active"} 3
nginx_ingress_controller_nginx_process_connections_total{controller_class="nginx",controller_namespace="ingress-nginx",controller_pod="nginx-ingress-controller-7dddd-mssssf",state="handled"} 479707
Out of these metrics, I want to use the below metric for HPA.
nginx_ingress_controller_nginx_process_connections_total{controller_class="nginx",controller_namespace="ingress-nginx",controller_pod="nginx-ingress-controller-7dddd-mssssf",state="active"}
How can I use the specified metric from these different values. My yaml file for HPA is given below.
kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2beta1
metadata:
name: ingress-hpa
spec:
scaleTargetRef:
kind: Deployment
name: nginx-ingress-controller
minReplicas: 3
maxReplicas: 10
metrics:
- type: Pods
pods:
metricName: <I need to set the custom metric here>
targetAverageValue: 10000
You can use HPA custom metrics. You need to expose endpoint in POD to fetch the metrics also setup Prometheus and custom metric api server.