How does the Kubernetes HPA (HorizontalPodAutoscaler) determine which Pod's metrics to use if multiple Pods have the same metric?

Suppose we have the HPA (HorizontalPodAutoscaler) below deployed in the demo namespace, and multiple Pods (POD-A, POD-B) in this namespace expose the same metric, "istio_requests_per_second". How does the HPA determine which Pod's "istio_requests_per_second" metric should be used? Or will every Pod with this metric be evaluated against the HPA target?
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: httpbin
spec:
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Pods
    pods:
      metric:
        name: istio_requests_per_second
      target:
        type: AverageValue
        averageValue: "10"
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: httpbin

If you're using Prometheus, then it's the adapter that correlates the Kubernetes Pod name with the metric value to return. Basically, the HPA asks the Prometheus Adapter for the metric istio_requests_per_second by calling /apis/custom.metrics.k8s.io/v1beta1/namespaces/myNamespace/pods/mypod. The adapter takes that request and looks at its configured rules to decide what it should query for.
https://github.com/kubernetes-sigs/prometheus-adapter/blob/master/docs/config-walkthrough.md
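As a rough illustration of the request the answer describes, the sketch below builds a Custom Metrics API path for a per-Pod metric. The helper name `pods_metric_path` is hypothetical, and the path layout (a Pod segment followed by the metric name, with `*` meaning all Pods) is an assumption based on the adapter walkthrough linked above; check it against your adapter version before relying on it.

```python
def pods_metric_path(namespace, metric, pod="*", version="v1beta1"):
    """Build a Custom Metrics API path for a per-Pod metric.

    pod="*" asks for the metric on every Pod in the namespace (assumed
    convention); pass a Pod name to query a single Pod.
    """
    return (
        f"/apis/custom.metrics.k8s.io/{version}"
        f"/namespaces/{namespace}/pods/{pod}/{metric}"
    )


# Values taken from the question: the demo namespace and the Istio metric.
print(pods_metric_path("demo", "istio_requests_per_second"))
```

A path like this can be passed to `kubectl get --raw` to see what the adapter returns for each Pod.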

Based on my test, I think the HPA uses 'scaleTargetRef' to determine which Pods' metrics should be used, pulls those metrics from the metrics API, and evaluates them against the target config.

As per Kubernetes documentation:
For object metrics and external metrics, a single metric is fetched, which describes the object in question. This metric is compared to the target value, to produce a ratio as above. In the autoscaling/v2 API version, this value can optionally be divided by the number of Pods before the comparison is made.
For a Pods metric like the one in the question, the HPA calculates the ratio from the mean of the metric across the target Pods.
References:
1. https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#how-does-a-horizontalpodautoscaler-work
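To make the averaging concrete, here is a minimal sketch of the calculation for a Pods metric with an AverageValue target, assuming the formula from the documentation linked above (desired = ceil(current replicas x current / target), with a default tolerance of 0.1). The function name and the per-Pod values are hypothetical, and the sketch ignores minReplicas/maxReplicas, unready Pods, and missing metrics.

```python
import math


def desired_replicas(per_pod_values, target_average, tolerance=0.1):
    """Replica count for a Pods metric with an AverageValue target.

    per_pod_values holds one metric value per Pod selected through
    scaleTargetRef (hypothetical input for this sketch).
    """
    current_replicas = len(per_pod_values)
    average = sum(per_pod_values) / current_replicas
    ratio = average / target_average
    # No scaling when the ratio is within the tolerance of 1.0.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


# Two Pods reporting 12 and 18 requests per second against a target of 10.
print(desired_replicas([12.0, 18.0], target_average=10))  # 3
```

Every Pod behind the scale target contributes to the mean; no single Pod's value is picked.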

Related

Horizontal Pod Autoscaler (HPA) custom metrics with Prometheus Adapter (How are units defined?)

I'm testing out HPA with custom metrics from my application, exposed to K8s using the Prometheus Adapter.
My app exposes a "jobs_executing" custom metric, a numeric gauge (prometheus-client, Golang) reporting the number of jobs being executed by the app (pod).
To use this in the HPA, here is how my HPA configuration looks:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: jobs_executing
      target:
        type: AverageValue
        averageValue: 5
I want the autoscaler to scale my pods when the average number of jobs executing across all pods reaches 5. This works, but sometimes the HPA status shows values like this:
NAME            REFERENCE                          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
my-autoscaler   Deployment/my-scaling-sample-app   7700m/5   1         10        10         38m
Here the target shows up as "7700m/5" even though the average number of jobs executing overall was 7.7. This makes the HPA scale out aggressively. I don't understand why it puts "7700m" in the current target value.
My question is whether there is a way to define a floating-point value in the HPA so that a normal number is not confused with 7700m (a CPU unit?),
or what am I missing? Thank you
From the docs:
https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/#appendix-quantities
All metrics in the HorizontalPodAutoscaler and metrics APIs are specified using a special whole-number notation known in Kubernetes as a quantity. For example, the quantity 10500m would be written as 10.5 in decimal notation. The metrics APIs will return whole numbers without a suffix when possible, and will generally return quantities in milli-units otherwise. This means you might see your metric value fluctuate between 1 and 1500m, or 1 and 1.5 when written in decimal notation.
So it does not seem that you can change the unit of measurement the HPA uses; it always reports the generic Quantity. Here 7700m is simply 7.7 written in milli-units.
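For reading such values by hand or in a script, a minimal sketch of the milli-unit conversion follows. The helper `quantity_to_decimal` is hypothetical and deliberately handles only plain numbers and the `m` suffix seen in the HPA output; other Quantity suffixes (for example `k` or `Mi`) are out of scope here.

```python
from decimal import Decimal


def quantity_to_decimal(quantity: str) -> Decimal:
    """Convert a plain or milli-suffixed quantity such as "7700m" or "5"."""
    if quantity.endswith("m"):
        # "m" means thousandths: 7700m is 7700 / 1000.
        return Decimal(quantity[:-1]) / Decimal(1000)
    return Decimal(quantity)


# The TARGETS column from the question, split into current and target.
current, target = "7700m/5".split("/")
print(quantity_to_decimal(current))  # 7.7
print(quantity_to_decimal(target))   # 5
```

So "7700m/5" reads as a current average of 7.7 against a target of 5, which is why the HPA keeps adding replicas.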

Kubernetes HPA based on available healthy pods

Is it possible to have the HPA scale based on the number of available running pods?
I have set up a readiness probe that cuts out a pod based on its internal state (idle, working, busy). When a pod is 'busy', it no longer receives new requests, but its CPU and memory demands are low.
I don't want to scale based on cpu, mem, or other metrics.
Seeing as the readiness probe removes it from active service, can I scale based on the average number of active (not busy) pods? When that number drops below a certain point more pods are scaled.
TIA for any suggestions.
You can create a custom metric, such as the number of busy pods, for the HPA.
That is, the application should emit a metric value when it is busy, and you then use that metric in a HorizontalPodAutoscaler.
Something like this:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: custom-metric-sd
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: custom-metric-sd
  minReplicas: 1
  maxReplicas: 20
  metrics:
  - type: Pods
    pods:
      metricName: busy-pods
      targetAverageValue: 4
Here is another reference for HPA with custom metrics.

How to auto-scale Kubernetes Pods based on number of tasks in celery task queue?

I have a celery worker deployed on Kubernetes pods which executes a task (not very CPU intensive but takes some time to complete due to some HTTP calls). Is there any way to autoscale the pods in K8s based on the number of tasks in the task queue?
Yes, by using the Kubernetes metrics registry and Horizontal Pod Autoscaler.
First, you need to collect the "queue length" metric from Celery and expose it through one of the Kubernetes metric APIs. You can do this with a Prometheus-based pipeline:
1. Since Celery doesn't expose Prometheus metrics, you need to install an exporter that exposes some information about Celery (including the queue length) as Prometheus metrics. For example, this exporter.
2. Install Prometheus in your cluster and configure it to collect the metric corresponding to the task queue length from the Celery exporter.
3. Install the Prometheus Adapter in your cluster and configure it to expose the "queue length" metric through the Custom Metrics API by pulling its value from Prometheus.
Now you can configure the Horizontal Pod Autoscaler to query this metric from the Custom Metrics API and autoscale your app based on it.
For example, to scale the app between 1 and 10 replicas based on a target value for the queue length of 5:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Object
    object:
      metric:
        name: mycelery_queue_length
      target:
        type: Value
        value: 5
      describedObject:
        apiVersion: apps/v1
        kind: Deployment
        name: mycelery
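To see what an Object metric with a Value target implies for the queue-length example, here is a small sketch of the replica calculation. It assumes the standard HPA formula (desired = ceil(current replicas x metric value / target value)), clamped to minReplicas and maxReplicas; the function name and the queue lengths are hypothetical, and the tolerance and stabilization behaviour of the real controller are left out.

```python
import math


def desired_replicas_for_object(current_replicas, metric_value, target_value,
                                min_replicas=1, max_replicas=10):
    """Replica count for a single Object metric with a Value target."""
    desired = math.ceil(current_replicas * metric_value / target_value)
    # Clamp to the bounds configured on the HPA.
    return max(min_replicas, min(max_replicas, desired))


# 2 workers and 12 queued tasks against a target queue length of 5.
print(desired_replicas_for_object(2, 12, 5))  # 5
# A long queue is capped at maxReplicas.
print(desired_replicas_for_object(3, 40, 5))  # 10
```

Because the metric describes one object (the queue) rather than each Pod, the single value is compared directly with the target.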
There are two parts to solving this problem: you need to collect the metrics from Celery and make them available to the Kubernetes API (through the Custom Metrics API). The HorizontalPodAutoscaler can then query those metrics in order to scale on them.
You can use Prometheus (for example) to collect metrics from Celery. Then you can expose the metrics to Kubernetes with the Prometheus Adapter, which makes the metrics available in Prometheus available to Kubernetes.
You can now define a HorizontalPodAutoscaler for your application:
kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2alpha1
metadata:
  name: sample-metrics-app-hpa
spec:
  scaleTargetRef:
    kind: Deployment
    name: sample-metrics-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Object
    object:
      target:
        kind: Service
        name: sample-metrics-app
      metricName: celery_queue_length
      targetValue: 100

Kubernetes HPA fails to detect a successfully published custom metric from Stackdriver

I'm trying to scale a Kubernetes Deployment using a HorizontalPodAutoscaler that watches a custom metric through Stackdriver.
I have a GKE cluster with the Stackdriver adapter enabled.
I'm able to publish the custom metric type to Stackdriver, and it is displayed in Stackdriver's Metrics Explorer.
This is how I have defined my HPA:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa
spec:
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metricName: custom.googleapis.com|worker_pod_metrics|baz
      targetValue: 400
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-app-group-1-1
After successfully creating example-hpa, executing kubectl get hpa example-hpa always shows TARGETS as <unknown> and never picks up the value from the custom metric.
NAME          REFERENCE                       TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
example-hpa   Deployment/test-app-group-1-1   <unknown>/400   1         10        1          18m
I'm using a Java client which runs locally to publish my custom metrics.
I have given the appropriate resource labels as mentioned here (hard coded - so that it can run without a problem in local environment). I have followed this document to create the Java client.
import com.google.api.MonitoredResource;
import java.util.HashMap;
import java.util.Map;

private static MonitoredResource prepareMonitoredResourceDescriptor() {
    // Labels for the gke_container monitored resource (hard-coded for local runs).
    Map<String, String> resourceLabels = new HashMap<>();
    resourceLabels.put("project_id", "<<<my-project-id>>>");
    resourceLabels.put("pod_id", "<my pod UID>");
    resourceLabels.put("container_name", "");
    resourceLabels.put("zone", "asia-southeast1-b");
    resourceLabels.put("cluster_name", "my-cluster");
    resourceLabels.put("namespace_id", "mynamespace");
    resourceLabels.put("instance_id", "");
    return MonitoredResource.newBuilder()
            .setType("gke_container")
            .putAllLabels(resourceLabels)
            .build();
}
What am I doing wrong in the above-mentioned steps please? Thank you in advance for any answers provided!
EDIT [RESOLVED]:
I think I had some misconfigurations, since kubectl describe hpa [NAME] --v=9 showed a 403 status code, and I was also using type: External instead of type: Pods (thanks MWZ for your answer pointing out this mistake).
I managed to fix it by creating a new project, a new service account, and a new GKE cluster (basically starting everything from scratch). Then I changed my YAML file as follows, exactly as this document explains.
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: test-app-group-1-1
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: test-app-group-1-1
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Pods                 # Earlier this was type: External
    pods:                      # Earlier this was external:
      metricName: baz          # metricName: custom.googleapis.com|worker_pod_metrics|baz
      targetAverageValue: 20
I'm now exporting as custom.googleapis.com/baz, and NOT as custom.googleapis.com/worker_pod_metrics/baz. Also, now I'm explicitly specifying the namespace for my HPA in the yaml.
Since you can see your custom metric in the Stackdriver GUI, I'm guessing the metrics are exported correctly. Based on Autoscaling Deployments with Custom Metrics, I believe you incorrectly defined the metric the HPA should use to scale the deployment.
Please try using this YAML:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa
spec:
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metricName: baz
      targetAverageValue: 400
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-app-group-1-1
Please have in mind that:
The HPA uses the metrics to compute an average and compare it to the
target average value. In the application-to-Stackdriver export
example, a Deployment contains Pods that export metric. The following
manifest file describes a HorizontalPodAutoscaler object that scales a
Deployment based on the target average value for the metric.
Troubleshooting steps described on the page above can also be useful.
Side-note
Since the above HPA uses the beta API autoscaling/v2beta1, I got an error when running kubectl describe hpa [DEPLOYMENT_NAME]. I ran kubectl describe hpa [DEPLOYMENT_NAME] --v=9 and got the response in JSON.
It is good practice to attach unique labels so you can target your metrics. Right now, based on the labels set in your Java client, only pod_id looks unique, and it can't be used because Pods are stateless and their IDs change.
So I would suggest introducing a deployment-wide (or metric-wide) unique identifier.
resourceLabels.put("<identifier>", "<could-be-deployment-name>");
After this, you can try modifying your HPA with something similar to following:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa
spec:
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: External
    external:
      metricName: custom.googleapis.com|worker_pod_metrics|baz
      metricSelector:
        matchLabels:
          # define labels to target
          metric.labels.identifier: <deployment-name>
      # scale +1 whenever it crosses multiples of mentioned value
      targetAverageValue: "400"
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-app-group-1-1
Apart from this, the setup has no issues and should work smoothly.
Helper command to see which metrics are exposed to the HPA:
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/custom.googleapis.com|worker_pod_metrics|baz" | jq
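The command above uses the metric name with `|` in place of `/`, the same form the HPA manifests in this thread use. A minimal sketch of that conversion, with a hypothetical helper name and the path layout copied from the command above:

```python
def external_metric_path(stackdriver_metric, namespace="default"):
    """Build the External Metrics API path for a Stackdriver metric type.

    Slashes in the metric type are replaced with "|", matching the
    metricName used in the HPA manifests in this thread.
    """
    name = stackdriver_metric.replace("/", "|")
    return f"/apis/external.metrics.k8s.io/v1beta1/namespaces/{namespace}/{name}"


print(external_metric_path("custom.googleapis.com/worker_pod_metrics/baz"))
```

The printed path is the argument passed to `kubectl get --raw` in the helper command.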

How to use custom metric with specific filter in Horizontal Pod Autoscaling

I am trying to set up HPA for the Ingress Controller based on the custom metric nginx_ingress_controller_nginx_process_connections_total.
But when fetching the metrics from localhost:10254/metrics, I see three such metrics, distinguished by the state label:
# HELP nginx_ingress_controller_nginx_process_connections_total total number of connections with state {active, accepted, handled}
# TYPE nginx_ingress_controller_nginx_process_connections_total counter
nginx_ingress_controller_nginx_process_connections_total{controller_class="nginx",controller_namespace="ingress-nginx",controller_pod="nginx-ingress-controller-7dddd-mssssf",state="accepted"} 479707
nginx_ingress_controller_nginx_process_connections_total{controller_class="nginx",controller_namespace="ingress-nginx",controller_pod="nginx-ingress-controller-7dddd-mssssf",state="active"} 3
nginx_ingress_controller_nginx_process_connections_total{controller_class="nginx",controller_namespace="ingress-nginx",controller_pod="nginx-ingress-controller-7dddd-mssssf",state="handled"} 479707
Out of these metrics, I want to use the below metric for HPA.
nginx_ingress_controller_nginx_process_connections_total{controller_class="nginx",controller_namespace="ingress-nginx",controller_pod="nginx-ingress-controller-7dddd-mssssf",state="active"}
How can I select this specific metric out of the different values? My YAML file for the HPA is given below.
kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2beta1
metadata:
  name: ingress-hpa
spec:
  scaleTargetRef:
    kind: Deployment
    name: nginx-ingress-controller
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metricName: <I need to set the custom metric here>
      targetAverageValue: 10000
You can use HPA custom metrics. You need to expose an endpoint in the Pod from which the metrics can be fetched, and also set up Prometheus and a custom metrics API server.
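The selection the question asks for amounts to filtering the series by its state label, which is what a `state="active"` label matcher does in a Prometheus query. The sketch below only illustrates that filtering on the scraped text from the question; `select_samples` is a hypothetical helper, not part of any adapter, and where the filter is actually configured depends on the custom metrics API server you deploy.

```python
import re

METRIC = "nginx_ingress_controller_nginx_process_connections_total"

# Sample lines copied from the /metrics output in the question.
SAMPLE = """\
# TYPE nginx_ingress_controller_nginx_process_connections_total counter
nginx_ingress_controller_nginx_process_connections_total{controller_class="nginx",controller_namespace="ingress-nginx",controller_pod="nginx-ingress-controller-7dddd-mssssf",state="accepted"} 479707
nginx_ingress_controller_nginx_process_connections_total{controller_class="nginx",controller_namespace="ingress-nginx",controller_pod="nginx-ingress-controller-7dddd-mssssf",state="active"} 3
nginx_ingress_controller_nginx_process_connections_total{controller_class="nginx",controller_namespace="ingress-nginx",controller_pod="nginx-ingress-controller-7dddd-mssssf",state="handled"} 479707
"""

LINE = re.compile(r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$')
LABEL = re.compile(r'(\w+)="([^"]*)"')


def select_samples(text, metric, **wanted):
    """Return (labels, value) pairs for `metric` whose labels match `wanted`."""
    selected = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE.match(line)
        if not match or match.group("name") != metric:
            continue
        labels = dict(LABEL.findall(match.group("labels")))
        if all(labels.get(key) == value for key, value in wanted.items()):
            selected.append((labels, float(match.group("value"))))
    return selected


for labels, value in select_samples(SAMPLE, METRIC, state="active"):
    print(labels["controller_pod"], value)
```

Only the series with state="active" survives the filter, which is the value the HPA should be fed for this use case.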