I'm trying to set up custom metrics with a HorizontalPodAutoscaler on a 1.6.1 alpha GKE cluster.
According to https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#prerequisites I need to set --horizontal-pod-autoscaler-use-rest-clients on kube-controller-manager to enable metrics collection. From GKE, it's not clear whether it's possible to set flags on kube-controller-manager. Any ideas?
Has anyone gotten custom metrics working with HPA on GKE?
You can't manipulate any of the Kubernetes cluster components directly in GKE (Google Container Engine); Google manages them for you. If you need that level of control, you may have to deploy your own Kubernetes cluster.
On GKE we have been supporting HPA with custom metrics since version 1.9. If you have a group of horizontally autoscaled pods inside your cluster each exporting a custom metric then you can set an average per pod target for that metric.
An example of that would be an autoscaled deployment of a frontend where each replica exports its current QPS. One could set the average target of QPS per frontend pod and use the HPA to scale the deployment up and down accordingly. You can find the documentation and a tutorial explaining how to set this up here: https://cloud.google.com/kubernetes-engine/docs/tutorials/custom-metrics-autoscaling
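As a minimal sketch of the per-pod average target described above, an HPA manifest could look like the following. The Deployment name `frontend`, the metric name `queries_per_second`, the replica bounds, and the target of 100 are all hypothetical. The `autoscaling/v2` API is what current clusters use; clusters from the 1.9 era used `autoscaling/v2beta1`, which has a different field layout.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: frontend                      # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: frontend                    # hypothetical Deployment to scale
  minReplicas: 2                      # assumed bounds
  maxReplicas: 10
  metrics:
    - type: Pods                      # metric exported by each pod of the target
      pods:
        metric:
          name: queries_per_second    # hypothetical custom metric name
        target:
          type: AverageValue          # average across all pods of the target
          averageValue: "100"         # assumed target QPS per pod
```

The HPA adds replicas when the average across pods rises above the target and removes them when it falls below.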
Kubernetes 1.10 becoming available on GKE will extend the support for custom metrics to include metrics not attached to any Kubernetes object. This will give you the ability to scale a deployment based on any metric listed here, for example the number of messages in a Google Pub/Sub queue.
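For illustration, scaling on a Pub/Sub backlog might be expressed roughly as below. This is a sketch that assumes the Stackdriver (Cloud Monitoring) metrics adapter is installed in the cluster; the Deployment name `pubsub-worker`, the subscription ID `my-subscription`, and the target of 2 undelivered messages per pod are hypothetical.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: pubsub-worker                 # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: pubsub-worker               # hypothetical Deployment to scale
  minReplicas: 1                      # assumed bounds
  maxReplicas: 5
  metrics:
    - type: External                  # metric not attached to any Kubernetes object
      external:
        metric:
          name: pubsub.googleapis.com|subscription|num_undelivered_messages
          selector:
            matchLabels:
              resource.labels.subscription_id: my-subscription   # hypothetical
        target:
          type: AverageValue          # backlog divided by the number of pods
          averageValue: "2"           # assumed target
```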
Related
I'm using Rancher and the monitoring plugin that installs Prometheus. As the cluster grows, Prometheus uses more and more CPU and memory to scrape and query data, to the point that it's the most resource-hungry pod in the cluster.
I noticed the UI shows "prometheis" (plural) and the workload is a StatefulSet, but as I understand it Prometheus doesn't work as a cluster. Can I just scale the set to more pods? What happens, and how does it work?
I can't find any information in the documentation.
No, you can't scale Prometheus by adding more pods; one approach (most common) is to set up federation to scale it up.
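With federation, each Prometheus server scrapes only part of the cluster, and a global server pulls selected series from them through their `/federate` endpoint. A minimal sketch of the scrape job on the global server could look like this; the two shard hostnames are hypothetical, and the `match[]` selector assumes the shards expose aggregated recording rules named `job:...`.

```yaml
scrape_configs:
  - job_name: federate
    honor_labels: true                # keep the labels set by the source servers
    metrics_path: /federate
    params:
      "match[]":
        - '{__name__=~"job:.*"}'      # assumed: pull only aggregated series
    static_configs:
      - targets:                      # hypothetical per-shard Prometheus servers
          - prometheus-shard-0.monitoring.svc:9090
          - prometheus-shard-1.monitoring.svc:9090
```

Pulling only aggregated series keeps the global server small; federating every raw series would just move the load rather than reduce it.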
I have:
deployments of services A and B in k8s
Prometheus stack
I want to scale service A when metric m1 of service B changes.
Solutions I found, none of which is quite suitable:
I can define HPA for service A with the following part of spec:
    - type: Object
      object:
        metric:
          name: m1
        describedObject:
          apiVersion: api/v1
          kind: Pod
          name: certain-pod-of-service-B
        current:
          value: 10k
Technically, it will work, but it doesn't suit the dynamic nature of k8s, since it is tied to one specific pod.
Also, I can't use a pods metric (metrics: - type: Pods pods:) in the HPA because it will request the m1 metric from the pods of service A (which obviously don't have it).
Define a custom metric in prometheus-adapter which queries the m1 metric from the pods of service B. That is more suitable, but it looks like a workaround because I already have the metric m1.
The same goes for external metrics.
I feel that I'm missing something, because this doesn't seem like an unrealistic case :)
So, please advise me: how do I scale one service by a metric of another in k8s?
I decided to provide a Community Wiki answer that may help other people facing a similar issue.
The Horizontal Pod Autoscaler is a Kubernetes feature that allows you to scale applications based on one or more monitored metrics.
As we can find in the Horizontal Pod Autoscaler documentation:
The Horizontal Pod Autoscaler automatically scales the number of Pods in a replication controller, deployment, replica set or stateful set based on observed CPU utilization (or, with custom metrics support, on some other application-provided metrics).
There are three groups of metrics that we can use with the Horizontal Pod Autoscaler:
resource metrics: predefined resource usage metrics (CPU and memory) of pods and nodes.
custom metrics: custom metrics associated with a Kubernetes object.
external metrics: custom metrics not associated with a Kubernetes object.
Any HPA target can be scaled based on the resource usage of the pods (or containers) in the scaling target. CPU utilization is a resource metric, and you can specify other resource metrics besides CPU (e.g. memory). This is the easiest and most basic method of scaling, but custom metrics or external metrics let us scale on something more specific.
There is one major difference between custom metrics and external metrics (see: Custom and external metrics for autoscaling workloads):
Custom metrics and external metrics differ from each other:
A custom metric is reported from your application running in Kubernetes.
An external metric is reported from an application or service not running on your cluster, but whose performance impacts your Kubernetes application.
All in all, in my opinion it is okay to use custom metrics in the case above; I did not find any other suitable way to accomplish this task.
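One way to avoid pinning the HPA to a single pod is an Object metric that describes the Service in front of B rather than one of its pods. The sketch below assumes the metrics adapter exposes `m1` on Service objects (for prometheus-adapter, that means the series carries a label mapped to the `service` resource); the names `service-a` and `service-b` and the replica bounds are hypothetical.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: service-a                     # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: service-a                   # the workload that gets scaled
  minReplicas: 1                      # assumed bounds
  maxReplicas: 10
  metrics:
    - type: Object
      object:
        metric:
          name: m1                    # metric from the question
        describedObject:              # the object the metric is read from
          apiVersion: v1
          kind: Service
          name: service-b             # hypothetical Service in front of B's pods
        target:
          type: Value
          value: 10k                  # threshold taken from the question
```

Note that in `autoscaling/v2` the threshold goes under `target`; `current` is a status field reported by the HPA, not something set in the spec.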
I am using GKE and have an application-app1(pod) which is exposed using NodePort and then put behind an ingress.
The ingress-controller has launched a GCP load balancer. Now, the requests coming on path /app1/ are routed to my application.
I launched the stackdriver-metrics adapter inside the cluster and then I configured an HPA which uses requests/second metrics from the load balancer. HPA gets the metrics from ExternalMetric for a particular backend name.
    - external:
        metricName: loadbalancing.googleapis.com|https|request_count
        metricSelector:
          matchLabels:
            resource.labels.backend_target_name: k8s-be-30048--my-backend
        targetAverageValue: 20
      type: External
Everything works perfectly. Here is the problem:
Some of the other apps running inside the Kubernetes cluster also call app1. They call it by its Kubernetes FQDN, app1.default.svc.cluster.local, and not via the load balancer route. That means these requests won't go through the ingress load balancer, so they are not counted by the HPA in any way.
So the total requests (Ct) arrive via the load balancer (C1) and via the FQDN (C2): Ct = C1 + C2. My guess is that the HPA will only take C1 into account and not Ct, so it will not scale my app correctly. For example, if Ct is 120 but C1 is 90, then the number of pods will be 3 but it should actually be 4.
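To make the arithmetic explicit: for a metric with an average-value target, the HPA's desired replica count is roughly the total metric value divided by the per-pod target, rounded up (ignoring the tolerance band and the min/max bounds). The numbers in the example correspond to an assumed target of 30 requests per second per pod, which is an illustration and not the value configured above.

$$
\text{desiredReplicas} = \left\lceil \frac{\text{total metric value}}{\text{target average value per pod}} \right\rceil
$$

$$
\left\lceil \frac{C_1}{30} \right\rceil = \left\lceil \frac{90}{30} \right\rceil = 3,
\qquad
\left\lceil \frac{C_t}{30} \right\rceil = \left\lceil \frac{120}{30} \right\rceil = 4
$$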
Am I wrong here to consider that requests coming via FQDN are not counted by the load balancer?
If those requests are not being counted, I think I will have to use something that counts requests at the pod level, something like a Prometheus middleware. Can you suggest anything else?
Answering following comment:
Yup, that's the obstruction. No way to forecast/relate the kind of traffic. Anyway, how would it help if it could be forecasted?
If it could be forecasted (for example, if it's always 70% external and 30% internal), you could adjust the scaling target to already account for the traffic that the load balancer metric isn't aware of.
Instead of collecting metrics from the load balancer itself which will not take into consideration the internal traffic, you can opt to use Custom Metrics (for example: queries per second).
Your application can report a custom metric to Cloud Monitoring. You can configure Kubernetes to respond to these metrics and scale your workload automatically. For example, you can scale your application based on metrics such as queries per second, writes per second, network performance, latency when communicating with a different application, or other metrics that make sense for your workload.
A custom metric can be selected for any of the following:
A particular node, Pod, or any Kubernetes object of any kind, including a CustomResourceDefinition (CRD).
The average value for a metric reported by all Pods in a Deployment
-- Cloud.google.com: Kubernetes Engine: Custom and external metrics: Custom metrics
There is official documentation on creating custom metrics:
Cloud.google.com: Monitoring: Custom metrics: Creating metrics
You can also look at the already available metrics in the Metrics Explorer.
You can also use multiple metrics when scaling up/down with HPA:
If you configure a workload to autoscale based on multiple metrics, HPA evaluates each metric separately and uses the scaling algorithm to determine the new workload scale based on each one. The largest scale is selected for the autoscale action.
-- Cloud.google.com: Kubernetes Engine: HorizontalPodAutoscaler
As more of a workaround, you could also scale on the CPU usage metric.
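Combining both ideas, a sketch of an HPA that evaluates the load balancer request count and CPU utilization together might look like this. It uses the `autoscaling/v2` field layout rather than the older one shown in the question; the replica bounds and the 60% CPU target are assumptions, and the CPU utilization target only works if the containers declare CPU requests.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app1
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app1                        # assumed to be a Deployment named app1
  minReplicas: 1                      # assumed bounds
  maxReplicas: 10
  metrics:
    - type: External                  # external traffic seen by the load balancer
      external:
        metric:
          name: loadbalancing.googleapis.com|https|request_count
          selector:
            matchLabels:
              resource.labels.backend_target_name: k8s-be-30048--my-backend
        target:
          type: AverageValue
          averageValue: "20"
    - type: Resource                  # also reflects load from in-cluster callers
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60      # assumed target, percent of requested CPU
```

Because the largest proposed scale wins, internal traffic that drives CPU up can still trigger a scale-up even though the load balancer never sees it.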
Additional resources:
Cloud.google.com: Kubernetes Engine: Tutorials: Autoscaling metrics: Custom metrics
Cloud.google.com: Kubernetes Engine: How to: Horizontal pod autoscaling
I have been trying to implement a Kubernetes HPA using metrics from Kafka-exporter. The HPA supports Prometheus, so we tried writing the metrics to a Prometheus instance. From there, we are unclear on the next steps. Is there an article that explains this in detail?
I followed https://medium.com/google-cloud/kubernetes-hpa-autoscaling-with-kafka-metrics-88a671497f07
for the same setup in GCP, where we used Stackdriver, and the implementation worked like a charm. But we are struggling with the on-premise setup, as Stackdriver needs to be replaced by Prometheus.
In order to scale based on custom metrics, Kubernetes needs an API it can query for those metrics. That API needs to implement the custom metrics interface.
So for Prometheus, you need to set up an API that exposes Prometheus metrics through the custom metrics API. Luckily, there already is an adapter.
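As a rough sketch, a prometheus-adapter rule that exposes a Kafka lag series through the custom metrics API could look like the following. The series name `kafka_consumergroup_lag` is an assumption about what the exporter publishes, and the rule assumes Prometheus attaches `namespace` and `pod` labels to the series when scraping.

```yaml
rules:
  - seriesQuery: 'kafka_consumergroup_lag{namespace!="",pod!=""}'   # assumed series
    resources:
      overrides:                      # map Prometheus labels to Kubernetes resources
        namespace: {resource: "namespace"}
        pod: {resource: "pod"}
    name:
      matches: "^(.*)$"
      as: "${1}"                      # expose the metric under its original name
    metricsQuery: 'sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'
```

The adapter fills in the `<<...>>` placeholders itself when the HPA asks for the metric on a given set of pods.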
When I implemented a Kubernetes HPA using metrics from Kafka-exporter, I had a few setbacks, which I solved by doing the following:
I deployed the kafka-exporter container as a sidecar to the pods I wanted to scale. I found that the HPA scales the pod it gets the metrics from.
I used annotations to make Prometheus scrape the metrics from the pods with the exporter.
Then I verified that the kafka-exporter metrics were reaching Prometheus. If they are not there, you can't advance any further.
I deployed the Prometheus adapter using its Helm chart. The adapter "translates" Prometheus metrics into the custom metrics API, which makes them visible to the HPA.
I made sure that the metrics were visible in k8s by executing kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 from one of the master nodes.
I created an HPA with the matching metric name.
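The last step could be sketched as below. The Deployment name `kafka-consumer`, the replica bounds, and the lag target of 1000 are hypothetical, and the metric name must match whatever the adapter actually lists under `/apis/custom.metrics.k8s.io/v1beta1` (here assumed to be `kafka_consumergroup_lag`).

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: kafka-consumer                # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: kafka-consumer              # pods that carry the exporter sidecar
  minReplicas: 1                      # assumed bounds
  maxReplicas: 8
  metrics:
    - type: Pods
      pods:
        metric:
          name: kafka_consumergroup_lag   # assumed name exposed by the adapter
        target:
          type: AverageValue
          averageValue: "1000"        # assumed acceptable lag per pod
```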
Here is a complete guide explaining how to implement Kubernetes HPA using Metrics from Kafka-exporter
Please comment if you have more questions
I've set up Prometheus to collect metrics from my pods and nodes.
I've also set up the Prometheus custom metrics adapter.
How can I use the metrics provided by Prometheus to autoscale my pods? I tried to google it, but I only find examples with custom pods that expose their own metrics on their /metrics URL. I would like to be able to autoscale any of my pods that already have a Prometheus metric, based on CPU or memory usage.
I can visualize all the metrics in Grafana for all my pods and nodes, but I can't find a way to use them for autoscaling.
You need to create an HPA (Horizontal Pod Autoscaler)
More info here
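For plain CPU or memory usage, a sketch of such an HPA is shown below. Note the assumption: `Resource` metrics are served by the resource metrics API (typically metrics-server), not by the Prometheus custom metrics adapter. The Deployment name `my-app`, the replica bounds, and both targets are hypothetical.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app                        # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app                      # hypothetical Deployment to scale
  minReplicas: 1                      # assumed bounds
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization           # percent of the pods' CPU requests
          averageUtilization: 70      # assumed target
    - type: Resource
      resource:
        name: memory
        target:
          type: AverageValue          # absolute memory per pod
          averageValue: 500Mi         # assumed target
```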
This is a good tool showing you how to use an HPA with custom metrics, using either the K8s metrics server or Prometheus.