Horizontal Pod Autoscaling using REST API exposed by the application in container - kubernetes

I am using minikube on Windows, and there is only one node, "master".
The Spring Boot application deployed there exposes a REST endpoint which gives the number of clients it is currently serving. I would like to scale out horizontally, i.e. auto-spin a new pod, when the number of requests reaches some limit.
Let's say:
There is 1 pod in the cluster.
If the request count reaches 50 for Pod 1, spin up a new pod (Pod 2).
If the request count reaches 50 for both Pod 1 and Pod 2, spin up a new pod (Pod 3).
I tried researching how to achieve this, but I was not able to figure it out.
All I could find was scaling out based on CPU usage with the HorizontalPodAutoscaler (HPA).
It would be helpful to receive guidance on how to achieve this using the Kubernetes HPA.

I believe you can start from the autoscaling on custom metrics article. As far as I can see, this is achievable using custom metrics in conjunction with the Prometheus Adapter for Kubernetes Metrics APIs (an implementation of the custom.metrics.k8s.io API using Prometheus).
The Prometheus Adapter for Kubernetes Metrics APIs repo contains an implementation of the Kubernetes resource metrics API and custom metrics API.
This adapter is therefore suitable for use with the autoscaling/v2 Horizontal Pod Autoscaler in Kubernetes 1.6+.
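For the concrete case in the question (scaling on the number of clients each pod is currently serving), a rough sketch of the adapter side could look like the rule below. It assumes the Spring Boot app exposes that count in Prometheus format as a per-pod gauge; the metric name app_active_clients and the label names are illustrative placeholders, not something from the question:
rules:
- seriesQuery: 'app_active_clients{namespace!="",pod!=""}'
  resources:
    overrides:
      namespace: {resource: "namespace"}
      pod: {resource: "pod"}
  name:
    matches: "app_active_clients"
    as: "app_active_clients"
  # sum() keeps one value per pod thanks to the GroupBy clause
  metricsQuery: sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)
Once the adapter serves this series through custom.metrics.k8s.io, it can be referenced from an HPA as a Pods metric (see the full sketch further below).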
Info from autoscaling on custom metrics:
Notice that you can specify other resource metrics besides CPU. By default, the only other supported resource metric is memory. These resources do not change names from cluster to cluster, and should always be available, as long as the metrics.k8s.io API is available.
The first of these alternative metric types is pod metrics. These metrics describe pods, and are averaged together across pods and compared with a target value to determine the replica count. They work much like resource metrics, except that they only support a target type of AverageValue.
Pod metrics are specified using a metric block like this:
type: Pods
pods:
  metric:
    name: packets-per-second
  target:
    type: AverageValue
    averageValue: 1k
The second alternative metric type is object metrics. These metrics describe a different object in the same namespace, instead of describing pods. The metrics are not necessarily fetched from the object; they only describe it. Object metrics support target types of both Value and AverageValue. With Value, the target is compared directly to the returned metric from the API. With AverageValue, the value returned from the custom metrics API is divided by the number of pods before being compared to the target. The following example is the YAML representation of the requests-per-second metric.
type: Object
object:
  metric:
    name: requests-per-second
  describedObject:
    apiVersion: networking.k8s.io/v1beta1
    kind: Ingress
    name: main-route
  target:
    type: Value
    value: 2k
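Putting the Pods variant together with the adapter rule sketched earlier, a minimal HPA for the original question (targeting roughly 50 clients per pod) might look like this. The metric name app_active_clients and the Deployment name myapp are placeholders, and the sketch assumes the Custom Metrics API is already served by the adapter:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: app_active_clients
      target:
        type: AverageValue
        averageValue: "50"
With this, the HPA adds a replica whenever the average number of active clients per pod goes above 50, which matches the "Pod 1 at 50, spin up Pod 2" behaviour described above.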
The following may also be helpful for your future investigations:
Autoscaling on more specific metrics
Autoscaling on metrics not related to Kubernetes objects
Hope it helps

Related

Empty metricLabels map for Kubernetes external metrics

I am currently trying to set up a horizontal pod autoscaler for my application running inside Kubernetes. The HPA relies on external metrics that are fetched from Prometheus by a Prometheus adapter (https://github.com/kubernetes-sigs/prometheus-adapter).
The metrics are fetched by the adapter and made available to the Kubernetes metrics API successfully, but the metricLabels map is empty, making it impossible for the HPA to associate the correct metrics with the correct pod.
Example of a query to the metrics API:
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/<namespace>/batchCommandsActive_totalCount/"
{"kind":"ExternalMetricValueList","apiVersion":"external.metrics.k8s.io/v1beta1","metadata":{},"items":[{"metricName":"batchCommandsActive_totalCount",**"metricLabels":{}**,"timestamp":"2023-02-10T11:38:48Z","value":"0"}]}
Those metrics should have three labels associated with them (hostname, localnode and path) in order for the correct pod to retrieve them.
Here is an extract of the Prometheus adapter configmap that defines the queries made to Prometheus by the Prometheus adapter
- seriesQuery: '{__name__="batchCommandsActive_totalCount",hostname!="",localnode!="",path!=""}'
  metricsQuery: sum(<<.Series>>{<<.LabelMatchers>>}) by (name)
  resources:
    namespaced: false
Thanks for your help!
So far, no answer from StackOverflow or tutorial (e.g. https://github.com/kubernetes-sigs/prometheus-adapter/blob/master/docs/walkthrough.md) has helped with my problem.

Kubernetes: using HPA with metrics from other pods

I have:
deployments of services A and B in k8s
Prometheus stack
I want to scale service A when metric m1 of service B changes.
Solutions I found that are more or less not suitable:
I can define an HPA for service A with the following part of the spec:
- type: Object
  object:
    metric:
      name: m1
    describedObject:
      apiVersion: v1
      kind: Pod
      name: certain-pod-of-service-B
    target:
      type: Value
      value: 10k
Technically, it will work. But it's not suitable for the dynamic nature of k8s.
Also, I can't use a Pods metric (metrics: - type: Pods pods:) in the HPA, because it would request the m1 metric for the pods of service A (which obviously don't have it).
I can define a custom metric in prometheus-adapter which queries the m1 metric from the pods of service B. That's more suitable, but it looks like a workaround, because I already have the metric m1.
The same goes for external metrics.
I feel that I'm missing something, because this doesn't seem like an unrealistic case :)
So, please advise me: how can I scale one service by a metric of another in k8s?
I decided to provide a Community Wiki answer that may help other people facing a similar issue.
The Horizontal Pod Autoscaler is a Kubernetes feature that allows you to scale applications based on one or more monitored metrics.
As we can find in the Horizontal Pod Autoscaler documentation:
The Horizontal Pod Autoscaler automatically scales the number of Pods in a replication controller, deployment, replica set or stateful set based on observed CPU utilization (or, with custom metrics support, on some other application-provided metrics).
There are three groups of metrics that we can use with the Horizontal Pod Autoscaler:
resource metrics: predefined resource usage metrics (CPU and memory) of pods and nodes.
custom metrics: custom metrics associated with a Kubernetes object.
external metrics: custom metrics not associated with a Kubernetes object.
Any HPA target can be scaled based on the resource usage of the pods (or containers) in the scaling target. The CPU utilization metric is a resource metric; you can specify other resource metrics besides CPU (e.g. memory). This seems to be the easiest and most basic method of scaling, but we can use more specific metrics by using custom metrics or external metrics.
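For example, a resource metric block targeting memory instead of CPU could look like this (a minimal sketch in autoscaling/v2 syntax):
metrics:
- type: Resource
  resource:
    name: memory
    target:
      type: Utilization
      averageUtilization: 70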
There is one major difference between custom metrics and external metrics (see: Custom and external metrics for autoscaling workloads):
Custom metrics and external metrics differ from each other:
A custom metric is reported from your application running in Kubernetes.
An external metric is reported from an application or service not running on your cluster, but whose performance impacts your Kubernetes application.
All in all, in my opinion it is okay to use custom metrics in the case above; I did not find any other suitable way to accomplish this task.
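As an illustration of that approach: one way to avoid pinning the HPA of service A to a single pod of service B is to have the adapter attach m1 to service B's Service object and reference it as an Object metric. A rough sketch, assuming the prometheus-adapter is configured to expose m1 on the Service resource (all names below are placeholders):
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: service-a-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: service-a
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Object
    object:
      metric:
        name: m1
      describedObject:
        apiVersion: v1
        kind: Service
        name: service-b
      target:
        type: Value
        value: "10k"
This keeps the target dynamic (the Service survives pod restarts), while the scaling decision is still driven by service B's metric.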

Send kubernetes(GKE) service layer metrics to GCP Load Balancer

I am using GKE and have an application, app1 (pod), which is exposed using a NodePort and then put behind an ingress.
The ingress controller has launched a GCP load balancer. Now, requests coming in on the path /app1/ are routed to my application.
I launched the stackdriver-metrics adapter inside the cluster and then I configured an HPA which uses requests/second metrics from the load balancer. The HPA gets the metrics from an ExternalMetric for a particular backend name.
- external:
    metricName: loadbalancing.googleapis.com|https|request_count
    metricSelector:
      matchLabels:
        resource.labels.backend_target_name: k8s-be-30048--my-backend
    targetAverageValue: 20
  type: External
Everything works perfectly. Here is the problem:
Some of the other apps which are also running inside the Kubernetes cluster are calling this app1 as well. Those other apps call app1 by the Kubernetes FQDN app1.default.svc.cluster.local and not via the load balancer route, which means these requests won't go through the ingress load balancer. That in turn means these requests are not being counted by the HPA in any way.
So the total requests (Ct) coming in are via the load balancer (C1) and via the FQDN (C2): Ct = C1 + C2. My guess is that the HPA will only take C1 into account and not Ct, so my HPA will not scale my app correctly because of the way the metrics are being counted. For example, if Ct is 120 but C1 is 90, then the number of pods will be 3 when it should actually be 4.
Am I wrong to assume that requests coming via the FQDN are not counted by the load balancer?
If those requests are indeed not being counted, I think I will have to use something which counts requests at the pod level, something like a Prometheus middleware. Can you guys suggest anything else?
Answering the following comment:
Yup, that's the obstruction. No way to forecast/relate the kind of traffic. Anyway, how would it help if it could be forecasted?
If it could be forecasted (for example, if it's always 70% external / 30% internal), you could adjust the scaling metric to already include the traffic that the load balancer metric isn't aware of.
Instead of collecting metrics from the load balancer itself which will not take into consideration the internal traffic, you can opt to use Custom Metrics (for example: queries per second).
Your application can report a custom metric to Cloud Monitoring. You can configure Kubernetes to respond to these metrics and scale your workload automatically. For example, you can scale your application based on metrics such as queries per second, writes per second, network performance, latency when communicating with a different application, or other metrics that make sense for your workload.
A custom metric can be selected for any of the following:
A particular node, Pod, or any Kubernetes object of any kind, including a CustomResourceDefinition (CRD).
The average value for a metric reported by all Pods in a Deployment
-- Cloud.google.com: Kubernetes Engine: Custom and external metrics: Custom metrics
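As a rough illustration of that idea, the HPA metrics block could reference such a per-pod metric once it is available through the Custom Metrics API (for example via the Stackdriver custom metrics adapter). The metric name below is a placeholder; the exact name depends on how the adapter exposes it:
metrics:
- type: Pods
  pods:
    metric:
      name: http_requests_per_second
    target:
      type: AverageValue
      averageValue: "20"
Because the metric is reported by the application itself, it counts both external traffic and calls made via the cluster-internal FQDN, which is exactly the gap described in the question.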
There is an official documentation about creating Custom Metrics:
Cloud.google.com: Monitoring: Custom metrics: Creating metrics
You can also look on already available metrics in the Metrics Explorer.
You can also use multiple metrics when scaling up/down with HPA:
If you configure a workload to autoscale based on multiple metrics, HPA evaluates each metric separately and uses the scaling algorithm to determine the new workload scale based on each one. The largest scale is selected for the autoscale action.
-- Cloud.google.com: Kubernetes Engine: HorizontalPodAutoscaler
As more of a workaround solution, you could also use the CPU usage metric.
Additional resources:
Cloud.google.com: Kubernetes Engine: Tutorials: Autoscaling metrics: Custom metrics
Cloud.google.com: Kubernetes Engine: How to: Horizontal pod autoscaling

Kubernetes Autoscaling additional custom metrics

I want to add a custom metric to my existing CPU metric, so I want two metrics.
My second metric has to be a custom metric/external metric which makes a request to a web server and gets a value from there. Is this possible?
At the moment it looks like this, but how do I add a second metric?
metrics:
- type: Resource
  resource:
    name: cpu
    target:
      type: Utilization
      averageUtilization: 80
As I read in the docs, Kubernetes will use the higher metric to scale; that's OK.
Does someone have an example for me how to apply this custom metric in my case?
If it's an external metric (i.e. a custom metric that is not associated with a Kubernetes object):
metrics:
- type: Resource
  resource:
    name: cpu
    target:
      type: Utilization
      averageUtilization: 80
- type: External
  external:
    metric:
      name: your_metric
    target:
      type: Value
      value: "100"
Note that in this case, the HPA will try to query the your_metric metric from the External Metrics API. That means, this API must exist in your cluster for this configuration to work.
If the metric is associated with a Kubernetes object, you would use a type: Object and if the metric is from the pods of the pod controller (e.g. Deployment) that you are trying to autoscale, you would use a type: Pods. In both cases, the HPA would try to fetch the metric from the Custom Metrics API.
Note (because it seems that you are trying to use a metric that is not yet in Kubernetes):
The HPA can only talk to the metric APIs: Resource Metrics API, Custom Metrics API, External Metrics API.
If your metric is not served by one of these APIs, then you have to create a metrics pipeline that brings the metric to one of these APIs.
For example, using Prometheus and the Prometheus Adapter:
Prometheus periodically scrapes the metric from your external web server
Prometheus Adapter exposes the metric through the External Metrics API
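A minimal sketch of such a pipeline, assuming the web server exposes a Prometheus-format endpoint and the metric is called webserver_active_requests (the job name, target address, and metric name are all illustrative):
# prometheus.yml: scrape the external web server
scrape_configs:
- job_name: external-webserver
  metrics_path: /metrics
  static_configs:
  - targets: ['webserver.example.com:9100']

# prometheus-adapter config: expose the scraped series as an external metric
externalRules:
- seriesQuery: 'webserver_active_requests'
  metricsQuery: 'sum(<<.Series>>{<<.LabelMatchers>>})'
  resources:
    namespaced: false
The HPA example above would then reference it with type: External and name: webserver_active_requests instead of your_metric.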
EDIT: explain metric APIs.
For each of the metric APIs below, the components that you need to install to provide the corresponding API are described.
Resource Metrics API
Serves CPU and memory usage metrics of all Pods and Nodes in the cluster. These are predefined metrics (in contrast to the custom metrics of the other two APIs).
The raw data for the metrics is collected by cAdvisor which runs as part of the kubelet on each node. The metrics are exposed by the Metrics Server.
The Metrics Server implements the Resource Metrics API. It is not installed by default in Kubernetes. That means, to enable the Resource Metrics API in your cluster, you have to install the Metrics Server.
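Once the Metrics Server is installed, you can verify that the Resource Metrics API is being served, for example:
# Aggregated pod metrics via kubectl
kubectl top pods
# Or query the API group directly
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/default/pods"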
Custom Metrics API
Serves custom metrics that are associated with Kubernetes objects. The metrics can be anything you want.
You yourself are responsible for collecting the metrics that you want to expose through the Custom Metrics API. You do this by installing a "metrics pipeline" in the cluster.
You can choose the components for your metrics pipeline yourself. The only requirement is that the metrics pipeline is able to:
Collect metrics
Implement the Custom Metrics API
A popular choice for a metrics pipeline is to use Prometheus and the Prometheus Adapter:
Prometheus collects metrics (any metrics you want)
Prometheus Adapter implements the Custom Metrics API and exposes the metrics collected by Prometheus through the Custom Metrics API
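You can check what the pipeline exposes by querying the Custom Metrics API directly, e.g. (the metric name here is illustrative):
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1"
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/http_requests_per_second"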
External Metrics API
Serves custom metrics that are not associated with Kubernetes objects.
The External Metrics API works identically to the Custom Metrics API. The only difference is that it has different API paths (which don't include objects, but only metric names).
To provide the External Metrics API you can, in most cases, use the same metrics pipeline as for the Custom Metrics API (e.g. Prometheus and the Prometheus Adapter).
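For comparison, an external metric is queried with only a namespace and a metric name in the path, just like in the metricLabels question earlier in this thread (your_metric is a placeholder):
kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/your_metric"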

Prometheus is not collecting pod metrics

I deployed Prometheus and Grafana into my cluster.
When I open the dashboards I don't get data for pod CPU usage.
When I check Prometheus UI, it shows pods 0/0 up, however I have many pods running in my cluster.
What could be the reason? I have node exporter running on all of my nodes.
I am getting this for kube-state-metrics:
I0218 14:52:42.595711 1 builder.go:112] Active collectors: configmaps,cronjobs,daemonsets,deployments,endpoints,horizontalpodautoscalers,jobs,limitranges,namespaces,nodes,persistentvolumeclaims,persistentvolumes,poddisruptionbudgets,pods,replicasets,replicationcontrollers,resourcequotas,secrets,services,statefulsets
I0218 14:52:42.595735 1 main.go:208] Starting metrics server: 0.0.0.0:8080
Here is my Prometheus config file:
https://gist.github.com/karthikeayan/41ab3dc4ed0c344bbab89ebcb1d33d16
I'm able to hit and get data for:
http://localhost:8080/api/v1/nodes/<my_worker_node>/proxy/metrics/cadvisor
As mentioned by karthikeayan in the comments:
ok, i found something interesting in the values.yaml comments, prometheus.io/scrape: Only scrape pods that have a value of true, when i remove this relabel_config in k8s configmap, i got the data in prometheus ui.. unfortunately k8s configmap doesn't have comments, i believe helm will remove the comments before deploying it.
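For context, the relabel rule in question is usually the standard one shown below, which keeps only pods that are explicitly annotated. Instead of removing the rule, you can also annotate the pods you want scraped; this is a sketch of both sides based on the common example configuration (the port value is a placeholder):
# prometheus.yml, kubernetes-pods job: keep only annotated pods
relabel_configs:
- source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
  action: keep
  regex: true

# Pod template: opt the pod in to scraping
metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"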
And just for clarification:
kube-state-metrics vs. metrics-server
The metrics-server is a project that has been inspired by Heapster and is implemented to serve the goals of the Kubernetes Monitoring Pipeline. It is a cluster level component which periodically scrapes metrics from all Kubernetes nodes served by Kubelet through Summary API. The metrics are aggregated, stored in memory and served in Metrics API format. The metric-server stores the latest values only and is not responsible for forwarding metrics to third-party destinations.
kube-state-metrics is focused on generating completely new metrics from Kubernetes' object state (e.g. metrics based on deployments, replica sets, etc.). It holds an entire snapshot of Kubernetes state in memory and continuously generates new metrics based off of it. And just like the metric-server, it too is not responsible for exporting its metrics anywhere.
Having kube-state-metrics as a separate project also enables access to these metrics from monitoring systems such as Prometheus.