How to use latency of a service deployed on Kubernetes to Scale the deployment? - kubernetes

I have a simple spring boot application deployed on Kubernetes on GCP. The service is exposed to an external IP address. I am load testing this application using JMeter. It is just a http GET request which returns True or False.
I want to get the latency metrics with time to feed it to HorizontalPodAutoscaler to implement custom auto-scaler. How do I implement this?

Since you mentioned Custom Auto Scaler. I would suggest this simple solution which makes use of some of tools which you already might have.
First Part: Is to Create a service or cron or any time-based trigger which will on a regular interval make requests to your deployed application. This application will then store the resultant metrics to persistence storage or file or Database etc.
For example, if you use a simple Apache Benchmark CLI tool(you can also use Jmeter or any other load testing tool which generates structured o/p), You will get a detailed result for a single query. Use this link to get around the result for your reference.
Second Part Is that this same script can also trigger another event which will check for the latency or response time limit configured as per your requirement. If the response time is above the configured value scale if it is below scale down.
The logic for scaling down can be more trivial, But I will leave that to you.
Now for actually scaling the deployment, you can use the Kubernetes API. You can refer to the official doc or this answer for details. Here's a simple flow diagram.

There are two ways to auto scale with custom metrics:
1.You can export a custom metric from every Pod in the Deployment and target the average value per Pod.
2.You can export a custom metric from a single Pod outside of the Deployment and target the total value.
So follow these-
1. To grant GKE objects access to metrics stored in Stackdriver, you need to deploy the Custom Metrics Stackdriver Adapter. To run Custom Metrics Adapter, you must grant your user the ability to create required authorization roles by running the following command:
kubectl create clusterrolebinding cluster-admin-binding \
--clusterrole cluster-admin --user "$(gcloud config get-value account)"
To deploy adapter-
kubectl create -f https://raw.githubusercontent.com/GoogleCloudPlatform/k8s-stackdriver/master/custom-metrics-stackdriver-adapter/deploy/production/adapter.yaml
You can export your metrics to Stackdriver either directly from your application, or by exposing them in Prometheus format and adding the Prometheus-to-Stackdriver adapter to your Pod's containers.
You can view the exported metrics from the Metrics Explorer by searching for custom/[METRIC_NAME]
Your metric needs to meet the following requirements:
Metric kind must be GAUGE
Metric type can be either DOUBLE or INT64
Metric name must start with custom.googleapis.com/ prefix, followed by a simple name
Resource type must be "gke_container"
Resource labels must include:
pod_id set to Pod UID, which can be obtained via the Downward API
container_name = ""
project_id, zone, cluster_name, which can be obtained by your application from the metadata server. To get values, you can use Google Cloud's compute metadata client.
namespace_id, instance_id, which can be set to any value.
3.Once you have exported metrics to Stackdriver, you can deploy a HPA to scale your Deployment based on the metrics.
Vie this on GitHub for additional codes

Related

How to expose cluster+project values to container in GKE (or current-context in k8s)

My container code needs to know in which environment it is running on GKE, more specifically what cluster and project. In standard kubernetes, this could be retrieved from current-context value (gke_<project>_<cluster>).
Kubernetes has a downward api that can push pod info to containers - see https://kubernetes.io/docs/tasks/inject-data-application/downward-api-volume-expose-pod-information/ - but unfortunately nothing from "higher" entities.
Any thoughts on how this can be achieved ?
Obviously I do not want to explicit push any info at deployment (e.g. as env in the configMap). I rather deploy using a generic/common yaml and have the code at runtime retrieve the info from env or file and branch accordingly.
You can query the GKE metadata server from within your code. In your case, you'd want to query the /computeMetadata/v1/instance/attributes/cluster-name and /computeMetadata/v1/project/project-id endpoints to get the cluster and project. The client libraries for each supported language all have simple wrappers for accessing the metadata API as well.

HorizontalPodAutoscaler: Is the published documentation for deploying the custom metrics stackdriver adapter in GKE complete and correct?

Google publishes a tutorial for using custom metrics to drive the HorizontalPodAutoscaler here, and this tutorial contains instructions for:
Using a Kubernetes manifest to deploy the custom metrics adapter into a custom-metrics namespace.
Deploying a dummy application to generate metrics.
Configuring the HPA to use custom metrics.
We are deploying into a default cluster without any special VPC rules, and we have roughly followed the tutorial's guidance, with a few exceptions:
We're using Helm v2, and rather than grant cluster admin role to Tiller, we have granted all of the necessary cluster roles and role bindings to allow the custom-metrics-adapter-deploying Kubernetes manifest to work. We see no issues there; at least the custom metrics adapter spins up and runs.
We have defined some custom metrics that are based upon data extracted from a jsonPayload in Stackdriver logs.
We have deployed a minute-by-minute CronJob that reads the above metrics and publishes a derived metric, which is the value we want to use to drive the autoscaler. The CronJob is working, and we can see the metric in the derived metric, on a per-Pod basis, in the log metric explorer:
We're configuring the HPA to scale based on the average of the derived metric across all of the pods belonging to a stateful set (The HPA has a metrics entry with type Pods). However, the HPA is unable to read our derived metric. We see this error message:
failed to get object metric value: unable to get metric xxx_scaling_metric: no metrics returned from custom metrics API
Update
We were seeing DNS errors, but these were apparently false alarms, perhaps in the log while the cluster was spinning up.
We restarted the Stackdriver metrics adapter with the command line option --v=5 to get some more verbose debugging. We see log entries like these:
I0123 20:23:08.069406 1 wrap.go:47] GET /apis/custom.metrics.k8s.io/v1beta1/namespaces/defaults/pods/%2A/xxx_scaling_metric: (56.16652ms) 200 [kubectl/v1.13.11 (darwin/amd64) kubernetes/2e298c7 10.44.1.1:36286]
I0123 20:23:12.997569 1 translator.go:570] Metric 'xxx_scaling_metric' not found for pod 'xxx-0'
I0123 20:23:12.997775 1 wrap.go:47] GET /apis/custom.metrics.k8s.io/v1beta2/namespaces/default/pods/%2A/xxx_scaling_metric?labelSelector=app%3Dxxx: (98.101205ms) 200 [kube-controller-manager/v1.13.11 (linux/amd64) kubernetes/56d8986/system:serviceaccount:kube-system:horizontal-pod-autoscaler 10.44.1.1:36286]
So it looks to us as if the HPA is making the right query for pods-based custom metrics. If we ask the custom metrics API what data it has, and filter with jq to our metric of interest, we see:
{"kind":"MetricValueList",
"apiVersion":"custom.metrics.k8s.io/v1beta1",
"metadata: {"selfLink":"/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/%2A/xxx_scaling_metric"},
"items":[]}
That the items array is empty is troubling. Again, we can see data in the metrics explorer, so we're left to wonder if our CronJob app that publishes our scaling metric is supplying the right fields in order for the data to be saved in Stackdriver or exposed through the metrics adapter.
For what it's worth the resource.labels map for the time series that we're publishing in our CronJob looks like:
{'cluster_name': 'test-gke',
'zone': 'us-central1-f',
'project_id': 'my-project-1234',
'container_name': '',
'instance_id': '1234567890123456789',
'pod_id': 'xxx-0',
'namespace_id': 'default'}
We finally solved this. Our CronJob that's publishing the derived metric we want to use is getting its raw data from two other metrics that are extracted from Stackdriver logs, and calculating a new value that it publishes back to Stackdriver.
We were using the resource labels that we saw from those metrics when publishing our derived metric. The POD_ID resource label value in the "input" Stackdriver metrics we are reading is the name of the pod. However, the stackdriver custom metrics adapter at gcr.io/google-containers/custom-metrics-stackdriver-adapter:v0.10.0 is enumerating pods in a namespace and asking stackdriver for data associated with pods' UIDs, not their names. (Read the adapter's source code to figure this out...)
So our CronJob now builds a map of pod names to pod UIDs (which requires it to have RBAC pod list and get roles), and publishes the derived metric we use for HPA with the POD_ID set to the pod's UID instead of its name.
The reason that published examples of custom metrics for HPA (like this) work is that they use the Downward API to get a pod's UID, and provide that value as "POD_ID". In retrospect, that should have been obvious, if we had looked at how the "dummy" metrics exporters got their pod id values, but there are certainly examples (as in Stackdriver logging metrics) where POD_ID ends up being a name and not a UID.

What is the difference between the core os projects kube-prometheus and prometheus operator?

The github repo of Prometheus Operator https://github.com/coreos/prometheus-operator/ project says that
The Prometheus Operator makes the Prometheus configuration Kubernetes native and manages and operates Prometheus and Alertmanager clusters. It is a piece of the puzzle regarding full end-to-end monitoring.
kube-prometheus combines the Prometheus Operator with a collection of manifests to help getting started with monitoring Kubernetes itself and applications running on top of it.
Can someone elaborate this?
I've always had this exact same question/repeatedly bumped into both, but tbh reading the above answer didn't clarify it for me/I needed a short explanation. I found this github issue that just made it crystal clear to me.
https://github.com/coreos/prometheus-operator/issues/2619
Quoting nicgirault of GitHub:
At last I realized that prometheus-operator chart was packaging
kube-prometheus stack but it took me around 10 hours playing around to
realize this.
**Here's my summarized explanation:
"kube-prometheus" and "Prometheus Operator Helm Chart" both do the same thing:
Basically the Ingress/Ingress Controller Concept, applied to Metrics/Prometheus Operator.
Both are a means of easily configuring, installing, and managing a huge distributed application (Kubernetes Prometheus Stack) on Kubernetes:**
What is the Entire Kube Prometheus Stack you ask? Prometheus, Grafana, AlertManager, CRDs (Custom Resource Definitions), Prometheus Operator(software bot app), IaC Alert Rules, IaC Grafana Dashboards, IaC ServiceMonitor CRDs (which auto-generate Prometheus Metric Collection Configuration and auto hot import it into Prometheus Server)
(Also when I say easily configuring I mean 1,000-10,000++ lines of easy for humans to understand config that generates and auto manage 10,000-100,000 lines of machine config + stuff with sensible defaults + monitoring configuration self-service, distributed configuration sharding with an operator/controller to combine config + generate verbose boilerplate machine-readable config from nice human-readable config.
If they achieve the same end goal, you might ask what's the difference between them?
https://github.com/coreos/kube-prometheus
https://github.com/helm/charts/tree/master/stable/prometheus-operator
Basically, CoreOS's kube-prometheus deploys the Prometheus Stack using Ksonnet.
Prometheus Operator Helm Chart wraps kube-prometheus / achieves the same end result but with Helm.
So which one to use?
Doesn't matter + they achieve the same end result + shouldn't be crazy difficult to start with 1 and switch to the other.
Helm tends to be faster to learn/develop basic mastery of.
Ksonnet is harder to learn/develop basic mastery of, but:
it's more idempotent (better for CICD automation) (but it's only a difference of 99% idempotent vs 99.99% idempotent.)
has built-in templating which means that if you have multiple clusters you need to manage / that you want to always keep consistent with each other. Then you can leverage ksonnet's templating to manage multiple instances of the Kube Prometheus Stack (for multiple envs) using a DRY code base with lots of code reuse. (If you only have a few envs and Prometheus doesn't need to change often it's not completely unreasonable to keep 4 helm values files in sync by hand. I've also seen Jinja2 templating used to template out helm values files, but if you're going to bother with that you may as well just consider ksonnet.)
Kubernetes operator are kubernetes specific application(pods) that configure, manage and optimize other Kubernetes deployments automatically. They are implemented as a custom controller.
According to official coreOS website:
Operators were introduced by CoreOS as a class of software that operates other software, putting operational knowledge collected by humans into software.
The prometheus operator provides the easy way to deploy configure and monitor your prometheus instances on kubernetes cluster. To do so, prometheus operator introduces three types of custom resource definition(CRD) in kubernetes.
Prometheus
Alertmanager
ServiceMonitor
Now, with the help of above CRD's, you can directly create a prometheus instance by providing kind: Prometheus and the prometheus instance is ready to serve, likewise you can do for AlertManager. Without this you would have to setup the deployment for prometheus with its image, configuration and many more things.
The Prometheus Operator serves to make running Prometheus on top of Kubernetes as easy as possible, while preserving Kubernetes-native configuration options.
Now, kube-prometheus implemented the prometheus operator and provides you minimum yaml files to create your basic setup of prometheus, alertmanager and grafana by running a single command.
git clone https://github.com/coreos/prometheus-operator.git
kubectl apply -f prometheus-operator/contrib/kube-prometheus/manifests/
By running above command in kube-prometheus directory, you will get a monitoring namespace which will have an instance of alertmanager, prometheus and grafana for UI. This is enough setup for most of the basic implementation and if you need any more specifics according to your application, you can add more yamls of exporter you need.
Kube-prometheus is more of a contribution to prometheus-operator project, which implements the prometheus operator functionality very well and provide you a complete monitoring setup for your kubernetes cluster. You can start with kube-prometheus and extend the functionality of your monitoring setup according to your application from there.
You can learn more about prometheus-operator here
As of today, 28-09-2020, this is the way to install Prometheus in a Kubernetes cluster
https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack#kube-prometheus-stack
According to official documentation, kube-prometheus-stack is a rename of prometheus-operator.
As I understood, kube-prometheus-stack also has preinstalled grafana dashboards and prometheus rules.
Note: This chart was formerly named prometheus-operator chart, now
renamed to more clearly reflect that it installs the kube-prometheus
project stack, within which Prometheus Operator is only one component.
Taken from https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack
Architecturally the container runs docker
The default container logs are managed by Docker, and the default log driver uses JSON-file
log-driver": "json-file
https://docs.docker.com/config/containers/logging/configure/
If the default jSON-file is used to manage container logs, log rotation is not performed by default. Therefore, the default JSON-file log driver the log files stored by the log driver can result in a large amount of disk space for containers that generate a large amount of output, which can cause disk space to run out.
In this case, save the log to ES, store it separately, and periodically delete the index using curator kubernetes
And run a scheduled task in K8S to delete the index periodically
Another solution for disk space is to periodically delete old logs from jSON-files
Typically we set the size and number of logs
This will set up a maximum of 10 log files, each with a maximum size of 20 Mb. Therefore, the container has a maximum of 200 Mb of logs
"log-driver": "json-file", "log-opts": { "max-size": "20m", "max-file": "10" },
Note: In general, the default Docker log is placed
/var/lib/docker/containers/
But in the same case kubernetes also saves logs and creates a directory structure to help you find pods-based logs, so you can find container logs for each Pod running on a node
/var/log/pods/<namespace>_<pod_name>_<pod_id>/<container_name>/
When removing pod, / var/lib/container under the docker/containers/log and k8s created under/var/log/pods/pod log will be deleted
For example, if the POD is restarted during production, the pod log will be deleted whether it is on the original node or jumped to another node
Therefore, this log needs to be saved in ES for centralized management. Many R&D projects will check the log for troubleshooting in most cases

Fetching Stackdriver Monitoring TimeSeries data for a pod running on a k8s cluster on GKE using the REST API

My objective is to fetch the time series of a metric for a pod running on a kubernetes cluster on GKE using the Stackdriver TimeSeries REST API.
I have ensured that Stackdriver monitoring and logging are enabled on the kubernetes cluster.
Currently, I am able to fetch the time series of all the resources available in a cluster using the following filter:
metric.type="container.googleapis.com/container/cpu/usage_time" AND resource.labels.cluster_name="<MY_CLUSTER_NAME>"
In order to fetch the time series of a given pod id, I am using the following filter:
metric.type="container.googleapis.com/container/cpu/usage_time" AND resource.labels.cluster_name="<MY_CLUSTER_NAME>" AND resource.labels.pod_id="<POD_ID>"
This filter returns an HTTP 200 OK with an empty response body. I have found the pod ID from the metadata.uid field received in the response of the following kubectl command:
kubectl get deploy -n default <SERVICE_NAME> -o yaml
However, when I use the Pod ID of a background container spawned by GKE/Stackdriver, I do get the time series values.
Since I am able to see Stackdriver metrics of my pod on the GKE UI, I believe I should also get the metric values using the REST API.
My doubts/questions are:
Am I fetching the Pod ID of my pod correctly using kubectl?
Could there be some issue with my cluster setup/service deployment due to which I'm unable to fetch the metrics?
Is there some other way in which I can get the time series of my pod using the REST APIs?
I wouldn't rely on kubectl get deploy for pod ids. I would get them with something like kubectl -n default get pods | grep <prefix-for-your-pod> | awk '{print $1}'
I don't think so, but the best way to find out is opening a support ticket with GCP if you have any doubts.
Not that I'm aware of, Stackdriver is the monitoring solution in GCP. Again, you can check with GCP support. There are other tools that you can use to get metrics from Kubernetes like Prometheus. There are multiple guides on the web on how to set it up with Grafana on k8s. This is one for example.
Hope it helps!
Am I fetching the Pod ID of my pod correctly using kubectl?
You could use JSONpath as output with kubectl, in this case iterating over the Pods and fetching the metadata.name and metadata.uid fields:
kubectl get pods -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.uid}{"\n"}{end}'
which will output something like this:
nginx-65899c769f-2j775 d4fr5t6-bc2f-11e8-81e8-42010a84011f
nginx2-77b5c9d48c-7qlps 4f5gh6r-bc37-11e8-81e8-42010a84011f
Could there be some issue with my cluster setup/service deployment due to which I'm unable to fetch the metrics?
As #Rico mentioned in his answer, contacting the GCP support could be a way forward if you don't get further with the troubleshooting, see below.
Is there some other way in which I can get the time series of my pod using the REST APIs?
You could use the APIs Explorer or the Metrics Explorer from within the Stackdriver portal. There's some good troubleshooting tips here with a link to the APIs Explorer. In the Stackdriver Metrics Explorer it's fairly easy to reassemble the filter you've used using dropdown lists to choose e.g. a particular pod_id.
Taken from the Troubleshooting the Monitoring guide (linked above) regarding an empty HTTP 200 response on filtered queries:
If your API call returns status code 200 and an empty response, there
are several possibilities:
If your call uses a filter, then the filter might not have matched anything. The filter match is case-sensitive. To resolve filter
problems, start by specifying only one filter component, such as
metric.type, and see if you get results. Add the other filter
components one-by-one.
If you are working with a custom metric, you might not have specified the project where your custom metric is defined.*
I found this link when reading through the documentation of the Monitoring API. That link will get you to the APIs Explorer with some pre-filled fields, change these accordingly and add your own filter.
I have not tested more using the REST API at the moment but hopefully this could get you forward.

How to check a specific kubernetes pod is up or not in Grafana metrics

I am creating a grafana dashboard to show metrices.
Is there any keyword which i can use in my query to check my service is running or not.
I am using prometheus to retrive data from my api for the metrics creation.
You can create a query like so
count_scalar(container_last_seen{name=<container_name>})
That will give you the count of how many containers are running with that name