We have components which use the Go library to write status to prometheus,
we are able to see the data in Prometheus UI,
we have components outside the K8S cluster which need to pull the data from
Prometheus , how can I expose this metrics? is there any components which I should use ?
You may want to check the Federation section of the Prometheus documents.
Federation allows a Prometheus server to scrape selected time series
from another Prometheus server. Commonly, it is used to either achieve scalable Prometheus monitoring setups or to pull related metrics from one service's Prometheus into another.
It would require to expose Prometheus service out of the cluster with Ingress or nodePort and configure the Center Prometheus to scrape metrics from the exposed service endpoint. You will have set also some proper authentication. Here`s an example of it.
Second way that comes to my mind is to use Kube-state-metrics
kube-state-metrics is a simple service that listens to the Kubernetes
API server and generates metrics about the state of the objects.
Metrics are exported on the HTTP endpoint and designed to be consumed either by Prometheus itself or by scraper that is compatible with Prometheus client endpoints. However this differ from the Metrics Server and generate metrics about the state of Kubernetes objects: node status, node capacity, number of desired replicas, pod status etc.
Related
I'm looking to monitor the number of active Endpoints in NodePort and ClusterIP service. There are several cases when my pods restart or get destroyed. So its important to know if atleast one Endpoint is there to serve the incoming request.
Is there some metric for it in cAdvisor that I can expose via Prometheus? If not is there some way to track this?
I have no idea about cadvicer for monitoring endpoints but by using prometheus blackbox we can achieve this.
The Prometheus Blackbox exporter allows endpoints exploration over several protocols, such as HTTP(S), DNS, TCP, and ICMP. This exporter generates multiple metrics on your configured targets, like general endpoint status, response time, redirect information,Detecting endpoint failures, or certificate expiration dates.
Refer to this link for installation and to configure the setup.
I am using GKE and have an application-app1(pod) which is exposed using NodePort and then put behind an ingress.
The ingress-controller has launched a GCP load balancer. Now, the requests coming on path /app1/ are routed to my application.
I launched the stackdriver-metrics adapter inside the cluster and then I configured an HPA which uses requests/second metrics from the load balancer. HPA gets the metrics from ExternalMetric for a particular backend name.
- external:
metricName: loadbalancing.googleapis.com|https|request_count
metricSelector:
matchLabels:
resource.labels.backend_target_name: k8s-be-30048--my-backend
targetAverageValue: 20
type: External
Everything works perfectly. Here is the problem,
Some of the other apps which are also running inside the kubernetes cluster are also calling this app1. Those other apps inside the cluster are calling the app1 by the kubernetes FQDN app1.default.svc.cluster.local and not via the load balancer route. That means these requests won't go throught the ingress loadbalancer. That will mean that these requests are not being counted by the HPA in any way.
So, that menans the total requests(Ct) coming are via LoadBalancer(C1) and via FQDN(C2), Ct = C1 + C2. My guess is that hpa will only take C1 into account and not Ct. My hpa will not scale my app accordingly because of the way metrics are being counted here. For example, if Ct is 120 but C1 is 90 then number of pods will be 3 but it should acutally be 4.
Am I wrong here to consider that requests coming via FQDN are not counted by the load balancer?
If the requests are being counted I think I will have to use something which counts requests on the pod level. Something like a prometheus middleware. Can you guys suggest anything else?
Answering following comment:
Yup, that's the obstruction. No way to forecast/relate the kind of traffic. Anyway, how would it help if it could be forecasted?
If it could be forecasted (for example it's always 70%(external)/30%(internal) you could adjust the scaling metric to already include the traffic that the loadbalancer metric isn't aware of.
Instead of collecting metrics from the load balancer itself which will not take into consideration the internal traffic, you can opt to use Custom Metrics (for example: queries per second).
Your application can report a custom metric to Cloud Monitoring. You can configure Kubernetes to respond to these metrics and scale your workload automatically. For example, you can scale your application based on metrics such as queries per second, writes per second, network performance, latency when communicating with a different application, or other metrics that make sense for your workload.
A custom metric can be selected for any of the following:
A particular node, Pod, or any Kubernetes object of any kind, including a CustomResourceDefinition (CRD).
The average value for a metric reported by all Pods in a Deployment
-- Cloud.google.com: Kubernetes Engine: Custom and external metrics: Custom metrics
There is an official documentation about creating Custom Metrics:
Cloud.google.com: Monitoring: Custom metics: Creating metrics
You can also look on already available metrics in the Metrics Explorer.
You can also use multiple metrics when scaling up/down with HPA:
If you configure a workload to autoscale based on multiple metrics, HPA evaluates each metric separately and uses the scaling algorithm to determine the new workload scale based on each one. The largest scale is selected for the autoscale action.
-- Cloud.google.com: Kubernetes Engine: HorizontalPodAutoscaler
As for more of a workaround solution you could also use the CPU usage metric.
Additional resources:
Cloud.google.com: Kubernetes Engine: Tutorials: Autoscaling metrics: Custom metrics
Cloud.google.com: Kubernetes Engine: How to: Horizontal pod autoscaling
Having a cluster running on VMs on our private cloud and using MetalLB as ingress-controller we need to see the network traffic and HTTP codes returned from our applications to see in Grafana HTTP requests and traffic load the way you see it on AWS Load Balancers for example.
We have deployed Prometheus through the Helm deployment in all nodes so we can gather metrics from all the cluster but didn't find any metric containing the needed information. Tried looking the metrics in Prometheus about ingresses, proxy, http but there is nothing matching our need. Also tried some Grafana dashboards from the repository but nothing shows the metrics.
Thanks.
I refer this doc.
I want to send data from my device and visualize it on grafana so, how to connect prometheus(deployed as a cluster in gcp) to GCP pubsub.
Prometheus is pull-based rather than push-based. So, whatever the metrics source is, it must expose the metrics in Prometheus format, and Prometheus will periodically query them with HTTP request.
If directly exposing the metrics is not possible, the metrics source can push the metrics to some intermediate component which exposes the metrics in Prometheus format so that Prometheus can query them.
It seems this is the approach taken by the document you're referring to. The metrics are submitted from the source via PubSub to a Metrics Telemetry Converter pod running in the Kubernetes cluster, which exposes them in Prometheus format.
You then have to configure Prometheus to scrape the metrics from this pod, as you would configure it for any other job.
I deployed Prometheus and Grafana into my cluster.
When I open the dashboards I don't get data for pod CPU usage.
When I check Prometheus UI, it shows pods 0/0 up, however I have many pods running in my cluster.
What could be the reason? I have node exporter running in all of nodes.
Am getting this for kube-state-metrics,
I0218 14:52:42.595711 1 builder.go:112] Active collectors: configmaps,cronjobs,daemonsets,deployments,endpoints,horizontalpodautoscalers,jobs,limitranges,namespaces,nodes,persistentvolumeclaims,persistentvolumes,poddisruptionbudgets,pods,replicasets,replicationcontrollers,resourcequotas,secrets,services,statefulsets
I0218 14:52:42.595735 1 main.go:208] Starting metrics server: 0.0.0.0:8080
Here is my Prometheus config file:
https://gist.github.com/karthikeayan/41ab3dc4ed0c344bbab89ebcb1d33d16
I'm able to hit and get data for:
http://localhost:8080/api/v1/nodes/<my_worker_node>/proxy/metrics/cadvisor
As it was mentioned by karthikeayan in comments:
ok, i found something interesting in the values.yaml comments, prometheus.io/scrape: Only scrape pods that have a value of true, when i remove this relabel_config in k8s configmap, i got the data in prometheus ui.. unfortunately k8s configmap doesn't have comments, i believe helm will remove the comments before deploying it.
And just for clarification:
kube-state-metrics vs. metrics-server
The metrics-server is a project that has been inspired by Heapster and is implemented to serve the goals of the Kubernetes Monitoring Pipeline. It is a cluster level component which periodically scrapes metrics from all Kubernetes nodes served by Kubelet through Summary API. The metrics are aggregated, stored in memory and served in Metrics API format. The metric-server stores the latest values only and is not responsible for forwarding metrics to third-party destinations.
kube-state-metrics is focused on generating completely new metrics from Kubernetes' object state (e.g. metrics based on deployments, replica sets, etc.). It holds an entire snapshot of Kubernetes state in memory and continuously generates new metrics based off of it. And just like the metric-server it too is not responsibile for exporting its metrics anywhere.
Having kube-state-metrics as a separate project also enables access to these metrics from monitoring systems such as Prometheus.