RabbitMQ and Prometheus - cannot get prometheus to scrape from targets - kubernetes

I'm trying to set up monitoring of RabbitMQ using Prometheus in a Kubernetes cluster.
I've been following the guide on setting this up with both the Prometheus and RabbitMQ operators for Kubernetes. However, when I deploy the PodMonitor and ServiceMonitors to the cluster, Prometheus doesn't scrape the metrics as expected.
I've correctly set the metadata.labels.release property in the YAML for these two resources, and can see them listed in Prometheus' Status -> Service Discovery UI, but the 'active targets' always reports 0.
My current suspicion is that no prometheus or prometheus-tls port is declared on the RabbitMQ Service in the cluster, and that is the port the ServiceMonitor expects to scrape from. Presumably, declaring this port on the Service is controlled by the RabbitMQ cluster operator. The documentation doesn't mention any additional steps to set up these ports, so I'm not sure whether I am understanding the problem correctly.
Update:
I have confirmed my suspicions by manually configuring the prometheus port by copying the additionalPorts example from the operator's GitHub repo, but it's still not clear to me whether this is expected.
Now the first ServiceMonitor correctly reports 2/2 targets are up. A second ServiceMonitor is also configured by following the guide, but it still doesn't work. It is trying to scrape metrics from a /metrics/detailed endpoint and getting a 404 error.
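The suspicion above rests on the fact that a ServiceMonitor endpoint refers to a Service port by name. As a rough sketch of that matching (not the operator's code), the Python below checks which endpoint port names have no counterpart on a Service. The dicts are hypothetical stand-ins for the real manifests, and the port numbers are only the usual RabbitMQ defaults.

```python
# Sketch only: models how a ServiceMonitor endpoint picks a Service port by name.
# Both dicts are hypothetical stand-ins for the real manifests.

def unmatched_endpoints(service, service_monitor):
    """Return endpoint port names that no named Service port satisfies."""
    declared = {p.get("name") for p in service["spec"]["ports"]}
    wanted = [e["port"] for e in service_monitor["spec"]["endpoints"] if "port" in e]
    return [name for name in wanted if name not in declared]


# Hypothetical Service as it might look before the extra port is added.
service = {
    "spec": {
        "ports": [
            {"name": "amqp", "port": 5672},
            {"name": "management", "port": 15672},
        ]
    }
}

# Hypothetical ServiceMonitor that expects a port named "prometheus".
service_monitor = {
    "spec": {"endpoints": [{"port": "prometheus", "path": "/metrics"}]}
}

# The endpoint has nothing to attach to, so no target can be built from it.
print(unmatched_endpoints(service, service_monitor))

# Once the port is declared on the Service, nothing is left unmatched.
service["spec"]["ports"].append({"name": "prometheus", "port": 15692})
print(unmatched_endpoints(service, service_monitor))
```

A non-empty result would be consistent with the monitor showing up under Service Discovery while the active targets stay at 0.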

Related

Installed prometheus-community / helm-charts but I can't get metrics on "default" namespace

I recently learned about helm and how easy it is to deploy the whole prometheus stack for monitoring a Kubernetes cluster, so I decided to try it out on a staging cluster at my work.
I started by creating a dedicated namespace on the cluster for monitoring with:
kubectl create namespace monitoring
Then, with helm, I added the prometheus-community repo with:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
Next, I installed the chart with a prometheus release name:
helm install prometheus prometheus-community/kube-prometheus-stack -n monitoring
At this time I didn't pass any custom configuration because I'm still trying it out.
After the install is finished, it all looks good. I can access the prometheus dashboard with:
kubectl port-forward prometheus-prometheus-kube-prometheus-prometheus-0 9090 -n monitoring
There, I see a bunch of pre-defined alerts and rules that are already active. The problem is that I don't quite understand how to create new rules to check the pods in the default namespace, where I actually have my services deployed.
I am looking at http://localhost:9090/graph to play around with the queries and I can't seem to use any that will give me metrics on my pods in the default namespace.
I am a bit overwhelmed with the amount of information so I would like to know what did I miss or what am I doing wrong here?
The Prometheus Operator includes several Custom Resource Definitions (CRDs), including ServiceMonitor (and PodMonitor). ServiceMonitors tell the Operator which services should be monitored.
I'm familiar with the Operator, although not with the Helm deployment, but I suspect you'll want to create ServiceMonitors so that Prometheus scrapes metrics from your apps in any namespace (including default).
See: https://github.com/prometheus-operator/prometheus-operator#customresourcedefinitions
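As a minimal sketch of what such a ServiceMonitor could look like, the Python below builds the manifest as JSON, which kubectl accepts as well as YAML. The name my-app, the app label, the port name metrics and the release: prometheus label are assumptions for illustration. In particular, check which labels your Prometheus resource actually selects ServiceMonitors by before relying on the release label.

```python
import json

# Sketch only: builds a ServiceMonitor manifest for an app in another namespace.
# "my-app", the "metrics" port name and the "release" label value are
# illustrative assumptions, not values taken from the question.

def service_monitor(name, app_label, namespace="default", release="prometheus"):
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "ServiceMonitor",
        "metadata": {
            "name": name,
            # The monitor itself lives next to Prometheus.
            "namespace": "monitoring",
            # Label the Prometheus resource is assumed to select monitors by.
            "labels": {"release": release},
        },
        "spec": {
            # Look for matching Services in the application namespace.
            "namespaceSelector": {"matchNames": [namespace]},
            # Services are picked by their labels.
            "selector": {"matchLabels": {"app": app_label}},
            # "port" is the NAME of a port on the Service, not its number.
            "endpoints": [{"port": "metrics", "interval": "30s"}],
        },
    }


manifest = service_monitor("my-app", "my-app")
print(json.dumps(manifest, indent=2))
```

The printed JSON could be saved to a file and applied with kubectl apply -f. The application still has to expose a metrics endpoint on the named Service port for anything to be scraped.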
ServiceMonitors and PodMonitors are CRDs for the Prometheus Operator. When working directly with the Prometheus Helm chart (without the operator), you have to configure your targets directly in values.yaml by editing the scrape_configs section.
It is more complex to do it, so take a deep breath and start by reading this: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config

Is the prometheus-to-sd required for GKE? Can I delete it?

A while back a GKE cluster got created which came with a daemonset of:
kubectl get daemonsets --all-namespaces
...
kube-system prometheus-to-sd 6 6 6 3 6 beta.kubernetes.io/os=linux 355d
Can I delete this daemonset without issue?
What is it being used for?
What functionality would I be losing without it?
TL;DR
Even if you delete it, it will be back.
A little bit more explanation
Citing the explanation by user @Yasen of what prometheus-to-sd is:
prometheus-to-sd is a simple component that can scrape metrics stored in prometheus text format from one or multiple components and push them to the Stackdriver. Main requirement: k8s cluster should run on GCE or GKE.
Github.com: Prometheus-to-sd
Assuming that the command deleting this daemonset will be:
$ kubectl delete daemonset prometheus-to-sd --namespace=kube-system
Executing this command will indeed delete the daemonset but it will be back after a while.
The prometheus-to-sd daemonset is managed by the Addon-Manager, which will recreate a deleted daemonset and restore it to its original state.
Below is the part of the prometheus-to-sd daemonset YAML definition which states that this daemonset is managed by addonmanager:
labels:
  addonmanager.kubernetes.io/mode: Reconcile
You can read more about it by following: Github.com: Kubernetes: addon-manager
Whether this daemonset can be removed depends on the monitoring/logging solution you are using with your GKE cluster. There are 2 options:
Stackdriver logging/monitoring
Legacy logging/monitoring
Stackdriver logging/monitoring
You need to completely disable logging and monitoring of your GKE cluster to delete this daemonset.
You can do it by following a path:
GCP -> Kubernetes Engine -> Cluster -> Edit -> Kubernetes Engine Monitoring -> Set to disabled.
Legacy logging/monitoring
If you are using a legacy solution which is available to GKE version 1.14, you need to disable the option of Legacy Stackdriver Monitoring by following the same path as above.
Let me know if you have any questions in that.
TL;DR - it's ok
Given your context, I suppose it's ok to shut down the prometheus component of your cluster,
except in cases where reports, alerts and monitoring are critical parts of your system.
Let's dive into the GCP sources
As per source code at GoogleCloudPlatform:
prometheus-to-sd is a simple component that can scrape metrics stored in prometheus text format from one or multiple components and push them to the Stackdriver. Main requirement: k8s cluster should run on GCE or GKE.
Prometheus
From their Prometheus Github Page:
The Prometheus monitoring system and time series database.
To get a picture of what it is for, you can read this awesome guide on Prometheus: Prometheus Monitoring : The Definitive Guide in 2019 – devconnected
Also, there are hundreds of videos on their Youtube channel Prometheus Monitoring
Your questions
So, to answer your questions:
Can I delete this daemonset without issue?
It depends. As I said, you can. Except cases when reports, alerts and monitoring - are critical parts of your system.
What is it being used for
It's a TSDB for monitoring
What functionality would I be losing without it?
metrics
→ therefore dashboards
→ therefore alerting

HTTP codes monitoring for Kubernetes cluster using MetalLB ingress controller

We have a cluster running on VMs in our private cloud, with MetalLB as the ingress controller. We need to see the network traffic and the HTTP codes returned by our applications, so that Grafana can show HTTP requests and traffic load the way AWS Load Balancers do, for example.
We deployed Prometheus through the Helm chart on all nodes so that we can gather metrics from the whole cluster, but we didn't find any metric containing the needed information. We tried searching the Prometheus metrics for ingresses, proxy and http, but nothing matches our need. We also tried some Grafana dashboards from the repository, but none of them shows these metrics.
Thanks.

HPA using Kafka Exporter in on premise Kubernetes cluster

I have been trying to implement a Kubernetes HPA using metrics from kafka-exporter. The HPA supports Prometheus, so we tried writing the metrics to a Prometheus instance. From there, we are unclear on the next steps. Is there an article that explains them in detail?
I followed https://medium.com/google-cloud/kubernetes-hpa-autoscaling-with-kafka-metrics-88a671497f07 for the same setup in GCP, where we used Stackdriver, and the implementation worked like a charm. But we are struggling with the on-premise setup, where Stackdriver needs to be replaced by Prometheus.
To scale on custom metrics, Kubernetes needs an API it can query for those metrics, and that API has to implement the custom metrics interface.
So for Prometheus, you need to set up an API that exposes Prometheus metrics through the custom metrics API. Luckily, there already is an adapter.
When I implemented Kubernetes HPA using Metrics from Kafka-exporter I had a few setbacks which I solved doing the following:
I deployed the kafka-exporter container as a sidecar to the pods I wanted to scale. I found that the HPA scales the pod it gets the metrics from.
I used annotations to make Prometheus scrape the metrics from the pods with exporter.
Then I verified that the kafka-exporter metrics are getting to Prometheus. If it's not there you can't advance further.
I deployed prometheus adapter using its helm chart. The adapter will "translate" Prometheus's metrics into custom Metrics Api, which will make it visible to HPA.
I made sure that the metrics are visible in k8s by executing kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 from one of the master nodes.
I created an hpa with the matching metric name.
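The visibility check in the steps above can be scripted. As a minimal sketch, the Python below reads the JSON returned by the custom metrics API and reports whether a given metric is listed. The sample response is a hypothetical, trimmed example, and kafka_consumergroup_lag is used only as an illustrative metric name.

```python
import json

# Sketch only: checks whether a metric is listed by the custom metrics API.
# Input is the JSON text returned by:
#   kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1

def listed_metrics(api_resource_list_json):
    """Return the sorted resource names, e.g. 'pods/<metric_name>'."""
    doc = json.loads(api_resource_list_json)
    return sorted(r["name"] for r in doc.get("resources", []))


def is_visible(api_resource_list_json, metric_name, scope="pods"):
    """True if '<scope>/<metric_name>' is exposed to the HPA."""
    return f"{scope}/{metric_name}" in listed_metrics(api_resource_list_json)


# Hypothetical, trimmed response used so the example runs on its own.
sample = json.dumps({
    "kind": "APIResourceList",
    "apiVersion": "v1",
    "groupVersion": "custom.metrics.k8s.io/v1beta1",
    "resources": [
        {
            "name": "pods/kafka_consumergroup_lag",
            "namespaced": True,
            "kind": "MetricValueList",
            "verbs": ["get"],
        }
    ],
})

print(listed_metrics(sample))
print(is_visible(sample, "kafka_consumergroup_lag"))
```

If the metric the HPA refers to is missing from this list, the adapter's rules are the first place to look, since the HPA can only use names the adapter exposes.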
Here is a complete guide explaining how to implement Kubernetes HPA using Metrics from Kafka-exporter
Please comment if you have more questions

Prometheus is not collecting pod metrics

I deployed Prometheus and Grafana into my cluster.
When I open the dashboards I don't get data for pod CPU usage.
When I check Prometheus UI, it shows pods 0/0 up, however I have many pods running in my cluster.
What could be the reason? I have node exporter running on all of the nodes.
I am getting this in the kube-state-metrics logs:
I0218 14:52:42.595711 1 builder.go:112] Active collectors: configmaps,cronjobs,daemonsets,deployments,endpoints,horizontalpodautoscalers,jobs,limitranges,namespaces,nodes,persistentvolumeclaims,persistentvolumes,poddisruptionbudgets,pods,replicasets,replicationcontrollers,resourcequotas,secrets,services,statefulsets
I0218 14:52:42.595735 1 main.go:208] Starting metrics server: 0.0.0.0:8080
Here is my Prometheus config file:
https://gist.github.com/karthikeayan/41ab3dc4ed0c344bbab89ebcb1d33d16
I'm able to hit and get data for:
http://localhost:8080/api/v1/nodes/<my_worker_node>/proxy/metrics/cadvisor
As karthikeayan mentioned in the comments:
ok, i found something interesting in the values.yaml comments, prometheus.io/scrape: Only scrape pods that have a value of true, when i remove this relabel_config in k8s configmap, i got the data in prometheus ui.. unfortunately k8s configmap doesn't have comments, i believe helm will remove the comments before deploying it.
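To illustrate why that relabel rule can leave the pods job at 0/0, here is a minimal Python sketch of a keep relabel action (a model of the behaviour, not Prometheus code). It assumes the rule keeps only targets whose prometheus.io/scrape annotation equals true; the pod names are hypothetical.

```python
import re

# Sketch only: models a Prometheus "keep" relabel action on discovered pods.
# Discovery exposes the annotation prometheus.io/scrape as this meta label.
SCRAPE_LABEL = "__meta_kubernetes_pod_annotation_prometheus_io_scrape"


def keep(targets, source_label, regex):
    """Keep targets whose label value fully matches the regex.

    A missing label is treated as an empty string, and the regex is
    anchored at both ends, as in Prometheus relabeling.
    """
    pattern = re.compile(regex)
    return [t for t in targets if pattern.fullmatch(t.get(source_label, ""))]


# Hypothetical discovered pods: only the first one carries the annotation.
targets = [
    {"pod": "api-0", SCRAPE_LABEL: "true"},
    {"pod": "worker-0"},
    {"pod": "db-0", SCRAPE_LABEL: "false"},
]

kept = keep(targets, SCRAPE_LABEL, "true")
print([t["pod"] for t in kept])

# With no annotated pods at all, every target is dropped.
print(keep([{"pod": "worker-0"}, {"pod": "worker-1"}], SCRAPE_LABEL, "true"))
```

Annotating the pods that should be scraped is an alternative to removing the rule, and it keeps Prometheus from scraping pods that expose no metrics endpoint.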
And just for clarification:
kube-state-metrics vs. metrics-server
The metrics-server is a project that has been inspired by Heapster and is implemented to serve the goals of the Kubernetes Monitoring Pipeline. It is a cluster level component which periodically scrapes metrics from all Kubernetes nodes served by Kubelet through Summary API. The metrics are aggregated, stored in memory and served in Metrics API format. The metrics-server stores the latest values only and is not responsible for forwarding metrics to third-party destinations.
kube-state-metrics is focused on generating completely new metrics from Kubernetes' object state (e.g. metrics based on deployments, replica sets, etc.). It holds an entire snapshot of Kubernetes state in memory and continuously generates new metrics based off of it. And just like the metrics-server, it too is not responsible for exporting its metrics anywhere.
Having kube-state-metrics as a separate project also enables access to these metrics from monitoring systems such as Prometheus.