Rancher Cluster Monitoring + Prometheus Operator? - kubernetes

I'm managing several k8s clusters with Rancher. I've set up most of them with the cluster monitoring apps from Rancher (so graphs and Grafana links show up in Rancher under workload monitoring, etc.).
Question: Is there a way to configure Rancher to pull metrics/graphs from prometheus-operator instead?
I've asked this in Slack, and have not gotten an answer or response at all.
Reason: it seems I can add additional configuration (ConfigMaps) to prometheus-operator that I cannot add to the Prometheus installed through Rancher's cluster monitoring app.
Rancher installed prometheus-operator itself, but the app says not to touch it.
Edit:
This is what I was after all along:
additionalScrapeConfigs: []
https://github.com/rancher/system-charts/blob/dev/charts/rancher-monitoring/v0.0.3/charts/prometheus/values.yaml#L61
and
storageSpec: {}
https://github.com/rancher/system-charts/blob/dev/charts/rancher-monitoring/v0.0.3/charts/prometheus/values.yaml#L35
Unlike in the coreos/prometheus-operator chart, the answer for the rancher-monitoring app should be:
prometheus:
  additionalScrapeConfigs: []
  # - job_name: "prometheus"
  #   static_configs:
  #     - targets:
  #         - "localhost:9090"
  remoteWrite: []
  # - url: http://remote1/push

Related

How to deploy prometheus using prometheus operator?

I'm trying to deploy Prometheus using Prometheus operator. I have used the documentation and helm charts from https://github.com/prometheus-operator/prometheus-operator.
Since I need the charts for future reference, rather than installing the chart directly from the repository, I created a Chart.yaml file and added the repository as a dependency.
apiVersion: v2
description: kube-prometheus-stack collects Kubernetes manifests, Grafana dashboards, and Prometheus rules combined with documentation and scripts to provide easy to operate end-to-end Kubernetes cluster monitoring with Prometheus using the Prometheus Operator.
icon: https://raw.githubusercontent.com/prometheus/prometheus.github.io/master/assets/prometheus_logo-cb55bb5c346.png
engine: gotpl
type: application
maintainers:
  - name:
    email:
name: kube-prometheus-stack
sources:
  - https://github.com/prometheus-community/helm-charts
  - https://github.com/prometheus-operator/kube-prometheus
version: 32.2.1
appVersion: 0.54.0
kubeVersion: ">=1.16.0-0"
home: https://github.com/prometheus-operator/kube-prometheus
keywords:
  - operator
  - prometheus
  - kube-prometheus
annotations:
  "artifacthub.io/operator": "true"
  "artifacthub.io/links": |
    - name: Chart Source
      url: https://github.com/prometheus-community/helm-charts
    - name: Upstream Project
      url: https://github.com/prometheus-operator/kube-prometheus
dependencies:
  - name: kube-state-metrics
    version: "4.4.*"
    repository: https://prometheus-community.github.io/helm-charts
    condition: kubeStateMetrics.enabled
  - name: prometheus-node-exporter
    version: "2.5.*"
    repository: https://prometheus-community.github.io/helm-charts
    condition: nodeExporter.enabled
  - name: grafana
    version: "6.21.*"
    repository: https://grafana.github.io/helm-charts
    condition: grafana.enabled
Chart.yaml file
Then I run the following commands:
helm dependency update
helm install <chartname> .
Everything works fine, but when I check the pods, only the operator pod is created and running, along with the other services and Grafana.
Is this the default behavior of the Prometheus operator?
I thought it might be specific to Prometheus, so I also tried deploying redis-cluster with the redis-cluster operator and rabbitmq-cluster with the rabbitmq-cluster operator, but each one creates only the operator pod and not the cluster pods.
An operator pod acts as a controller that listens to events regarding specific custom resources. If you deploy only the operator, you have to separately deploy the custom resource you want it to create.
With the prometheus-operator, that would be a custom resource of kind "Prometheus". Whether the Helm chart you chose can also deploy this should be indicated in the chart's values.yaml and documented on its GitHub page.
You can also use the examples from the prometheus-operator repo to create Prometheus instances. Check out these files to do so: https://github.com/prometheus-operator/prometheus-operator/tree/main/example/rbac/prometheus
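To illustrate what "deploying the custom resource" means, here is a minimal sketch of a Prometheus resource that the operator would reconcile into actual Prometheus pods. The resource name, the namespace, the service account name and the team label are assumptions for illustration only; the service account must already exist with suitable RBAC, as in the linked examples.

```yaml
# Minimal Prometheus custom resource (sketch, names are hypothetical).
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: example            # hypothetical name
  namespace: monitoring    # hypothetical namespace
spec:
  replicas: 1
  # Service account that Prometheus uses for service discovery;
  # assumed to exist already with the required RBAC rules.
  serviceAccountName: prometheus
  # Only ServiceMonitor objects carrying this label are picked up.
  serviceMonitorSelector:
    matchLabels:
      team: frontend       # hypothetical label
  resources:
    requests:
      memory: 400Mi
```

Once a resource like this is applied, the operator should create a StatefulSet for it; without any such resource, only the operator pod runs, which matches the behavior described in the question.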

Prometheus for k8s multi clusters

I have 3 Kubernetes clusters (prod, test, monitoring). I am new to Prometheus, so I tested it by installing it in my test environment with the Helm chart:
# https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack
helm install [RELEASE_NAME] prometheus-community/kube-prometheus-stack
But if I want metrics from the prod and test clusters, I have to repeat the same Helm installation, and each "kube-prometheus-stack" would be standalone in its own cluster. That is not ideal at all. I am trying to find a way to have a single Prometheus/Grafana that federates/aggregates the metrics from each cluster's Prometheus server.
I found this link, saying about prometheus federation:
https://prometheus.io/docs/prometheus/latest/federation/
If I install the Helm chart "kube-prometheus-stack" and get rid of Grafana on the 2 other clusters, how can I make the 3rd "kube-prometheus-stack", on the 3rd cluster, scrape metrics from the 2 other ones?
thanks
You have to modify the configuration of the federating Prometheus so that it scrapes metrics from the other clusters, as described in the documentation:
scrape_configs:
  - job_name: 'federate'
    scrape_interval: 15s
    honor_labels: true
    metrics_path: '/federate'
    params:
      'match[]':
        - '{job="prometheus"}'
        - '{__name__=~"job:.*"}'
    static_configs:
      - targets:
          - 'source-prometheus-1:9090'
          - 'source-prometheus-2:9090'
          - 'source-prometheus-3:9090'
The params field sets the match[] selectors that are sent to the /federate endpoint; they determine which series are pulled. In this particular example, it will scrape any series with the label job="prometheus" or a metric name starting with job: from the Prometheus servers at source-prometheus-{1,2,3}:9090.
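For the kube-prometheus-stack chart mentioned in the question, the same federation job could be supplied through the chart's prometheus.prometheusSpec.additionalScrapeConfigs value on the monitoring cluster. This is only a sketch: the host names are hypothetical and assume that each source Prometheus is reachable from the monitoring cluster, and the extra cluster label is an illustrative way to tell the sources apart.

```yaml
# values.yaml for kube-prometheus-stack on the monitoring cluster (sketch).
prometheus:
  prometheusSpec:
    additionalScrapeConfigs:
      - job_name: 'federate'
        scrape_interval: 15s
        honor_labels: true
        metrics_path: '/federate'
        params:
          'match[]':
            - '{job="prometheus"}'
            - '{__name__=~"job:.*"}'
        static_configs:
          # Hypothetical addresses of the prod and test Prometheus servers.
          - targets:
              - 'prometheus.prod.example.com:80'
            labels:
              cluster: prod
          - targets:
              - 'prometheus.test.example.com:80'
            labels:
              cluster: test
```

The match[] selectors are copied from the documentation example; in practice they would be widened or narrowed to the series that the central Grafana actually needs.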
You can check the following articles for more insight into Prometheus federation:
Monitoring Kubernetes with Prometheus - outside the cluster!
Prometheus federation in Kubernetes
Monitoring multiple federated clusters with Prometheus - the secure way
Monitoring a Multi-Cluster Environment Using Prometheus Federation and Grafana
You have a few options here:
Option 1:
You can achieve this by running vmagent or grafana-agent in the prod and test clusters and configuring remote write on them to send to your monitoring cluster.
But in this case you will need to install kube-state-metrics and node-exporter separately in the prod and test clusters.
It's also important to add an extra label with the cluster name (or any unique identifier) before sending metrics to remote write, to make sure that the recording rules from "kube-prometheus-stack" work correctly.
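As a rough illustration of Option 1, the two pieces (a cluster label plus remote write) look like this in plain Prometheus configuration syntax. vmagent and grafana-agent each have their own way of expressing the same settings, so treat this as a sketch of the idea rather than a ready-made agent configuration; the URL and the label value are hypothetical.

```yaml
# Prometheus-style configuration for an agent in the prod cluster (sketch).
global:
  # Attached to every series sent out, so the monitoring cluster
  # can tell which cluster a sample came from.
  external_labels:
    cluster: prod   # hypothetical cluster identifier
remote_write:
  # Hypothetical remote-write endpoint exposed by the monitoring cluster.
  - url: https://monitoring.example.com/api/v1/write
```

The test cluster would use the same configuration with a different cluster value.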
Option 2:
You can install the victoria-metrics-k8s-stack chart. It has similar functionality to kube-prometheus-stack: it also installs a bunch of components, recording rules and dashboards.
In this case you install victoria-metrics-k8s-stack in every cluster, but with different values.
For the monitoring cluster you can use the default values, with
grafana:
  sidecar:
    dashboards:
      multicluster: true
and a properly configured ingress for vmsingle.
For the prod and test clusters you need to disable a bunch of components:
defaultRules:
  create: false
vmsingle:
  enabled: false
alertmanager:
  enabled: false
vmalert:
  enabled: false
vmagent:
  spec:
    remoteWrite:
      - url: "<vmsingle-ingress>/api/v1/write"
    externalLabels:
      cluster: <cluster-name>
grafana:
  enabled: false
  defaultDashboardsEnabled: false
In this case the chart will deploy vmagent, kube-state-metrics, node-exporter and the scrape configurations for vmagent.
You could try looking at Wavefront. It's a commercial tool now, but you can get a 30-day free trial; it also understands PromQL. So essentially, you could use the same Prometheus rules and config across all clusters, and then use Wavefront to connect to all of those Prometheus instances.
Another option may be Thanos, but I've never used it personally.

Prometheus Alert Manager for Federation

We have several clusters where our applications are running. We would like to set up a Central Monitoring cluster which can scrape metrics from rest of cluster using Prometheus Federation.
So to do that, I need to install a Prometheus server in each cluster and a federating Prometheus server in the central cluster. I will install Grafana in the central cluster as well, to visualise the metrics gathered from the other Prometheus servers.
So the question is;
Where should I set up the Alertmanager? Only in the central cluster, or does each cluster also need its own Alertmanager?
What is the best practice for alerting while using federation?
I thought I could use an ingress controller to expose each Prometheus server. What is the best practice for providing communication between the Prometheus servers and the federating server in k8s?
Based on this blog
Where should I set up the Alertmanager? Only in the central cluster, or does each cluster also need its own Alertmanager?
What is the best practice for alerting while using federation?
The answer here would be to do that on each cluster.
If the data you need to do alerting is moved from one Prometheus to another then you've added an additional point of failure. This is particularly risky when WAN links such as the internet are involved. As far as is possible, you should try and push alerting as deep down the federation hierarchy as possible. For example an alert about a target being down should be setup on the Prometheus scraping that target, not a global Prometheus which could be several steps removed.
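The "target down" example from that quote can be written as an ordinary Prometheus alerting rule and loaded by the Prometheus in each application cluster, next to the targets it scrapes. This is a minimal sketch; the group name, alert name, duration and severity label are illustrative choices, not taken from the blog.

```yaml
# Rule file loaded by the per-cluster Prometheus (sketch).
groups:
  - name: target-health          # hypothetical group name
    rules:
      - alert: TargetDown        # hypothetical alert name
        # "up" is 0 when the last scrape of a target failed.
        expr: up == 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Target {{ $labels.instance }} of job {{ $labels.job }} is down"
```

Because the rule is evaluated where the scrape happens, it keeps firing even if the link to the central cluster is broken.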
I thought I could use an ingress controller to expose each Prometheus server. What is the best practice for providing communication between the Prometheus servers and the federating server in k8s?
I think that depends on the use case. In each doc I checked, they just use targets in scrape_configs.static_configs in prometheus.yml,
like here
scrape_configs:
  - job_name: 'federate'
    scrape_interval: 15s
    honor_labels: true
    metrics_path: '/federate'
    params:
      'match[]':
        - '{job="prometheus"}'
        - '{__name__=~"job:.*"}'
    static_configs:
      - targets:
          - 'source-prometheus-1:9090'
          - 'source-prometheus-2:9090'
          - 'source-prometheus-3:9090'
OR
like here
prometheus.yml:
  rule_files:
    - /etc/config/rules
    - /etc/config/alerts
  scrape_configs:
    - job_name: 'federate'
      scrape_interval: 15s
      honor_labels: true
      metrics_path: '/federate'
      params:
        'match[]':
          - '{job="prometheus"}'
          - '{__name__=~"job:.*"}'
      static_configs:
        - targets:
            - 'prometheus-server:80'
Additionally, it is worth checking how this was done in this tutorial, where Helm is used to build a central monitoring cluster with two Prometheus servers on two clusters.
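If an ingress controller is used to expose a source Prometheus, as the question suggests, one possible shape is an Ingress that publishes only the /federate path. This is a sketch under assumptions: the host name and namespace are hypothetical, the backend service name and port follow the prometheus-server:80 target used in the example above, and TLS and authentication are deliberately left out even though they would normally be needed.

```yaml
# Ingress exposing only the federation endpoint of a source Prometheus (sketch).
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: prometheus-federate      # hypothetical name
  namespace: monitoring          # hypothetical namespace
spec:
  rules:
    - host: prometheus.prod.example.com   # hypothetical host
      http:
        paths:
          - path: /federate
            pathType: Prefix
            backend:
              service:
                name: prometheus-server
                port:
                  number: 80
```

The central Prometheus would then list this host as a target in its federate job.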

Configuring prometheus to access another cluster that applications deployed

I am new to monitoring tools such as Prometheus in k8s. We have two separate clusters: one for the applications we deploy, and one where we would like to deploy only monitoring and logging tools.
But I am confused about how to handle this:
1. How can the cluster that hosts Prometheus connect to the application cluster and pull metrics?
2. How should I specify the namespace if I would like to set a network policy?
3. What should I do on the application side, in the Helm chart, besides exporting metrics?
# Allow traffic from pods with label app=prometheus in namespace with label name=monitoring
# to any pod in <YOUR_APPLICATION_NAMESPACE>
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: monitoring.prometheus.all
  namespace: <YOUR_APPLICATION_NAMESPACE>
spec:
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              name: monitoring
          podSelector:
            matchLabels:
              app: prometheus
  podSelector: {}
  policyTypes:
    - Ingress
Isn't that what you want?
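Note that the namespaceSelector in the policy above matches labels on the Namespace object, not the namespace's name, so the namespace running Prometheus has to carry the name=monitoring label. A minimal sketch, assuming the namespace itself is also called monitoring:

```yaml
# Namespace carrying the label that the NetworkPolicy selects on (sketch).
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring      # assumed namespace name
  labels:
    name: monitoring    # label matched by namespaceSelector.matchLabels
```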
1) Prometheus federation
Prometheus federation is a Prometheus server that can scrape data
from other Prometheus servers. It supports hierarchical federation,
which in our case resembles a tree.
A default version of the Prometheus server is installed in each one of our clusters and a Prometheus federation server is deployed together with Grafana in a central monitoring cluster. Prometheus federation scrapes data from all the other Prometheus servers that run in our clusters. For future expansion, a central Prometheus federation can be used to scrape data from multiple Prometheus federation servers that scrape data from groups of tens of clusters.
More info here: https://developers.mattermost.com/blog/cloud-monitoring/
2) Prometheus configuration to scrape Kubernetes outside the cluster yaml example
3) LinkedIn "Monitoring Kubernetes with Prometheus - outside the cluster!" article and Reddit "Monitoring K8s by Prometheus Outside Cluster" related discussion

Prometheus Adapter Custom Metrics for Libvirt in a K8S Cluster

I have a K8S cluster which is also managing VMs via virtlet. This K8S cluster is running K8S v1.13.2, with prometheus and the prometheus-adapter, and a custom-metrics server. I have written a custom metrics exporter for libvirtd which pulls in VM metrics and have configured prometheus to scrape that exporter for those VM metrics -- this is working and working well.
What I need to do next, is to have the prometheus-adapter push those metrics into K8S. Nothing I have done is working. Funny thing is, I can see the metrics in prometheus, but I am unable to present them to the custom metrics API.
Example metric visible in prometheus:
libvirt_cpu_stats_cpu_time_nanosecs{app="prometheus-lex",domain="virtlet-c91822c8-5e82-beta-deflect",instance="192.168.2.32:9177",job="kubernetes-pods",kubernetes_namespace="default",kubernetes_pod_name="prometheus-lex-866694b884-9z8v6",name="prometheus-lex",pod_template_hash="866694b884"}
Prometheus Adapter configuration for this metric:
- seriesQuery: 'libvirt_cpu_stats_cpu_time_nanosecs{job="kubernetes-pods", app="prometheus-lex"}'
  seriesFilters: []
  resource:
    overrides:
      kubernetes_pod_name:
        resource: pod
      kubernetes_namespace:
        resource: namespace
  name:
    matches: libvirt_cpu_stats_cpu_time_nanosecs
    as: libvirt_cpu_stats_cpu_time_rate
  metricsQuery: rate(libvirt_cpu_stats_cpu_time_nanosecs{job="kubernetes-pods", app="prometheus-lex", <<.LabelMatchers>>}[5m])
When I query the custom metrics API, I do not see what I am looking for:
kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1|grep libvirt
returns nothing
Additionally, I can see the prometheus-adapter is able to query the series from prometheus. So I know that side of the adapter is working. I am just trying to figure out why it's not presenting them to the custom metrics server.
From the prometheus-adapter
I0220 19:12:58.442937 1 api.go:74] GET http://prometheus-server.default.svc.cluster.local:80/api/v1/series?match%5B%5D=libvirt_cpu_stats_cpu_time_nanosecs%7Bkubernetes_namespace%21%3D%22%22%2Ckubernetes_pod_name%21%3D%22%22%7D&start=1550689948.392 200 OK
Any ideas what I am missing here?
Update:
I have also tried the following new configuration, and it's still not working.
- seriesQuery: 'libvirt_cpu_stats_cpu_time_nanosecs{kubernetes_namespace!="",kubernetes_pod_name!=""}'
  seriesFilters: []
  resource:
    overrides:
      kubernetes_namespace: {resource: "namespace"}
      kubernetes_pod_name: {resource: "pod"}
  name:
    matches: 'libvirt_cpu_stats_cpu_time_nanosecs'
    as: 'libvirt_cpu_stats_cpu_time_rate'
  metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
It actually depends on how you install the Prometheus Adapter. If you install it via Helm and supply the rules as YAML values, you need to follow this README https://github.com/helm/charts/blob/master/stable/prometheus-adapter/README.md and declare the rules like this:
rules:
  custom:
    - seriesQuery: '{__name__=~"^some_metric_count$"}'
      resources:
        template: <<.Resource>>
      name:
        matches: ""
        as: "my_custom_metric"
      metricsQuery: sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)
Pay attention to the custom keyword. If you omit it, the metric won't be available via the custom metrics API.
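Applied to the libvirt metric from the question, the Helm values might look like the sketch below. It reuses the series query and label overrides from the question's updated configuration, moved under rules.custom. Note that it uses the plural resources key, as in the example above, whereas the question's configuration uses resource. This assumes the adapter was installed with the Helm chart and has not been tested against the asker's cluster.

```yaml
# Helm values for prometheus-adapter (sketch based on the question's rule).
rules:
  custom:
    - seriesQuery: 'libvirt_cpu_stats_cpu_time_nanosecs{kubernetes_namespace!="",kubernetes_pod_name!=""}'
      resources:
        # Map the Prometheus labels onto Kubernetes resources.
        overrides:
          kubernetes_namespace: {resource: "namespace"}
          kubernetes_pod_name: {resource: "pod"}
      name:
        matches: "libvirt_cpu_stats_cpu_time_nanosecs"
        as: "libvirt_cpu_stats_cpu_time_rate"
      metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
```

If the rule is picked up, the renamed metric should appear in the output of the kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 query shown in the question.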