How to monitor disk usage of persistent volumes? - kubernetes

I want to monitor the disk usage of persistent volumes in the cluster. I am using CoreOS Kube Prometheus. A dashboard is trying to query a metric called kubelet_volume_stats_capacity_bytes, which is no longer available in Kubernetes versions starting from v1.12.
I am using Kubernetes version v1.13.4 and hostpath-provisioner to provision volumes based on persistent volume claims. I want to access current disk usage metrics for each persistent volume.
kube_persistentvolumeclaim_resource_requests_storage_bytes is available, but it only shows the storage requested by the persistent volume claim, in bytes.
container_fs_usage_bytes does not fully cover my problem.

Per-PVC disk space usage as a percentage can be determined with the following query:
100 * sum(kubelet_volume_stats_used_bytes) by (persistentvolumeclaim)
/
sum(kubelet_volume_stats_capacity_bytes) by (persistentvolumeclaim)
The kubelet_volume_stats_used_bytes metric shows per-PVC disk space usage in bytes.
The kubelet_volume_stats_capacity_bytes metric shows per-PVC disk size in bytes.
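As a rough sketch, the same query can back an alerting rule, assuming you load rule files via rule_files; the alert name, the 90% threshold, and the 5m duration below are placeholders, not part of the original answer:
groups:
- name: pvc-usage
  rules:
  - alert: PersistentVolumeClaimAlmostFull   # hypothetical alert name
    expr: |
      100 * sum(kubelet_volume_stats_used_bytes) by (persistentvolumeclaim)
        /
      sum(kubelet_volume_stats_capacity_bytes) by (persistentvolumeclaim)
        > 90
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "PVC {{ $labels.persistentvolumeclaim }} is more than 90% full"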

Yes, in the newest versions of Kubernetes you cannot monitor metrics such as kubelet_volume_stats_capacity_bytes, but there are some workarounds. Unfortunately this area is a bit fragmented in Kubernetes today. PVCs may have capacity and usage metrics depending on the volume provider, and it seems that CSI-based volumes don't have these at all. This can be done on a best-effort basis, but it is easy to quickly hit cases where these metrics are not available.
First, you can write your own script that, each time values of a metric like container_fs_usage_bytes are gathered, computes the difference between the capacity measured beforehand and the container usage in bytes (the container_fs_usage_bytes metric will be helpful here).
Prometheus is quite a popular solution, but to monitor capacity, and disk usage in particular, you can also use Heapster. It is about to be retired, but for this special case you can still use it; you will have to implement a script as well. Take a look at the repository:
heapster-memory
"res.Containers = append(res.Containers,
metrics.ContainerMetrics{Name: c.Name, Usage: usage})"
I hope it helps.

I have a job like the following in my prom config:
- job_name: 'kubernetes-nodes'
  kubernetes_sd_configs:
  - role: node
  scheme: https
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
  - target_label: __address__
    replacement: kubernetes.default.svc:443
  - source_labels: [__meta_kubernetes_node_name]
    regex: (.+)
    target_label: __metrics_path__
    replacement: /api/v1/nodes/${1}/proxy/metrics
With this job in place I see the following metrics available in Prometheus:
kubelet_volume_stats_available_bytes
kubelet_volume_stats_capacity_bytes
kubelet_volume_stats_inodes
kubelet_volume_stats_inodes_free
kubelet_volume_stats_inodes_used
kubelet_volume_stats_used_bytes
More here: https://github.com/google/cadvisor/issues/1702
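Since the inode metrics are exposed alongside the byte metrics, a similar best-effort sketch can watch for inode exhaustion; the alert name and the 90% threshold are illustrative assumptions, not something from this answer:
groups:
- name: pvc-inodes
  rules:
  - alert: PersistentVolumeClaimInodesAlmostFull   # hypothetical alert name
    expr: |
      kubelet_volume_stats_inodes_used / kubelet_volume_stats_inodes > 0.9
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: "PVC {{ $labels.persistentvolumeclaim }} has used over 90% of its inodes"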

I have written a Prometheus exporter that monitors PVC usage and provides the mapping between pod and PVC. If you are interested, you can try it.
https://github.com/kais271/pvc-exporter
Prometheus metrics:
pvc_usage
pvc_mapping

Related

How to add windows exporter to Prometheus server in Kubernetes(AKS)?

I have deployed an AKS cluster with 2 node pools, i.e. Windows and Linux. I am trying to add a monitoring solution to it with Prometheus and Grafana. I am using node exporter for Linux node metrics and windows exporter (WMI exporter) for Windows node metrics. All the workloads are up and running. I can see the metrics by doing port forwarding on both the Linux and Windows nodes.
The problem is that while integrating them with Prometheus, the node exporter shows as up and running but the windows exporter is down.
The entries I made in prometheus.yml for both are:
- job_name: 'win-exporter'
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  - source_labels: [__meta_kubernetes_endpoints_name]
    regex: 'win-node-exporter'
    action: keep
- job_name: 'node-exporter'
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  - source_labels: [__meta_kubernetes_endpoints_name]
    regex: 'node-exporter'
    action: keep
The regex values represent the endpoint names. I don't know why it is not working there. Both node exporter and windows exporter run as DaemonSets.
Can anyone help with how to make the windows exporter entry in the prometheus.yml file?
Thank you in advance.

Prometheus Alert Manager for Federation

We have several clusters where our applications are running. We would like to set up a central monitoring cluster which can scrape metrics from the rest of the clusters using Prometheus federation.
To do that, I need to install a Prometheus server in each cluster and a Prometheus server using federation in the central cluster. I will install Grafana in the central cluster as well to visualise the metrics we gather from the rest of the Prometheus servers.
So the questions are:
Where should I set up the Alertmanager? Only for the central cluster, or does each cluster also need its own Alertmanager?
What is the best practice for alerting while using federation?
I thought I could use an ingress controller to expose each Prometheus server. What is the best practice to provide communication between the Prometheus servers and federation in k8s?
Based on this blog
Where should I set up the Alertmanager? Only for the central cluster, or does each cluster also need its own Alertmanager?
What is the best practice for alerting while using federation?
The answer here would be to do that on each cluster.
If the data you need to do alerting is moved from one Prometheus to another then you've added an additional point of failure. This is particularly risky when WAN links such as the internet are involved. As far as possible, you should try to push alerting as deep down the federation hierarchy as possible. For example, an alert about a target being down should be set up on the Prometheus scraping that target, not on a global Prometheus which could be several steps removed.
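In practice that means each cluster's Prometheus loads its own rules and talks to an Alertmanager in the same cluster. A minimal sketch of the relevant prometheus.yml fragment, assuming an in-cluster Alertmanager reachable at alertmanager.monitoring.svc:9093 (that service name is an assumption):
rule_files:
  - /etc/prometheus/rules/*.yml              # cluster-local alerting rules
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - alertmanager.monitoring.svc:9093   # assumed local Alertmanager service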
I thought I could use an ingress controller to expose each Prometheus server. What is the best practice to provide communication between the Prometheus servers and federation in k8s?
I think that depends on the use case. In each doc I checked, they just use targets under scrape_configs.static_configs in prometheus.yml,
like here
scrape_configs:
  - job_name: 'federate'
    scrape_interval: 15s
    honor_labels: true
    metrics_path: '/federate'
    params:
      'match[]':
        - '{job="prometheus"}'
        - '{__name__=~"job:.*"}'
    static_configs:
      - targets:
          - 'source-prometheus-1:9090'
          - 'source-prometheus-2:9090'
          - 'source-prometheus-3:9090'
OR
like here
prometheus.yml:
  rule_files:
    - /etc/config/rules
    - /etc/config/alerts
  scrape_configs:
    - job_name: 'federate'
      scrape_interval: 15s
      honor_labels: true
      metrics_path: '/federate'
      params:
        'match[]':
          - '{job="prometheus"}'
          - '{__name__=~"job:.*"}'
      static_configs:
        - targets:
            - 'prometheus-server:80'
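If each cluster's Prometheus is instead exposed through an ingress, the same federate job can simply point at those hostnames. A hedged sketch, where the hostnames and the HTTPS scheme are assumptions:
scrape_configs:
  - job_name: 'federate-ingress'
    scrape_interval: 15s
    honor_labels: true
    metrics_path: '/federate'
    scheme: https                                # assuming the ingress terminates TLS
    params:
      'match[]':
        - '{job="prometheus"}'
    static_configs:
      - targets:
          - 'prometheus.cluster-a.example.com'   # assumed ingress hostnames
          - 'prometheus.cluster-b.example.com'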
Additionally, it is worth checking how they did this in this tutorial, where they used Helm to build a central monitoring cluster with two Prometheus servers on two clusters.

Can prometheus scrape targets together?

I need Prometheus to scrape several mongodb exporters one after another in order to compute a valid replication lag. However, the targets are scraped with a difference of several dozen seconds between them, which makes replication lag impossible to compute.
The job yaml is below:
- job_name: mongo-storage
  honor_timestamps: true
  scrape_interval: 1m
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: http
  static_configs:
    - targets:
        - mongo-1a-exporter.monitor:9216
        - mongo-2a-exporter.monitor:9216
        - mongo-3a-exporter.monitor:9216
        - mongos-exporter.monitor:9216
        - mongo-1b-exporter.monitor:9216
        - mongo-2b-exporter.monitor:9216
        - mongo-3b-exporter.monitor:9216
      labels:
        cluster: mongo-storage
This isn't possible, Prometheus makes no guarantees about the phase of scrapes or rule evaluations. Nor is this something you should depend upon, as it'd be very fragile.
I'd aim for knowing the lag within a scrape interval, rather than trying to get it perfect. You generally care if replication is completely broken, rather than if it's slightly delayed. A heartbeat job could also help.
This isn't possible with Prometheus... normally.
However it might be possible to exploit the prometheus/pushgateway to achieve what you want. My thinking is that you write a script/tool to scrape the mongo exporters in a synchronised way, threads/forks/whatever, and then push those metrics into a prometheus/pushgateway instance.
Then configure Prometheus to scrape the Pushgateway instead of the mongo exporters; since all the metrics are then served from a single endpoint, they will hopefully always be in sync and up to date.
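A minimal sketch of the Prometheus side of that setup, assuming the Pushgateway is reachable at pushgateway.monitor:9091 (the address and job name are assumptions); honor_labels: true keeps the job/instance labels your sync script pushes:
- job_name: 'pushgateway'
  honor_labels: true                 # keep the labels pushed by the sync script
  static_configs:
    - targets:
        - pushgateway.monitor:9091   # assumed Pushgateway address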
Hope this helps.

Is there a way to make a kubernetes service name set as Prometheus Job name automatically?

How can I make my Kubernetes service name set as the Prometheus job name automatically? I mean to say, is there a possible way to have a new service created in K8s automatically added as a target in the Prometheus configuration? In Kubernetes, I would like to deploy my application as a set of services.
For every service there could be more than one pod associated.
Mapping could be done like:
Kubernetes services to Prometheus Jobs
K8s Pods to instances in Prometheus Job
But I really don't know if this is feasible with some configuration changes in Prometheus. Please correct me if I am wrong anywhere.
If this is not possible, do I need to explicitly create a Prometheus job in the Prometheus configuration file every time before a deployment?
You will typically want metrics per pod, as you would normally have when using regular nodes instead of containers/pods.
Using this Prometheus configuration you will get a target for every pod that's running on the cluster automatically. This is the important part:
# Example scrape config for pods
#
# The relabeling allows the actual pod scrape endpoint to be configured via the
# following annotations:
#
# * `prometheus.io/scrape`: Only scrape pods that have a value of `true`
# * `prometheus.io/path`: If the metrics path is not `/metrics` override this.
# * `prometheus.io/port`: Scrape the pod on the indicated port instead of the
#   pod's declared ports (default is a port-free target if none are declared).
- job_name: 'kubernetes-pods'
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __address__
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: kubernetes_namespace
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: kubernetes_pod_name
As explained in the comments above, this is configured so that pods that have the prometheus.io/scrape annotation set to true will be scraped by Prometheus, becoming targets. Pods will then need to expose an endpoint serving metrics in the Prometheus format. You can use the prometheus.io/path and prometheus.io/port annotations to configure where Prometheus will look for the metrics on your pod.
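For illustration, a pod opting in to scraping with those annotations might look like the sketch below; the pod name, image, and port are hypothetical:
apiVersion: v1
kind: Pod
metadata:
  name: my-app                       # hypothetical pod name
  annotations:
    prometheus.io/scrape: "true"     # opt this pod in to scraping
    prometheus.io/path: "/metrics"   # optional, /metrics is the default
    prometheus.io/port: "8080"       # assumed port where metrics are exposed
spec:
  containers:
  - name: my-app
    image: my-app:latest             # hypothetical image
    ports:
    - containerPort: 8080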

Prometheus + Heapster

I saw there is no sink configuration for Prometheus in this Heapster document. Is there any simple way to combine these two and monitor?
Prometheus uses a pull model to retrieve the data, while Heapster is a tool that pushes its metrics to a certain endpoint (push model).
I assume you want to get Kubernetes metrics into Prometheus. You don't need Heapster for that, since cAdvisor has a Prometheus endpoint which can be scraped directly. Also, the kubelet itself provides some metrics.
The Prometheus config would look like this:
- job_name: 'kubernetes-nodes'
  kubernetes_sd_configs:
  - role: node
  relabel_configs:
  - action: labelmap
    regex: __meta_kubernetes_node_label_(.+)
- job_name: 'kubernetes-cadvisor'
  kubernetes_sd_configs:
  - role: node
  relabel_configs:
  - source_labels: [__meta_kubernetes_node_address_InternalIP]
    target_label: __address__
    regex: (.*)
    replacement: $1:4194
This assumes you are using the default cAdvisor port 4194. Prometheus should also be able to detect the correct kubelet port.
Additional Note: The job for scraping cAdvisor is only required when using a Kubernetes version >= 1.7. Before that the cAdvisor metrics accidentally got exposed via the Kubelet.