How to add custom labels in Promtail Config - grafana

I need to extract data from the logs and append it as new labels. Below is a sample log:
Sample Log Message:
2022-12-21T11:48:00,001 [schedulerFactor_Worker-4, , ] INFO [,,] [userAgent=] [system=,component=,object=] [,] [] c.s.f.s.scheduler.SchedulerTask - sync process started on 2022-12-21T06:48:00.000780 for sync pair :17743b1b-a067-4478-a6d8-7b1cff04175a for JobId :dc8dc0dd-fdb9-4873-af55-1c70ba2047a5
New labels needed in the logs:
sync_pair = 17743b1b-a067-4478-a6d8-7b1cff04175a
JobId = dc8dc0dd-fdb9-4873-af55-1c70ba2047a5
My promtail-config.yml
server:
  http_listen_port: 9080
  grpc_listen_port: 0
  http_listen_address: 0.0.0.0
positions:
  filename: /tmp/positions.yaml
clients:
  - url: http://loki:3100/loki/api/v1/push
scrape_configs:
  - job_name: system
    static_configs:
      - targets:
          - localhost
        labels:
          job: varlogs
          __path__: /var/log/*log
  - job_name: containers
    static_configs:
      - targets:
          - localhost
        labels:
          job: containers
          __path__: /var/lib/docker/containers/*/*log
  - job_name: dockerlogs
    file_sd_configs:
      - files:
          - /etc/promtail/promtail-targets.yml
    relabel_configs:
      - source_labels: [job]
        target_label: job
      - source_labels: [__address__]
        target_label: container_id
      - source_labels: [container_id]
        target_label: __path__
        replacement: /var/lib/docker/containers/*/*log
    pipeline_stages:
      - match:
          selector: '{job="varlogs"}'
          stages:
            - regex:
                expression: '(?P<sync_pair>sync_pair)' '(?P<job_id>job_id)'
            - labels:
                sync_pair:
                job_id:
I have added pipeline stages, but no labels are showing up, i.e. sync_pair and JobId. These labels should be shown on the logs after a query.
2022-12-21T11:48:00,001 [schedulerFactor_Worker-4, , ] INFO [,,] [userAgent=] [system=,component=,object=] [,] [] c.s.f.s.scheduler.SchedulerTask - sync process started on 2022-12-21T06:48:00.000780 for sync pair :17743b1b-a067-4478-a6d8-7b1cff04175a for JobId :dc8dc0dd-fdb9-4873-af55-1c70ba2047a5
These two must be shown in the log labels:
sync_pair = 17743b1b-a067-4478-a6d8-7b1cff04175a
JobId = dc8dc0dd-fdb9-4873-af55-1c70ba2047a5
Check Image ---> https://i.stack.imgur.com/1h27v.png
I want sync_pair and JobId in my Labels.

promtail-config.yml
server:
  http_listen_port: 9080
  grpc_listen_port: 0
  http_listen_address: 0.0.0.0
positions:
  filename: /tmp/positions.yaml
clients:
  - url: http://loki:3100/loki/api/v1/push
scrape_configs:
  - job_name: system
    static_configs:
      - targets:
          - localhost
        labels:
          job: varlogs
          __path__: /var/log/*log
    pipeline_stages:
      - regex:
          # extracts only the sync pair and JobId values from the log line
          expression: '.*sync pair :(?P<syncpair>[a-zA-Z0-9_-]{30,36}).*JobId :(?P<jobid>[a-zA-Z0-9_-]{30,36})'
      - labels:
          # promotes the extracted syncpair and jobid values to labels
          syncpair:
          jobid:
  - job_name: containers
    static_configs:
      - targets:
          - localhost
        labels:
          job: containers
          __path__: /var/lib/docker/containers/*/*log
  - job_name: dockerlogs
    file_sd_configs:
      - files:
          - /etc/promtail/promtail-targets.yml
    relabel_configs:
      - source_labels: [job]
        target_label: job
      - source_labels: [__address__]
        target_label: container_id
      - source_labels: [container_id]
        target_label: __path__
        replacement: /var/lib/docker/containers/*/*log
Try this tool -> https://regex101.com/r/rgp49r/1
Reference -> Using Promtail to sum log line values - Pipeline Stages - Metrics
Check output image for Labels
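To verify the pipeline locally before shipping anything to Loki, you can also pipe a sample line through Promtail's stdin dry-run mode. A minimal sketch, assuming a local promtail binary and a hypothetical sample.log containing the log line from the question:

# --stdin reads log lines from stdin, --dry-run prints what would be
# pushed to Loki instead of sending it, and --inspect dumps the extracted
# values and label set after each pipeline stage (stdin mode applies the
# pipeline stages of the first scrape_config in the file).
cat sample.log | promtail --config.file=promtail-config.yml --stdin --dry-run --inspect

If the regex matches, the inspected output should show syncpair and jobid in the label set for that entry.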

Related

How to configure Rabbitmq Metric in Prometheus/Grafana Kubernetes

I want to add RabbitMQ monitoring to Prometheus. I already have RabbitMQ running in Kubernetes, but I don't know how to add the RabbitMQ metrics to Prometheus.
I have installed Prometheus and Grafana through YAML files, along with the PV, PVC, storage, service, config, deployment, and cluster role.
Here is a screenshot of RabbitMQ showing up empty in Prometheus.
I have installed Kubernetes in one VM with local storage, i.e. the control plane and node are both installed in one VM, and everything is working fine.
Here is my prometheus-config YAML file:
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: monitoring
data:
  prometheus.yml: |
    global:
      scrape_interval: 5s
      evaluation_interval: 5s
    rule_files:
      - /etc/prometheus/prometheus.rules
    alerting:
      alertmanagers:
        - scheme: http
          static_configs:
            - targets:
                - "alertmanager.monitoring.svc:9093"
    scrape_configs:
      - job_name: 'kubernetes-apiservers'
        kubernetes_sd_configs:
          - role: endpoints
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        relabel_configs:
          - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
            action: keep
            regex: default;kubernetes;https
      - job_name: 'kubernetes-nodes'
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubernetes_sd_configs:
          - role: node
        relabel_configs:
          - action: labelmap
            regex: __meta_kubernetes_node_label_(.+)
          - target_label: __address__
            replacement: kubernetes.default.svc:443
          - source_labels: [__meta_kubernetes_node_name]
            regex: (.+)
            target_label: __metrics_path__
            replacement: /api/v1/nodes/${1}/proxy/metrics
      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
            action: replace
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
            target_label: __address__
          - action: labelmap
            regex: __meta_kubernetes_pod_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_pod_name]
            action: replace
            target_label: kubernetes_pod_name
      - job_name: 'kube-state-metrics'
        static_configs:
          - targets: ['kube-state-metrics.kube-system.svc.cluster.local:8080']
      - job_name: 'kubernetes-cadvisor'
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubernetes_sd_configs:
          - role: node
        relabel_configs:
          - action: labelmap
            regex: __meta_kubernetes_node_label_(.+)
          - target_label: __address__
            replacement: kubernetes.default.svc:443
          - source_labels: [__meta_kubernetes_node_name]
            regex: (.+)
            target_label: __metrics_path__
            replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
      - job_name: 'kubernetes-service-endpoints'
        kubernetes_sd_configs:
          - role: endpoints
        relabel_configs:
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
            action: replace
            target_label: __scheme__
            regex: (https?)
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
            action: replace
            target_label: __address__
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
          - action: labelmap
            regex: __meta_kubernetes_service_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_service_name]
            action: replace
            target_label: kubernetes_name
      - job_name: 'rabbitmq'
        metrics_path: /metrics
        scrape_interval: 5s
        kubernetes_sd_configs:
          - role: endpoints
            namespaces:
              names:
                - default
        relabel_configs:
          - source_labels: [__meta_kubernetes_service_label_app]
            separator: ;
            regex: rabbitmq
            replacement: $1
            action: keep
          - source_labels: [__meta_kubernetes_endpoint_port_name]
            separator: ;
            regex: prometheus
Even after adding the rabbitmq job, it is not showing up in the Prometheus targets page.
Here is my rabbitmq-svc YAML file:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: xxx-rabbitmq
spec:
  selector:
    matchLabels:
      app: xxx-rabbitmq
  serviceName: xxx-rabbitmq
  replicas: 1
  template:
    metadata:
      labels:
        app: xxx-rabbitmq
    spec:
      terminationGracePeriodSeconds: 10
      containers:
        - name: xxx-rabbitmq
          image: rabbitmq:3.7.3-management
          ports:
            - containerPort: xxx
            - containerPort: xxx
            - containerPort: xxx
            - containerPort: xxx
          volumeMounts:
            - name: xxx-rabbitmq-pvc
              mountPath: /var/lib/rabbitmq
              subPath: rabbitmq
          envFrom:
            - configMapRef:
                name: rabbitmq-config
  volumeClaimTemplates:
    - metadata:
        name: xxx-rabbitmq-pvc
      spec:
        accessModes: ["ReadWriteOnce"]
        volumeMode: Filesystem
        storageClassName: xxxx-storage
        resources:
          requests:
            storage: 10Gi
---
apiVersion: v1
kind: Service
metadata:
  name: xxx-rabbitmq
  labels:
    app: xxx-rabbitmq
spec:
  type: ClusterIP
  ports:
    - port: xxx
      name: main
    - port: xxx
      name: rabbitmqssl
    - port: xxx
      name: rabvitmqmgmt
  selector:
    app: xxx-rabbitmq
Please help me out with how to get all the RabbitMQ information into Prometheus/Grafana.
If you use RabbitMQ version 3.8 or above, you have to use the rabbitmq_prometheus plugin.
As the docs state:
The plugin exposes all RabbitMQ metrics on a dedicated TCP port, in
Prometheus text format.
To enable it run:
rabbitmq-plugins enable rabbitmq_prometheus
To validate that it's working, run this inside the RabbitMQ container:
curl -s localhost:15692/metrics | head -n 3
and see if you get metrics in response.
For versions prior to 3.8, you'd have to use prometheus_rabbitmq_exporter
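Whichever exporter you end up with, Prometheus still needs a scrape job that can actually find the metrics port. A rough sketch of the rabbitmq job, assuming the plugin's default port 15692 is added to the Service under a port named prometheus; note the keep regex must match your Service's actual app label, which in the manifests above is xxx-rabbitmq, not rabbitmq:

- job_name: 'rabbitmq'
  metrics_path: /metrics
  kubernetes_sd_configs:
    - role: endpoints
      namespaces:
        names:
          - default
  relabel_configs:
    # keep only endpoints of services labelled app=xxx-rabbitmq
    - source_labels: [__meta_kubernetes_service_label_app]
      regex: xxx-rabbitmq
      action: keep
    # keep only the (assumed) port named 'prometheus', i.e. 15692
    - source_labels: [__meta_kubernetes_endpoint_port_name]
      regex: prometheus
      action: keep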

failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:monitoring" cannot list resource "pods" in API group "" at the cluster scope

Not sure what I am missing. Please find below all the config scripts I have used.
2022-07-21T07:26:56.903Z info service/collector.go:220 Starting otelcol... {"service": "my-prom-instance", "Version": "0.54.0", "NumCPU": 4}
2022-07-21T07:26:56.903Z info service/collector.go:128 Everything is ready. Begin running and processing data. {"service": "my-prom-instance"}
2022-07-21T07:26:56.902Z debug discovery/manager.go:309 Discoverer channel closed {"service": "my-prom-instance", "kind": "receiver", "name": "prometheus", "pipeline": "metrics", "provider": "static/0"}
W0721 07:26:56.964183 1 reflector.go:324] k8s.io/client-go#v0.24.2/tools/cache/reflector.go:167: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:monitoring:otel-collector-collector" cannot list resource "pods" in API group "" at the cluster scope
E0721 07:26:56.964871 1 reflector.go:138] k8s.io/client-go#v0.24.2/tools/cache/reflector.go:167: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:monitoring:otel-collector-collector" cannot list resource "pods" in API group "" at the cluster scope
W0721 07:26:58.435237 1 reflector.go:324] k8s.io/client-go#v0.24.2/tools/cache/reflector.go:167: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:monitoring:otel-collector-collector" cannot list resource "pods" in API group "" at the cluster scope
E0721 07:26:58.435924 1 reflector.go:138]
clusterRole.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
  namespace: monitoring
rules:
  - apiGroups: [""]
    resources:
      - nodes
      - nodes/proxy
      - services
      - endpoints
      - pods
    verbs: ["get", "list", "watch"]
  - apiGroups:
      - extensions
    resources:
      - ingresses
    verbs: ["get", "list", "watch"]
  - nonResourceURLs: ["/metrics"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
  - kind: ServiceAccount
    name: default
    namespace: monitoring
config-map.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-server-conf
  labels:
    name: prometheus-server-conf
  namespace: monitoring
data:
  prometheus.rules: |-
    groups:
      - name: devopscube demo alert
        rules:
          - alert: High Pod Memory
            expr: sum(container_memory_usage_bytes) > 1
            for: 1m
            labels:
              severity: slack
            annotations:
              summary: High Memory Usage
  prometheus.yml: |-
    global:
      scrape_interval: 5s
      evaluation_interval: 5s
    rule_files:
      - /etc/prometheus/prometheus.rules
    alerting:
      alertmanagers:
        - scheme: http
          static_configs:
            - targets:
                - "alertmanager.monitoring.svc:9093"
    scrape_configs:
      - job_name: 'node-exporter'
        kubernetes_sd_configs:
          - role: endpoints
        relabel_configs:
          - source_labels: [__meta_kubernetes_endpoints_name]
            regex: 'node-exporter'
            action: keep
      - job_name: 'kubernetes-apiservers'
        kubernetes_sd_configs:
          - role: endpoints
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        relabel_configs:
          - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
            action: keep
            regex: default;kubernetes;https
      - job_name: 'kubernetes-nodes'
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubernetes_sd_configs:
          - role: node
        relabel_configs:
          - action: labelmap
            regex: __meta_kubernetes_node_label_(.+)
          - target_label: __address__
            replacement: kubernetes.default.svc:443
          - source_labels: [__meta_kubernetes_node_name]
            regex: (.+)
            target_label: __metrics_path__
            replacement: /api/v1/nodes/${1}/proxy/metrics
      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
            action: replace
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
            target_label: __address__
          - action: labelmap
            regex: __meta_kubernetes_pod_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_pod_name]
            action: replace
            target_label: kubernetes_pod_name
      - job_name: 'kube-state-metrics'
        static_configs:
          - targets: ['kube-state-metrics.kube-system.svc.cluster.local:8080']
      - job_name: 'kubernetes-cadvisor'
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubernetes_sd_configs:
          - role: node
        relabel_configs:
          - action: labelmap
            regex: __meta_kubernetes_node_label_(.+)
          - target_label: __address__
            replacement: kubernetes.default.svc:443
          - source_labels: [__meta_kubernetes_node_name]
            regex: (.+)
            target_label: __metrics_path__
            replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
      - job_name: 'kubernetes-service-endpoints'
        kubernetes_sd_configs:
          - role: endpoints
        relabel_configs:
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
            action: replace
            target_label: __scheme__
            regex: (https?)
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
            action: replace
            target_label: __address__
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
          - action: labelmap
            regex: __meta_kubernetes_service_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_service_name]
            action: replace
            target_label: kubernetes_name
prometheus-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-deployment
  namespace: monitoring
  labels:
    app: prometheus-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus-server
  template:
    metadata:
      labels:
        app: prometheus-server
    spec:
      containers:
        - name: prometheus
          image: prom/prometheus
          args:
            - "--config.file=/etc/prometheus/prometheus.yml"
            - "--storage.tsdb.path=/prometheus/"
          ports:
            - containerPort: 9090
          volumeMounts:
            - name: prometheus-config-volume
              mountPath: /etc/prometheus/
            - name: prometheus-storage-volume
              mountPath: /prometheus/
      volumes:
        - name: prometheus-config-volume
          configMap:
            defaultMode: 420
            name: prometheus-server-conf
        - name: prometheus-storage-volume
          emptyDir: {}
otel-deployment.yaml
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otel-collector
  namespace: monitoring
spec:
  config: |
    receivers:
      prometheus:
        config:
          scrape_configs:
            - job_name: 'kube-state-metrics'
              scrape_interval: 5s
              scrape_timeout: 1s
              static_configs:
                - targets: ['kube-state-metrics.kube-system.svc.cluster.local:8080']
            - job_name: k8s
              kubernetes_sd_configs:
                - role: pod
              relabel_configs:
                - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
                  regex: "true"
                  action: keep
              metric_relabel_configs:
                - source_labels: [__name__]
                  regex: "(request_duration_seconds.*|response_duration_seconds.*)"
                  action: keep
    processors:
      batch:
    exporters:
      logging:
    service:
      pipelines:
        metrics:
          receivers: [prometheus]
          exporters: [logging]
      telemetry:
        logs:
          level: debug
          initial_fields:
            service: my-prom-instance
otel-service.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: otel-collector-sa
  namespace: monitoring
The service account is defined with the name otel-collector-sa, but your ClusterRoleBinding links to the service account default.
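A sketch of the fix, pointing the binding at the account the collector pod actually runs as; the 'forbidden' errors above name system:serviceaccount:monitoring:otel-collector-collector, the account the operator generated for the collector:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
  - kind: ServiceAccount
    # was: default; use the account named in the 'forbidden' error
    name: otel-collector-collector
    namespace: monitoring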

Promtail multiline does not merge stacktrace

Promtail, Grafana, and Loki are version 2.4.1, running in Kubernetes.
I was following the documentation.
The exception in the log matches the regular expression. (A zero-width space is at the beginning of each log line.)
The multiline stage is set; see the attached configuration (promtail.yaml).
I was expecting the error stacktrace to end up as a single entry in Grafana/Loki, but every line is a separate entry. Am I missing some config?
# cat /etc/promtail/promtail.yaml
server:
  log_level: info
  http_listen_port: 3101
client:
  url: http://***-loki:3100/loki/api/v1/push
positions:
  filename: /run/promtail/positions.yaml
scrape_configs:
  # See also https://github.com/grafana/loki/blob/master/production/ksonnet/promtail/scrape_config.libsonnet for reference
  - job_name: kubernetes-pods
    pipeline_stages:
      - multiline:
          firstline: ^\x{200B}\[
          max_lines: 128
          max_wait_time: 3s
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels:
          - __meta_kubernetes_pod_controller_name
        regex: ([0-9a-z-.]+?)(-[0-9a-f]{8,10})?
        action: replace
        target_label: __tmp_controller_name
      - source_labels:
          - __meta_kubernetes_pod_label_app_kubernetes_io_name
          - __meta_kubernetes_pod_label_app
          - __tmp_controller_name
          - __meta_kubernetes_pod_name
        regex: ^;*([^;]+)(;.*)?$
        action: replace
        target_label: app
      - source_labels:
          - __meta_kubernetes_pod_label_app_kubernetes_io_component
          - __meta_kubernetes_pod_label_component
        regex: ^;*([^;]+)(;.*)?$
        action: replace
        target_label: component
      - action: replace
        source_labels:
          - __meta_kubernetes_pod_node_name
        target_label: node_name
      - action: replace
        source_labels:
          - __meta_kubernetes_namespace
        target_label: namespace
      - action: replace
        replacement: $1
        separator: /
        source_labels:
          - namespace
          - app
        target_label: job
      - action: replace
        source_labels:
          - __meta_kubernetes_pod_name
        target_label: pod
      - action: replace
        source_labels:
          - __meta_kubernetes_pod_container_name
        target_label: container
      - action: replace
        replacement: /var/log/pods/*$1/*.log
        separator: /
        source_labels:
          - __meta_kubernetes_pod_uid
          - __meta_kubernetes_pod_container_name
        target_label: __path__
      - action: replace
        regex: true/(.*)
        replacement: /var/log/pods/*$1/*.log
        separator: /
        source_labels:
          - __meta_kubernetes_pod_annotationpresent_kubernetes_io_config_hash
          - __meta_kubernetes_pod_annotation_kubernetes_io_config_hash
          - __meta_kubernetes_pod_container_name
        target_label: __path__
It turned out that the logs look different from what we see in Lens pod logs or kubectl logs {pod}.
The original logs consumed by promtail can be found on the host machine:
minikube ssh
cat /var/log/pods/{namespace}_{pod}/{container}/0.log
They look something like this:
{"log":"​[default-nioEventLoopGroup-1-1] INFO HTTP_ACCESS_LOGGER - \"GET /health/readiness HTTP/1.1\" 200 523\n","stream":"stdout","time":"2021-12-17T12:26:29.702621198Z"}
So, the firstline regexp did not match any log line. Unfortunately, there are no errors about this in the promtail logs.
This is the docker log format and there is a pipeline stage to parse this:
- docker: {}
Additionally, there was an issue in the logs: there were extra line breaks in the multiline stacktrace, so this additional pipeline stage filters them out:
- replace:
    expression: '(\n)'
    replace: ''
So my working config looks like this:
server:
  log_level: info
  http_listen_port: 3101
client:
  url: http://***-loki:3100/loki/api/v1/push
positions:
  filename: /run/promtail/positions.yaml
scrape_configs:
  # See also https://github.com/grafana/loki/blob/master/production/ksonnet/promtail/scrape_config.libsonnet for reference
  - job_name: kubernetes-pods
    pipeline_stages:
      - docker: {}
      - multiline:
          firstline: ^\x{200B}\[
          max_lines: 128
          max_wait_time: 3s
      - replace:
          expression: (\n)
          replace: ""
# config continues below (not copied)
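As a quick local check (a sketch, assuming you can reach the raw container log files shown earlier), you can replay one file through Promtail's stdin dry-run mode and confirm the stacktrace now arrives as one merged entry:

# Without the docker stage each raw line starts with '{"log":"...', so
# the firstline regex ^\x{200B}\[ can never match; after - docker: {}
# the line starts with the zero-width space + '[' and multiline applies.
cat /var/log/pods/{namespace}_{pod}/{container}/0.log | \
  promtail --config.file=/etc/promtail/promtail.yaml --stdin --dry-run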

Monitor only one namespace metrics - Prometheus with Kubernetes

I am implementing Prometheus to monitor my Kubernetes system health, where I have multiple clusters and namespaces.
My goal is to monitor only a specific namespace, called default, and just my own pods, excluding the Prometheus pods and monitoring details.
I tried to specify the namespace in the kubernetes_sd_configs like this:
kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
        - 'default'
But I am still getting metrics that I don't need.
Here is my configMap.yml:
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-server-conf
  labels:
    name: prometheus-server-conf
  namespace: default
data:
  prometheus.rules: |-
    groups:
      - name: devopscube demo alert
        rules:
          - alert: High Pod Memory
            expr: sum(container_memory_usage_bytes) > 1
            for: 1m
            labels:
              severity: slack
            annotations:
              summary: High Memory Usage
  prometheus.yml: |-
    global:
      scrape_interval: 5s
      evaluation_interval: 5s
    rule_files:
      - /etc/prometheus/prometheus.rules
    alerting:
      alertmanagers:
        - scheme: http
          static_configs:
            - targets:
                - "alertmanager.monitoring.svc:9093"
    scrape_configs:
      - job_name: 'kubernetes-apiservers'
        kubernetes_sd_configs:
          - role: endpoints
            namespaces:
              names:
                - 'default'
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        relabel_configs:
          - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
            action: keep
            regex: default;kubernetes;https
      - job_name: 'kubernetes-nodes'
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubernetes_sd_configs:
          - role: node
            namespaces:
              names:
                - 'default'
        relabel_configs:
          - action: labelmap
            regex: __meta_kubernetes_node_label_(.+)
          - target_label: __address__
            replacement: kubernetes.default.svc:443
          - source_labels: [__meta_kubernetes_node_name]
            regex: (.+)
            target_label: __metrics_path__
            replacement: /api/v1/nodes/${1}/proxy/metrics
      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
          - role: pod
            namespaces:
              names:
                - 'default'
        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
            action: replace
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
            target_label: __address__
          - action: labelmap
            regex: __meta_kubernetes_pod_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_pod_name]
            action: replace
            target_label: kubernetes_pod_name
      - job_name: 'kube-state-metrics'
        static_configs:
          - targets: ['kube-state-metrics.kube-system.svc.cluster.local:8080']
      - job_name: 'kubernetes-cadvisor'
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        kubernetes_sd_configs:
          - role: node
            namespaces:
              names:
                - 'default'
        relabel_configs:
          - action: labelmap
            regex: __meta_kubernetes_node_label_(.+)
          - target_label: __address__
            replacement: kubernetes.default.svc:443
          - source_labels: [__meta_kubernetes_node_name]
            regex: (.+)
            target_label: __metrics_path__
            replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
      - job_name: 'kubernetes-service-endpoints'
        kubernetes_sd_configs:
          - role: endpoints
            namespaces:
              names:
                - 'default'
        relabel_configs:
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
            action: replace
            target_label: __scheme__
            regex: (https?)
          - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
            action: replace
            target_label: __address__
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
          - action: labelmap
            regex: __meta_kubernetes_service_label_(.+)
          - source_labels: [__meta_kubernetes_namespace]
            action: replace
            target_label: kubernetes_namespace
          - source_labels: [__meta_kubernetes_service_name]
            action: replace
            target_label: kubernetes_name
For example, I don't want details like the ones below to be monitored:
container_memory_rss{beta_kubernetes_io_arch="amd64",beta_kubernetes_io_os="linux",id="/system.slice/kubelet.service",instance="minikube",job="kubernetes-cadvisor",kubernetes_io_arch="amd64",kubernetes_io_hostname="minikube",kubernetes_io_os="linux"}
container_memory_rss{beta_kubernetes_io_arch="amd64",beta_kubernetes_io_os="linux",id="/system.slice/docker.service",instance="minikube",job="kubernetes-cadvisor",kubernetes_io_arch="amd64",kubernetes_io_hostname="minikube",kubernetes_io_os="linux"}
container_memory_rss{beta_kubernetes_io_arch="amd64",beta_kubernetes_io_os="linux",id="/kubepods/podda7b74d8-b611-4dff-885c-70ea40091b7d",instance="minikube",job="kubernetes-cadvisor",kubernetes_io_arch="amd64",kubernetes_io_hostname="minikube",kubernetes_io_os="linux",namespace="kube-system",pod="default-http-backend-59f7ff8999-ktqnl",pod_name="default-http-backend-59f7ff8999-ktqnl"}
If you just want to prevent certain metrics from being ingested (i.e. from being saved in the Prometheus database), you can use metric relabelling to drop them:
- job_name: kubernetes-cadvisor
  metric_relabel_configs:
    - source_labels: [__name__]
      regex: container_memory_rss
      action: drop
Note that in the kubernetes-cadvisor job you use the node service discovery role. This discovers Kubernetes nodes, which are non-namespaced resources, so your namespace restriction to default might not have any effect in this case.
Hey, just found this in the docs:
# Optional namespace discovery. If omitted, all namespaces are used.
namespaces:
  names:
    [ - <string> ]
right under https://prometheus.io/docs/prometheus/latest/configuration/configuration/#ingress
This worked for me:
- job_name: "kubernetes-cadvisor"
  scheme: https
  metrics_path: /metrics/cadvisor
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    # disable certificate verification by uncommenting the line below.
    #
    # insecure_skip_verify: true
  authorization:
    credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  kubernetes_sd_configs:
    - role: node
  relabel_configs:
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)
  metric_relabel_configs:
    - action: keep
      source_labels: [namespace]
      regex: tsb ## the namespace name you want
If you want to scrape metrics from a specific application or service, apply the Prometheus scrape annotations only to the application services you are interested in.
Sample:
apiVersion: apps/v1beta2 # for versions before 1.8.0 use extensions/v1beta1
kind: DaemonSet
metadata:
  name: fluentd-elasticsearch
  namespace: weave
  labels:
    app: fluentd-logging
spec:
  selector:
    matchLabels:
      name: fluentd-elasticsearch
  template:
    metadata:
      labels:
        name: fluentd-elasticsearch
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/port: '9102'
    spec:
      containers:
        - name: fluentd-elasticsearch
          image: gcr.io/google-containers/fluentd-elasticsearch:1.20
Annotations on pods allow you to control whether the metrics should be scraped or not:
prometheus.io/scrape: The default configuration will scrape all pods and, if set to false, this annotation will exclude the pod from the scraping process.
prometheus.io/path: If the metrics path is not /metrics, define it with this annotation.
prometheus.io/port: Scrape the pod on the indicated port instead of the pod’s declared ports (default is a port-free target if none are declared).
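Note that the kubernetes-service-endpoints job in the config above keys on the service annotations (__meta_kubernetes_service_annotation_prometheus_io_*), so for that job the annotations belong on the Service rather than the pod template. A minimal sketch, with my-app as a hypothetical service exposing metrics on port 9102:

apiVersion: v1
kind: Service
metadata:
  name: my-app          # hypothetical service name
  namespace: default
  annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/port: '9102'
spec:
  selector:
    app: my-app
  ports:
    - name: metrics
      port: 9102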
Your configuration works only with Prometheus Operator.

Not able to health check a service outside the cluster in Prometheus

I have set up Prometheus and Blackbox to monitor the Kubernetes cluster, and it is working fine for the in-cluster service monitoring (probe_success is fine for all services), but I am not able to monitor services that are outside the cluster. For example, for google.com it gives probe_success: 0 and probe_http_redirects: 0.
Blackbox targets array is:
{
  "targets": [ "google.com" ],
  "labels": {
    "job": "kubernetes-services",
    "namespace": "default"
  }
}
Job:
- job_name: 'kubernetes-services'
  scheme: http
  metrics_path: /probe
  params:
    module: [http_2xx]
  file_sd_configs:
    - files:
        - /etc/prometheus/blackbox-targets.json
  relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target
      replacement: ${1}
    - source_labels: [__param_target]
      regex: (.*)
      target_label: instance
      replacement: ${1}
    - source_labels: []
      regex: .*
      target_label: __address__
      replacement: mo-blackbox.mo-system:9115
Result:
probe_ip_protocol 6
probe_http_status_code 0
probe_http_content_length 0
probe_http_redirects 0
probe_http_ssl 0
probe_duration_seconds 0.373322
probe_success 0
I want probe_success and probe_http_redirects to be 1. How can I achieve this?
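One way to narrow this down (a sketch, assuming the blackbox exporter address from the config above) is to hit the exporter's /probe endpoint directly with debug output enabled and read the probe log it returns:

# from a pod inside the cluster; debug=true appends the full probe log,
# which shows whether DNS resolution or the outbound HTTP request fails
curl 'http://mo-blackbox.mo-system:9115/probe?target=google.com&module=http_2xx&debug=true'

If DNS resolution fails there, the probe failure is about cluster egress/DNS rather than the Prometheus relabeling.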