I have a service running in a k8s cluster, which I want to monitor using Prometheus Operator. The service has a /metrics endpoint, which returns simple data like:
myapp_first_queue_length 12
myapp_first_queue_processing 2
myapp_first_queue_pending 10
myapp_second_queue_length 4
myapp_second_queue_processing 4
myapp_second_queue_pending 0
The API runs in multiple pods, behind a basic Service object:
apiVersion: v1
kind: Service
metadata:
  name: myapp-api
  labels:
    app: myapp-api
spec:
  ports:
  - port: 80
    name: myapp-api
    targetPort: 80
  selector:
    app: myapp-api
I've installed Prometheus using kube-prometheus, and added a ServiceMonitor object:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: myapp-api
  labels:
    app: myapp-api
spec:
  selector:
    matchLabels:
      app: myapp-api
  endpoints:
  - port: myapp-api
    path: /api/metrics
    interval: 10s
Prometheus discovers all the pods running instances of the API, and I can query those metrics from the Prometheus graph. So far so good.
The issue is, those metrics are aggregate - each API instance/pod doesn't have its own queue, so there's no reason to collect those values from every instance. In fact it seems to invite confusion - if Prometheus collects the same value from 10 pods, it looks like the total value is 10x what it really is, unless you know to apply something like avg.
Is there a way to either tell Prometheus "this value is already aggregate and should always be presented as such" or better yet, tell Prometheus to just scrape the values once via the internal load balancer for that service, rather than hitting each pod?
Edit:
The actual API is just a simple Deployment object:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-api
  labels:
    app: myapp-api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp-api
  template:
    metadata:
      labels:
        app: myapp-api
    spec:
      imagePullSecrets:
      - name: mysecret
      containers:
      - name: myapp-api
        image: myregistry/myapp:2.0
        ports:
        - containerPort: 80
        volumeMounts:
        - name: config
          mountPath: "app/config.yaml"
          subPath: config.yaml
      volumes:
      - name: config
        configMap:
          name: myapp-api-config
In your case, to avoid the duplicated metrics you can either use the avg() operator, as already mentioned in your post, or use a PodMonitor instead of a ServiceMonitor.
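For the avg() route, you can also bake the aggregation into a recording rule so that dashboards query a single pre-aggregated series. A minimal sketch, assuming the metric names from your post (the rule and label names are illustrative, and depending on your Prometheus CR's ruleSelector you may need additional labels for the rule to be picked up):

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: myapp-queue-rules
  labels:
    app: myapp-api
spec:
  groups:
  - name: myapp-queues
    rules:
    # Collapse the identical per-pod samples into a single series per metric.
    - record: myapp:first_queue_length:avg
      expr: avg(myapp_first_queue_length)
    - record: myapp:second_queue_length:avg
      expr: avg(myapp_second_queue_length)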
The PodMonitor custom resource definition (CRD) allows to
declaratively define how a dynamic set of pods should be monitored.
Which pods are selected to be monitored with the desired configuration
is defined using label selections.
This way Prometheus scrapes metrics only from the pods selected by the label selector.
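A PodMonitor for your Deployment could look roughly like this (a sketch; podMetricsEndpoints reference a named container port, so you would also give the containerPort in your Deployment a name such as http):

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: myapp-api
  labels:
    app: myapp-api
spec:
  selector:
    matchLabels:
      app: myapp-api
  podMetricsEndpoints:
  # Assumes the Deployment's containerPort gets `name: http`.
  - port: http
    path: /api/metrics
    interval: 10s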
Prometheus Operator developers are working (as of Jan 2023) on a generic ScrapeConfig CRD that is designed to solve exactly the use case you describe: https://github.com/prometheus-operator/prometheus-operator/issues/2787
In the meantime, you can use the "additional scrape config" facility of Prometheus Operator to set up a custom scrape target.
The idea is that the configured scrape target is hit only once per scrape interval, and the Service load-balances the request to one of the N pods behind it, thus avoiding duplicate metrics.
In particular, you can override the kube-prometheus-stack Helm values as follows:
prometheus:
  prometheusSpec:
    additionalScrapeConfigs:
    - job_name: 'myapp-api-aggregates'
      metrics_path: '/api/metrics'
      scheme: 'http'
      static_configs:
      - targets: ['myapp-api:80']
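If you're on kube-prometheus (jsonnet) rather than the Helm chart, the equivalent (a sketch; the Secret name, key, and target namespace are assumptions) is to put the extra scrape config in a Secret and reference it from the Prometheus CR:

apiVersion: v1
kind: Secret
metadata:
  name: additional-scrape-configs
  namespace: monitoring   # must be the namespace of the Prometheus object
stringData:
  prometheus-additional.yaml: |
    - job_name: 'myapp-api-aggregates'
      metrics_path: '/api/metrics'
      scheme: 'http'
      static_configs:
      - targets: ['myapp-api.default.svc:80']   # adjust the Service namespace
---
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s
  namespace: monitoring
spec:
  # ...rest of the existing spec unchanged...
  additionalScrapeConfigs:
    name: additional-scrape-configs
    key: prometheus-additional.yaml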
I'm configuring Traefik Proxy to run on a GKE cluster to handle proxying to various microservices. I'm doing everything through their CRDs and deployed Traefik to the cluster using a custom deployment. The Traefik dashboard is accessible and working fine; however, when I try to set up an IngressRoute for the service itself, the service is not accessible and it does not appear in the dashboard. I've tried setting it up with a regular k8s Ingress object, and when doing that it did appear in the dashboard, but I ran into some issues with middleware, and for ease of use I'd prefer to go the CRD route. Also, the deployment and service for the microservice seem to be deploying fine; they both appear in the GKE dashboard and are running normally. No Ingress is created, though I'm unsure whether a custom CRD IngressRoute is supposed to create one or not.
Some information about the configuration:
I'm using Kustomize to handle overlays and general data
I have a setting through kustomize to apply the namespace users to everything
Below are the config files I'm using, and the CRDs and RBAC are defined by calling
kubectl apply -f https://raw.githubusercontent.com/traefik/traefik/v2.9/docs/content/reference/dynamic-configuration/kubernetes-crd-definition-v1.yml
kubectl apply -f https://raw.githubusercontent.com/traefik/traefik/v2.9/docs/content/reference/dynamic-configuration/kubernetes-crd-rbac.yml
deployment.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: users-service
spec:
  replicas: 1
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: users-service
    spec:
      containers:
      - name: users-service
        image: ${IMAGE}
        imagePullPolicy: IfNotPresent
        ports:
        - name: web
          containerPort: ${HTTP_PORT}
        readinessProbe:
          httpGet:
            path: /ready
            port: web
          initialDelaySeconds: 10
          periodSeconds: 2
        envFrom:
        - secretRef:
            name: users-service-env-secrets
service.yml
apiVersion: v1
kind: Service
metadata:
  name: users-service
spec:
  ports:
  - name: web
    protocol: TCP
    port: 80
    targetPort: web
  selector:
    app: users-service
ingress.yml
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
  name: users-stripprefix
spec:
  stripPrefix:
    prefixes:
    - /userssrv
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: users-service-ingress
spec:
  entryPoints:
  - service-port
  routes:
  - kind: Rule
    match: PathPrefix(`/userssrv`)
    services:
    - name: users-service
      namespace: users
      port: service-port
    middlewares:
    - name: users-stripprefix
If any more information is needed, just lmk. Thanks!
A default Traefik installation on Kubernetes creates two entrypoints:
web for http access, and
websecure for https access
But you have in your IngressRoute configuration:
entryPoints:
- service-port
Unless you have explicitly configured Traefik with an entrypoint named "service-port", this is probably your problem. You want to remove the entryPoints section, or specify something like:
entryPoints:
- web
If you omit the entryPoints configuration, the service will be available on all entrypoints. If you include explicit entrypoints, then the service will only be available on those specific entrypoints (e.g. with the above configuration, the service would be available via http:// and not via https://).
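For reference, entrypoint names come from Traefik's static configuration; in a typical custom deployment they are defined as container args, roughly like this (a sketch, your deployment may differ):

# Excerpt from a Traefik Deployment pod spec
containers:
- name: traefik
  image: traefik:v2.9
  args:
  - --providers.kubernetescrd
  - --entrypoints.web.address=:80        # entrypoint named "web"
  - --entrypoints.websecure.address=:443 # entrypoint named "websecure"
  ports:
  - name: web
    containerPort: 80
  - name: websecure
    containerPort: 443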
Not directly related to your problem, but since you're using Kustomize, consider the following (an example kustomization.yaml follows the list):
Drop the app: users-service label from the deployment, the service selector, etc, and instead set that in your kustomization.yaml using the commonLabels directive.
Drop the explicit namespace from the service specification in your IngressRoute and instead use kustomize's namespace transformer to set it (this lets you control the namespace exclusively from your kustomization.yaml).
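For example, a kustomization.yaml along these lines (a sketch based on the file names above) covers both points:

# kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: users           # namespace transformer, applied to every resource
commonLabels:
  app: users-service       # added to labels and to selectors
resources:
- deployment.yml
- service.yml
- ingress.yml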
I've put together a deployable example with all the changes mentioned in this answer here.
Can somebody explain the logic to me, or how I should proceed with the following problem? I have a Prometheus CR with the following ServiceMonitor selector:
Name: k8s
Namespace: monitoring
Labels: prometheus=k8s
Annotations: <none>
API Version: monitoring.coreos.com/v1
Kind: Prometheus
...
Service Monitor Namespace Selector:
Service Monitor Selector:
...
Prometheus is capable of discovering all the ServiceMonitors it created, but it does not discover mine (newly created). Is the configuration above supposed to match everything, or do you know how to accomplish this (that is, to match every single ServiceMonitor)?
Example of my ServiceMonitor:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  namespace: monitoring
  labels:
    # release: prometheus
    # team: frontend
spec:
  selector:
    matchLabels:
      app: example-app
  namespaceSelector:
    # matchNames:
    # - default
    matchNames:
    - e
  endpoints:
  - port: web
Rest of the details
I know that I can discover it with something like this, but it would require changing all of the other monitors:
serviceMonitorSelector:
  matchLabels:
    team: frontend
I don't want to install Prometheus Operator using Helm, so instead I installed it from https://github.com/prometheus-operator/kube-prometheus#warning.
If you just want to discover all ServiceMonitors on a given cluster that Prometheus and the Prometheus Operator have access to (with their respective RBAC), you can use an empty selector like so:
serviceMonitorNamespaceSelector: {}
serviceMonitorSelector: {}
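These fields live under spec of the Prometheus custom resource (the k8s object in the monitoring namespace from your describe output); a sketch of where they go, with the rest of the spec omitted:

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s
  namespace: monitoring
spec:
  # Empty selectors match every ServiceMonitor in every namespace
  # that Prometheus' RBAC allows it to see.
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector: {}
  # ...rest of the existing spec unchanged...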
I'm trying to make Prometheus find metrics from RabbitMQ (and a few other services, but the logic is the same).
My current configuration is:
# RabbitMQ Service
# This lives in the `default` namespace
kind: Service
apiVersion: v1
metadata:
  name: rabbit-metrics-service
  labels:
    name: rabbit-metrics-service
spec:
  ports:
  - protocol: TCP
    port: 15692
    targetPort: 15692
  selector:
    # This selects the deployment and it works
    app: rabbitmq
I then created a ServiceMonitor:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  # The name of the service monitor
  name: rabbit-monitor
  # The namespace it will be in
  namespace: kube-prometheus-stack
  labels:
    # How to find this service monitor
    # The name I should use in `serviceMonitorSelector`
    name: rabbit-monitor
spec:
  endpoints:
  - interval: 5s
    port: metrics
    path: /metrics
  # The namespace of origin service
  namespaceSelector:
    matchNames:
    - default
  selector:
    matchLabels:
      # Where the monitor will attach to
      name: rabbit-metrics-service
kube-prometheus-stack has the following values.yml
# values.yml
prometheusSpec:
  serviceMonitorSelector:
    matchLabels:
      name:
      - rabbit-monitor
So, from what I understand: in the metadata/labels section I define a labelKey/labelValue pair, and then reference that pair in selector/matchLabels. I then add a custom serviceMonitorSelector that will match N labels. If it finds the labels, Prometheus should discover the ServiceMonitor, and hence the metrics endpoint, and start scraping. But I guess there's something wrong with this logic. I tried a few other variations of this as well, but no success.
Any ideas on what I might be doing wrong?
Documentation usually uses the same name everywhere, so I can't quite understand where exactly that name should come from, since I tend to add -service and -deployment suffixes to resources to be able to identify them easily later. I already added the RabbitMQ Prometheus plugin, and the endpoint seems to be working fine.
I have a ready-made Kubernetes cluster with configured grafana + prometheus(operator) monitoring.
I added the following labels to pods with my app:
prometheus.io/scrape: "true"
prometheus.io/path: "/my/app/metrics"
prometheus.io/port: "80"
But the metrics don't get into Prometheus, even though Prometheus has all the default Kubernetes metrics.
What is the problem?
You should create ServiceMonitor or PodMonitor objects.
A ServiceMonitor describes the set of targets to be monitored by Prometheus. The Operator automatically generates the Prometheus scrape configuration based on the definition, and the targets will be the IPs of all the pods behind the service.
Example:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  labels:
    team: frontend
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
  - port: web
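For the ServiceMonitor above to find targets, there must also be a Service whose labels match the selector and which exposes a named port matching port: web; a minimal sketch (the port number is an assumption):

apiVersion: v1
kind: Service
metadata:
  name: example-app
  labels:
    app: example-app   # matched by the ServiceMonitor's selector
spec:
  selector:
    app: example-app   # selects the application pods
  ports:
  - name: web          # referenced by `port: web` in the ServiceMonitor
    port: 8080
    targetPort: 8080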
A PodMonitor declaratively specifies how groups of pods should be monitored. The Operator automatically generates the Prometheus scrape configuration based on the definition.
Example:
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: example-app
  labels:
    team: frontend
spec:
  selector:
    matchLabels:
      app: example-app
  podMetricsEndpoints:
  - port: web
I'm running a 3-node k8s cluster where I have a Prometheus instance that scrapes my services' metrics using objects like ServiceMonitor. In addition to my cluster services' metrics, I would like to get metrics from an Ambari server. I know that it is possible to import Ambari metrics as a data source in Grafana, but I haven't seen anything about Prometheus. My goal is to have those metrics in Prometheus so I can set alerts with Alertmanager. I also saw that it is possible to write a Go program that would expose the metrics in the Prometheus format, but I preferred to ask the community before trying things. :-)
If anyone has had this issue, I would be happy if you can help!
You can: Prometheus allows you to scrape metrics exposed by external services. Obviously they need to expose metrics in the Prometheus format, but it's possible.
You have to create a static Endpoints object and a related (selector-less) Service:
apiVersion: v1
kind: Endpoints
metadata:
  # Must have the same name and namespace as the Service below
  name: ambari-metrics
  namespace: your-monitoring-namespace
  labels:
    app: ambari
subsets:
- addresses:
  - ip: "ambari-external-ip"
  ports:
  - name: metrics
    port: 9100
    protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  name: ambari-metrics
  namespace: your-monitoring-namespace
  labels:
    app: ambari
spec:
  # No selector: the targets come from the manually created Endpoints above
  ports:
  - name: metrics
    port: 9100
    protocol: TCP
    targetPort: 9100
And finally, the ServiceMonitor:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: ambari-metrics-sm
  labels:
    app: ambari
    prometheus: kube-prometheus
spec:
  selector:
    matchLabels:
      app: ambari
  namespaceSelector:
    matchNames:
    - your-monitoring-namespace
  endpoints:
  - port: metrics
    interval: 30s
Obviously, YMMV, and you can find a detailed guide in this article.