Cannot see the target added to service monitor for Prometheus Operator - service

I am trying to add a target to my ServiceMonitor for Prometheus Operator (inside my Terraform, which uses a Helm chart to deploy Prometheus, Prometheus Operator, the ServiceMonitor and a bunch of other stuff).
After I successfully deployed the ServiceMonitor, I cannot see the new target app.kubernetes.io/instance: jobs-manager in Prometheus. I am not sure what I did wrong in my configuration. I am also checking this document to see what is missing but cannot figure it out yet.
Here are some configuration files concerned:
/helm/charts/prometheus-abcd/templates/service_monitor.tpl
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: jobs-manager-servicemonitor
  # Change this to the namespace the Prometheus instance is running in
  namespace: prometheus
  labels:
    app: jobs-manager
    release: prometheus
spec:
  selector:
    matchLabels:
      app.kubernetes.io/instance: jobs-manager # Targets jobs-manager service
  endpoints:
    - port: http
      interval: 15s
/helm/charts/prometheus-abcd/Chart.yaml
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#
apiVersion: v1
appVersion: "1.0.0"
description: Prometheus Service monitor, customized for abcd
name: prometheus-abcd
version: 1.0.0
/terraform/kubernetes/helm_values/prometheus.yaml
prometheus:
  podMetadata:
    annotations:
      container.apparmor.security.beta.kubernetes.io/prometheus-operator: runtime/default
      seccomp.security.alpha.kubernetes.io/pod: runtime/default
  nodeAffinityPreset:
    ## Node affinity type
    ## Allowed values: soft, hard
    ##
    type: "hard"
    ## Node label key to match
    ## E.g.
    ## key: "kubernetes.io/e2e-az-name"
    ##
    key: "cloud.google.com/gke-nodepool"
    ## Node label values to match
    ## E.g.
    ## values:
    ##   - e2e-az1
    ##   - e2e-az2
    ##
    values: [
      "abcd-primary-pool"
    ]
prometheus:
  configMaps:
    - prometheus-config
  ## ServiceMonitors to be selected for target discovery.
  ## If {}, select all ServiceMonitors
  ##
  serviceMonitorSelector: {
    jobs-manager-servicemonitor
  }
  #  matchLabels:
  #    foo: bar
  ## Namespaces to be selected for ServiceMonitor discovery.
  ## See https://github.com/prometheus-operator/prometheus-operator/blob/master/
  ## Documentation/api.md#namespaceselector for usage
  ##
  serviceMonitorNamespaceSelector: {
    matchNames: prometheus
  }
When running this command:
kubectl get -n prometheus prometheuses.monitoring.coreos.com prometheus-kube-prometheus-prometheus
I can see that the service monitor was successfully deployed.
But when I run this command:
kubectl describe -n prometheus prometheuses.monitoring.coreos.com prometheus-kube-prometheus-prometheus
I see that many parameters still have missing values, such as serviceMonitorSelector:
Name:         prometheus-kube-prometheus-prometheus
Namespace:    prometheus
Labels:       app.kubernetes.io/component=prometheus
              app.kubernetes.io/instance=prometheus
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=kube-prometheus
              helm.sh/chart=kube-prometheus-3.4.0
Annotations:  meta.helm.sh/release-name: prometheus
              meta.helm.sh/release-namespace: prometheus
API Version:  monitoring.coreos.com/v1
Kind:         Prometheus
Metadata:
  Creation Timestamp:  2021-05-26T15:19:42Z
  Generation:          1
  Managed Fields:
    API Version:  monitoring.coreos.com/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:meta.helm.sh/release-name:
          f:meta.helm.sh/release-namespace:
        f:labels:
          .:
          f:app.kubernetes.io/component:
          f:app.kubernetes.io/instance:
          f:app.kubernetes.io/managed-by:
          f:app.kubernetes.io/name:
          f:helm.sh/chart:
      f:spec:
        .:
        f:affinity:
          .:
          f:podAntiAffinity:
            .:
            f:preferredDuringSchedulingIgnoredDuringExecution:
        f:alerting:
          .:
          f:alertmanagers:
        f:configMaps:
        f:enableAdminAPI:
        f:externalUrl:
        f:image:
        f:listenLocal:
        f:logFormat:
        f:logLevel:
        f:paused:
        f:podMetadata:
          .:
          f:labels:
            .:
            f:app.kubernetes.io/component:
            f:app.kubernetes.io/instance:
            f:app.kubernetes.io/name:
        f:podMonitorNamespaceSelector:
        f:podMonitorSelector:
        f:probeNamespaceSelector:
        f:probeSelector:
        f:replicas:
        f:retention:
        f:routePrefix:
        f:ruleNamespaceSelector:
        f:ruleSelector:
        f:securityContext:
          .:
          f:fsGroup:
          f:runAsUser:
        f:serviceAccountName:
        f:serviceMonitorNamespaceSelector:
        f:serviceMonitorSelector:
    Manager:         Go-http-client
    Operation:       Update
    Time:            2021-05-26T15:19:42Z
  Resource Version:  11485229
  Self Link:         /apis/monitoring.coreos.com/v1/namespaces/prometheus/prometheuses/prometheus-kube-prometheus-prometheus
  UID:               xxxxxxxxxxxxxxxxxxxx
Spec:
  Affinity:
    Pod Anti Affinity:
      Preferred During Scheduling Ignored During Execution:
        Pod Affinity Term:
          Label Selector:
            Match Labels:
              app.kubernetes.io/component:  prometheus
              app.kubernetes.io/instance:   prometheus
              app.kubernetes.io/name:       kube-prometheus
          Namespaces:
            prometheus
          Topology Key:  kubernetes.io/hostname
        Weight:          1
  Alerting:
    Alertmanagers:
      Name:         prometheus-kube-prometheus-alertmanager
      Namespace:    prometheus
      Path Prefix:  /
      Port:         http
  Config Maps:
    prometheus-config
  Enable Admin API:  false
  External URL:      http://prometheus-kube-prometheus-prometheus.prometheus:9090/
  Image:             docker.io/bitnami/prometheus:2.24.0-debian-10-r1
  Listen Local:      false
  Log Format:        logfmt
  Log Level:         info
  Paused:            false
  Pod Metadata:
    Labels:
      app.kubernetes.io/component:  prometheus
      app.kubernetes.io/instance:   prometheus
      app.kubernetes.io/name:       kube-prometheus
  Pod Monitor Namespace Selector:
  Pod Monitor Selector:
  Probe Namespace Selector:
  Probe Selector:
  Replicas:      1
  Retention:     10d
  Route Prefix:  /
  Rule Namespace Selector:
  Rule Selector:
  Security Context:
    Fs Group:     1001
    Run As User:  1001
  Service Account Name:  prometheus-kube-prometheus-prometheus
  Service Monitor Namespace Selector:
  Service Monitor Selector:
Events:  <none>
This is why I checked this document to get the templates for serviceMonitorSelector and serviceMonitorNamespaceSelector and added them to the prometheus.yaml file above, but I am not sure whether they are added correctly.
If anyone has experience setting up a ServiceMonitor with Helm and Terraform, could you please help me check what I did wrong? Thank you in advance.

The way you have passed the values in prometheus.yaml is wrong.
serviceMonitorNamespaceSelector: {
  matchNames: prometheus
}  # this is the wrong way
You should set the values like this instead:
serviceMonitorNamespaceSelector:
  matchLabels:
    prometheus: somelabel
The same applies to
serviceMonitorSelector: {
  jobs-manager-servicemonitor
}
which is also not set the proper way.
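As a rough sketch of what the two fields could look like (assuming the Bitnami kube-prometheus values layout used in the question, and that the ServiceMonitor keeps the release: prometheus label from its template):
prometheus:
  ## Selects ServiceMonitors by their own labels (not by name); these labels
  ## must exist on the ServiceMonitor object, e.g. the ones in the template above.
  serviceMonitorSelector:
    matchLabels:
      release: prometheus
  ## This is a label selector over namespaces; {} selects all namespaces.
  ## matchNames belongs only in a ServiceMonitor's spec.namespaceSelector.
  serviceMonitorNamespaceSelector: {}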
For reference, please check: https://github.com/prometheus-community/helm-charts/blob/83aa113f52e5f45fd04b4dd909172a6da1826592/charts/kube-prometheus-stack/values.yaml#L2034
Check out this nice example: https://rtfm.co.ua/en/kubernetes-a-clusters-monitoring-with-the-prometheus-operator/
Prometheus Operator with Terraform & Helm: https://github.com/OpenQAI/terraform-helm-release-prometheus-operator

Related

kube-state-metrics not sending metrics using service monitor

I have deployed kube-state-metrics into the kube-system namespace, and prometheus-operator is running in the same cluster. I've written the ServiceMonitor below to send metrics to Prometheus, but it is not working. Please find the files below.
Servicemonitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kube-state-metrics
  labels:
    app.kubernetes.io/name: kube-state-metrics
  namespace: kube-system
spec:
  selector:
    matchLabels:
      prometheus-scrape: "true"
  endpoints:
    - port: metrics
      path: /metrics
      targetPort: 8080
      honorLabels: true
      scheme: https
      tlsConfig:
        insecureSkipVerify: true
Prometheus-deploy.yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  annotations:
    argocd.argoproj.io/sync-wave: "1"
  name: prometheus
  labels:
    name: prometheus
spec:
  serviceAccountName: prometheus
  serviceMonitorSelector: {}
  serviceMonitorNamespaceSelector:
    matchLabels:
      prometheus-scrape: "true"
  podMonitorSelector: {}
  podMonitorNamespaceSelector:
    matchLabels:
      prometheus-scrape: "true"
  resources:
    requests:
      memory: 400Mi
  enableAdminAPI: false
  additionalScrapeConfigs:
    name: additional-scrape-configs
    key: prometheus-additional.yaml
Can anyone please help me out with this issue?
Thanks.
A ServiceMonitor's spec.selector.matchLabels must match the labels on the Service itself. Check whether your Service has the correct label.
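For illustration only, here is a hypothetical Service that the ServiceMonitor above would select; the name and ports are assumptions, and the important parts are the prometheus-scrape: "true" label and a port named metrics:
apiVersion: v1
kind: Service
metadata:
  name: kube-state-metrics        # hypothetical name
  namespace: kube-system
  labels:
    prometheus-scrape: "true"     # must match spec.selector.matchLabels in the ServiceMonitor
spec:
  selector:
    app.kubernetes.io/name: kube-state-metrics
  ports:
    - name: metrics               # must match endpoints[].port in the ServiceMonitor
      port: 8080
      targetPort: 8080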

ServiceMonitors discovery (Label Selector and Namespace Selector) in Prometheus CR specification

Can somebody explain the logic to me, or how I should proceed with the following problem? I have a Prometheus CR with the following ServiceMonitor selector.
Name:         k8s
Namespace:    monitoring
Labels:       prometheus=k8s
Annotations:  <none>
API Version:  monitoring.coreos.com/v1
Kind:         Prometheus
...
  Service Monitor Namespace Selector:
  Service Monitor Selector:
...
Prometheus is capable of discovering all the ServiceMonitors it created itself, but it does not discover mine (newly created). Is the configuration above supposed to match everything, or do you know how to accomplish this (that is, to match every single ServiceMonitor)?
Here is an example of my ServiceMonitor:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  namespace: monitoring
  labels:
    # release: prometheus
    # team: frontend
spec:
  selector:
    matchLabels:
      app: example-app
  namespaceSelector:
    # matchNames:
    # - default
    matchNames:
      - e
  endpoints:
    - port: web
Rest of the details:
I know that I can discover it with something like the following, but this would require a change in all of the other monitors.
serviceMonitorSelector:
  matchLabels:
    team: frontend
I don't want to install Prometheus Operator using Helm, so instead I installed it from https://github.com/prometheus-operator/kube-prometheus#warning.
If you just want to discover all ServiceMonitors on a given cluster that Prometheus and Prometheus Operator have access to with their respective RBAC, you can use an empty selector like so:
serviceMonitorNamespaceSelector: {}
serviceMonitorSelector: {}
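For context, a minimal sketch of how that looks inside the Prometheus CR itself (the metadata is taken from the describe output above; the comments reflect the Kubernetes label-selector convention used by monitoring.coreos.com/v1):
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s
  namespace: monitoring
spec:
  # {} matches all objects; leaving a selector unset (null) matches none,
  # and an unset namespace selector limits discovery to Prometheus' own namespace.
  serviceMonitorSelector: {}
  serviceMonitorNamespaceSelector: {}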

Exposing Traefik dashboard

I know there are many similar posts. But given the slight differences between everyone's environment, I have not found a solution that has worked for me. I am trying to access the Traefik dashboard running on bare metal (pi cluster) k3s. I am using the default LB in k3s.
Other ingress provider resources have worked, like the ingress to the Pihole dashboard for example. When I try to access the dashboard via https://www.traefik.localhost/dashboard/ I get an "unable to connect" error. I have traefik.localhost in /etc/hosts pointing to one of the LB ingress IPs, in this case .104.
In theory, I think the request should be gobbled up by the LB service on the respective node and forwarded to the Traefik service, if the entrypoint (80 in this case) is open. The Traefik service should look at the available providers, find the IngressRoute I've made, match the hostname, and then forward the request to the service api@internal. I do not know how to check whether that service is running properly or not, which would be the last step in my debugging process if I knew how.
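One way to check whether the dashboard service itself is serving is to bypass the LB and IngressRoute entirely (a sketch, assuming the chart's default internal "traefik" entrypoint on container port 9000; adjust the names to your deployment):
kubectl -n kube-system port-forward deploy/traefik 9000:9000
# then browse to http://localhost:9000/dashboard/ (the trailing slash matters)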
Here is the Traefik service:
kubectl describe service -n kube-system traefik
Name:                     traefik
Namespace:                kube-system
Labels:                   app.kubernetes.io/instance=traefik
                          app.kubernetes.io/managed-by=Helm
                          app.kubernetes.io/name=traefik
                          helm.sh/chart=traefik-10.3.0
Annotations:              meta.helm.sh/release-name: traefik
                          meta.helm.sh/release-namespace: kube-system
Selector:                 app.kubernetes.io/instance=traefik,app.kubernetes.io/name=traefik
Type:                     LoadBalancer
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       10.43.226.223
IPs:                      10.43.226.223
LoadBalancer Ingress:     192.168.4.101, 192.168.4.102, 192.168.4.103, 192.168.4.104, 192.168.4.105
Port:                     web  80/TCP
TargetPort:               web/TCP
NodePort:                 web  30690/TCP
Endpoints:                10.42.4.88:8000
Port:                     websecure  443/TCP
TargetPort:               websecure/TCP
NodePort:                 websecure  30328/TCP
Endpoints:                10.42.4.88:8443
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>
Here is the IngressRoute:
kubectl describe ingressRoute -n kube-system dashboard
Name:         dashboard
Namespace:    kube-system
Labels:       <none>
Annotations:  <none>
API Version:  traefik.containo.us/v1alpha1
Kind:         IngressRoute
Metadata:
  Creation Timestamp:  2022-01-18T03:42:49Z
  Generation:          9
  Managed Fields:
    API Version:  traefik.containo.us/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubectl.kubernetes.io/last-applied-configuration:
      f:spec:
        .:
        f:entryPoints:
        f:routes:
    Manager:         kubectl-client-side-apply
    Operation:       Update
    Time:            2022-01-23T16:46:30Z
  Resource Version:  628002
  UID:               b96eb707-b1a9-4a6c-b94f-a8b975b4120b
Spec:
  Entry Points:
    web
  Routes:
    Kind:   Rule
    Match:  Host(`traefik.localhost`) && PathPrefix(`/`)
    Services:
      Kind:  TraefikService
      Name:  api@internal
Events:  <none>
Dynamic config:
cat traefik.yaml
---
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: traefik-crd
  namespace: kube-system
spec:
  chart: https://%{KUBERNETES_API}%/static/charts/traefik-crd-10.3.0.tgz
---
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: traefik
  namespace: kube-system
spec:
  chart: https://%{KUBERNETES_API}%/static/charts/traefik-10.3.0.tgz
  api:
    insecure: true
  set:
    global.systemDefaultRegistry: ""
  valuesContent: |-
    rbac:
      enabled: true
    ports:
      websecure:
        tls:
          enabled: true
    podAnnotations:
      prometheus.io/port: "8082"
      prometheus.io/scrape: "true"
    providers:
      kubernetesCRD:
      kubernetesIngress:
        publishedService:
          enabled: true
    priorityClassName: "system-cluster-critical"
    image:
      name: "rancher/mirrored-library-traefik"
    tolerations:
      - key: "CriticalAddonsOnly"
        operator: "Exists"
      - key: "node-role.kubernetes.io/control-plane"
        operator: "Exists"
        effect: "NoSchedule"
      - key: "node-role.kubernetes.io/master"
        operator: "Exists"
        effect: "NoSchedule"
helm chart config to modify the above helm chart:
cat traefik-config.yaml
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: traefik
  namespace: kube-system
spec:
  api:
    insecure: true
    dashboard: true
What can I try to solve this?
I also have k3s with the preinstalled Traefik and the default load balancer, and this works for me.
If you implement this, you only need to type traefik.localhost in your browser and it should get you to your dashboard. There is no need to add /dashboard to your URL.
The middleware will convert your HTTP request into an HTTPS request.
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
  name: redirectscheme
  namespace: default
spec:
  redirectScheme:
    scheme: https
    permanent: true
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: traefik-dash-http
  namespace: default
spec:
  entryPoints:
    - web
  routes:
    - match: Host(`traefik.localhost`)
      kind: Rule
      services:
        - name: api@internal
          kind: TraefikService
      middlewares:
        - name: redirectscheme
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: traefik-dash-https
  namespace: default
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`traefik.localhost`)
      kind: Rule
      services:
        - name: api@internal
          kind: TraefikService

monitoring.coreos.com/v1 servicemonitor resource name may not be empty

I am trying to follow these instructions to monitor my Prometheus:
https://logiq.ai/scraping-nginx-ingress-controller-metrics-using-helm-prometheus/
Anyhow, I get an error when trying to apply this configuration file:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: kubernetes-ingress
    name: service-monitor
    namespace: nginx-ingress
spec:
  endpoints:
    - interval: 30s
      path: /metrics
      port: prometheus
  namespaceSelector:
    matchNames:
      - logiq
  selector:
    matchLabels:
      app: kubernetes-ingress
This is the error:
error: error when retrieving current configuration of:
Resource: "monitoring.coreos.com/v1, Resource=servicemonitors", GroupVersionKind: "monitoring.coreos.com/v1, Kind=ServiceMonitor"
Name: "", Namespace: "default"
from server for: "servicemonitor.yaml": resource name may not be empty
I thought it was about the CRD, but monitoring.coreos.com is installed.
Thank you in advance.
This is my prometheus-kube Prometheus CR:
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  annotations:
    meta.helm.sh/release-name: prometheus
    meta.helm.sh/release-namespace: ingress
  creationTimestamp: "2022-01-17T03:09:49Z"
  generation: 1
  labels:
    app: kube-prometheus-stack-prometheus
    app.kubernetes.io/managed-by: Helm
    chart: kube-prometheus-stack-10.1.3
    heritage: Helm
    release: prometheus
  name: prometheus-kube-prometheus-prometheus
  namespace: ingress
  resourceVersion: "2311107"
  uid: 48a57afb-2d9a-4f9f-9885-33ca66c59b16
spec:
  alerting:
    alertmanagers:
      - apiVersion: v2
        name: prometheus-kube-prometheus-alertmanager
        namespace: ingress
        pathPrefix: /
        port: web
  baseImage: quay.io/prometheus/prometheus
  enableAdminAPI: false
  externalUrl: http://prometheus-kube-prometheus-prometheus.ingress:9090
  listenLocal: false
  logFormat: logfmt
  logLevel: info
  paused: false
  podMonitorNamespaceSelector: {}
  podMonitorSelector:
    matchLabels:
      release: prometheus
  portName: web
  probeNamespaceSelector: {}
  probeSelector:
    matchLabels:
      release: prometheus
  replicas: 1
  retention: 10d
  routePrefix: /
  ruleNamespaceSelector: {}
  ruleSelector:
    matchLabels:
      app: kube-prometheus-stack
      release: prometheus
  securityContext:
    fsGroup: 2000
    runAsGroup: 2000
    runAsNonRoot: true
    runAsUser: 1000
  serviceAccountName: prometheus-kube-prometheus-prometheus
  serviceMonitorNamespaceSelector: {}
  serviceMonitorSelector:
    matchLabels:
      release: prometheus
  version: v2.21.0
For K8s resources, metadata.name is a required field. You must provide metadata.name in the resource YAML before applying it.
In the case of metadata.namespace, if you don't provide it, it defaults to the default namespace.
I think you have some unwanted leading spaces before the name and namespace fields.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: service-monitor
  namespace: nginx-ingress
  labels:
    app: kubernetes-ingress
spec:
  endpoints:
    - interval: 30s
      path: /metrics
      port: prometheus
  namespaceSelector:
    matchNames:
      - logiq
  selector:
    matchLabels:
      app: kubernetes-ingress
Update:
In your Prometheus CR, you have serviceMonitorSelector set.
spec:
  serviceMonitorSelector:
    matchLabels:
      release: prometheus
Add these labels to your serviceMonitor CR.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: service-monitor
  namespace: nginx-ingress
  labels:
    app: kubernetes-ingress
    release: prometheus
Or, you can also update serviceMonitorSelector from the Prometheus CR side.
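If you go that route and the Prometheus CR is managed by the kube-prometheus-stack chart (as it is here), the selector is easier to relax through the Helm values than by editing the CR directly. A sketch, with the caveat that you should verify the flag against your chart version:
prometheus:
  prometheusSpec:
    # Stop the chart from injecting the release-label selector when the selector is empty...
    serviceMonitorSelectorNilUsesHelmValues: false
    # ...and select every ServiceMonitor in every namespace.
    serviceMonitorSelector: {}
    serviceMonitorNamespaceSelector: {}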

Prometheus custom metric does not appear in custom.metrics kubernetes

I configured all of the following, but request_per_second does not appear when I run the command
kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1
In the Node.js app that should be monitored I installed prom-client; I tested /metrics and it works very well, and the metric request_count is the object it returns.
Here are the important parts of that Node code:
(...)
const counter = new client.Counter({
  name: 'request_count',
  help: 'The total number of processed requests'
});
(...)
router.get('/metrics', async (req, res) => {
  res.set('Content-Type', client.register.contentType)
  res.end(await client.register.metrics())
})
This is my service monitor configuration
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: un1qnx-validation-service-monitor-node
  namespace: default
  labels:
    app: node-request-persistence
    release: prometheus
spec:
  selector:
    matchLabels:
      app: node-request-persistence
  endpoints:
    - interval: 5s
      path: /metrics
      port: "80"
      bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
  namespaceSelector:
    matchNames:
      - un1qnx-aks-development
This is the node-request-persistence configuration:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: node-request-persistence
  namespace: un1qnx-aks-development
  name: node-request-persistence
spec:
  selector:
    matchLabels:
      app: node-request-persistence
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/path: /metrics
        prometheus.io/port: "80"
      labels:
        app: node-request-persistence
    spec:
      containers:
        - name: node-request-persistence
          image: node-request-persistence
          imagePullPolicy: Always # IfNotPresent
          resources:
            requests:
              memory: "200Mi" # Gi
              cpu: "100m"
            limits:
              memory: "400Mi"
              cpu: "500m"
          ports:
            - name: node-port
              containerPort: 80
This is the prometheus adapter
prometheus:
  url: http://prometheus-server.default.svc.cluster.local
  port: 9090
rules:
  custom:
    - seriesQuery: 'request_count{namespace!="", pod!=""}'
      resources:
        overrides:
          namespace: {resource: "namespace"}
          pod: {resource: "pod"}
      name:
        as: "request_per_second"
      metricsQuery: "round(avg(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>))"
This is the hpa
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: un1qnx-validation-service-hpa-angle
  namespace: un1qnx-aks-development
spec:
  minReplicas: 1
  maxReplicas: 10
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: un1qnx-validation-service-angle
  metrics:
    - type: Pods
      pods:
        metric:
          name: request_per_second
        target:
          type: AverageValue
          averageValue: "5"
The command
kubectl get hpa -n un1qnx-aks-development
results in "unknown/5"
Also, the command
kubectl get --raw "http://prometheus-server.default.svc.cluster.local:9090/api/v1/series"
Results in
Error from server (NotFound): the server could not find the requested resource
I think it should return some value about the collected metrics... I think the problem is with the ServiceMonitor, but I am new to this.
As you may have noticed, I am trying to scale a deployment based on another deployment's pods; I don't know if there is a problem there.
I would appreciate an answer, because this is for my thesis.
kubernetes - version 1.19.9
Prometheus - chart prometheus-14.2.1 app version 2.26.0
Prometheus Adapter - chart 2.14.2 app version 0.8.4
And all were installed using Helm.
After some time I found the problems and changed the following:
I changed the port on the Prometheus Adapter, the time window in the query, and the names of the resource overrides. To know the names of the resource overrides, you need to port-forward to the Prometheus server and check the labels on the Targets page for the app that you are monitoring.
prometheus:
  url: http://prometheus-server.default.svc.cluster.local
  port: 80
rules:
  custom:
    - seriesQuery: 'request_count{kubernetes_namespace!="", kubernetes_pod_name!=""}'
      resources:
        overrides:
          kubernetes_namespace: {resource: "namespace"}
          kubernetes_pod_name: {resource: "pod"}
      name:
        matches: "request_count"
        as: "request_count"
      metricsQuery: "round(avg(rate(<<.Series>>{<<.LabelMatchers>>}[5m])) by (<<.GroupBy>>))"
I also added annotations to the Deployment YAML:
spec:
  selector:
    matchLabels:
      app: node-request-persistence
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/path: /metrics
        prometheus.io/port: "80"
      labels:
        app: node-request-persistence
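Once the adapter picks up the new rule, the metric can be checked directly against the custom metrics API (a hypothetical spot check; substitute your own namespace and metric name):
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/un1qnx-aks-development/pods/*/request_count" | jq .
If this returns items, the HPA should switch from unknown/5 to a real value on its next sync.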