Is there a way to specify a specific port using Prometheus relabel_configs?
I've deployed a Prometheus helm chart to my Kubernetes cluster, and it works fine, except for a small issue.
kubernetes-service-endpoints (6/8 up)
Looking at the endpoints that are down, I have narrowed the issue to this block of my Prometheus configuration.
- source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
  action: replace
  target_label: __address__
  regex: ([^:]+)(?::\d+)?;(\d+)
  replacement: $1:$2
- action: labelmap
  regex: __meta_kubernetes_service_label_(.+)
It looks like if the prometheus.io/port annotation is declared in the service definition, the declared port replaces the port in __address__.
My cluster is deployed using Kops, and the kube-dns service appears to have these annotations out of the box:
prometheus.io/port: "9153"
prometheus.io/scrape: "true"
But on the backend pods, it's prometheus.io/port: 10055.
Because of this particular relabel_configs block, port 10055 is being replaced with 9153, and I get the error:
Get "http://100.119.59.4:9153/metrics": dial tcp 100.119.59.4:9153: connect: connection refused
Is there a way to get Prometheus to use port 10055 instead of 9153?
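One possible approach, sketched under assumptions and not verified on this cluster: with role: endpoints, Prometheus also attaches the pod meta labels to endpoints that are backed by a pod, so a second rule placed after the service-port rule could let a pod-level prometheus.io/port annotation override the service-level one. This assumes the 10055 value really is set as an annotation on the backing pods.

```yaml
relabel_configs:
  # Existing rule from the question: take the port from the Service
  # annotation (9153 for kube-dns).
  - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
    action: replace
    target_label: __address__
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
  # Assumed addition: if the backing pod carries its own prometheus.io/port
  # annotation, let it override the Service value. When the annotation is
  # absent, the joined source value ends in ";" with no digits, the regex
  # does not match, and __address__ is left unchanged.
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    target_label: __address__
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
```

With the values from the question, the first rule would rewrite the address to 100.119.59.4:9153 and the second would then rewrite it to 100.119.59.4:10055. Rule order matters here, because relabel rules are applied in sequence.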
Related
I configured Prometheus on one of the Kubernetes cluster nodes using [this][1], then added the following prometheus.yml file. I can list nodes and apiservers, but all the pods show as down with the error:
Get "https://xx.xx.xx:443/metrics": dial tcp xx.xx.xx:443: connect: connection refused
For some pods the status is unknown.
Can someone point out what I am doing wrong here?
cat prometheus.yml
global:
  scrape_interval: 1m
scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9090']
  # metrics for default/kubernetes api's from the kubernetes master
  - job_name: 'kubernetes-pods'
    kubernetes_sd_configs:
      - role: pod
        bearer_token_file: /dfgdjk/token
        api_server: https://masterapi.com:3343
        tls_config:
          insecure_skip_verify: true
    tls_config:
      insecure_skip_verify: true
    bearer_token_file: /dfgdjk/token
    scheme: https
    relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_pod_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_pod_name]
        action: replace
        target_label: kubernetes_pod_name
  # metrics for default/kubernetes api's from the kubernetes master
  - job_name: 'kubernetes-apiservers'
    kubernetes_sd_configs:
      - role: endpoints
        api_server: https://masterapi.com:3343
        bearer_token_file: /dfgdjk/token
        tls_config:
          insecure_skip_verify: true
    tls_config:
      insecure_skip_verify: true
    bearer_token_file: /dfgdjk/token
    scheme: https
    relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https
[1]: https://devopscube.com/install-configure-prometheus-linux/
It is not possible to get pod metrics into an external Prometheus server without some Prometheus component running inside the Kubernetes cluster. The cluster network is isolated from the host's network, so pods cannot be scraped directly from outside the cluster.
Please refer to the GitHub issue "Monitoring kubernetes with prometheus from outside of k8s cluster".
There are a few options:
install Prometheus inside the cluster, using the Prometheus Operator or manually - example
use a proxy solution, for example this one from the same thread on GitHub - k8s-prometheus-proxy
on top of the Prometheus installed within the cluster, run an external Prometheus in federation so that all metrics are saved outside of the cluster. Please refer to prometheus federation.
It is also important to install kube-state-metrics in the Kubernetes cluster. How to set it up.
Edit: you can also refer to another SO question/answer, which confirms that this works only with additional steps; the OP there resolved it with another proxy solution.
I have a Kubernetes cluster (built with the Typhoon module) and a Prometheus instance in a different VPC (running on docker-compose, not on the Kubernetes cluster). VPC peering is enabled and the required ports are open to this VPC. All metrics are scraped as expected except for the coredns pods. The issue is that the coredns pods are assigned 10.2.*.* IPs, which are outside the IP range I configured for pods.
If a coredns pod got a 172.*.*.* IP, my Prometheus would be able to reach it and the scrape would succeed.
Now, I'm not sure how to scrape these metrics. Please let me know if you know what I am doing wrong.
$ kubectl get pods -n kube-system -o wide | grep coredns
coredns-7d8995c4cd-4l4ft 1/1 Running 1 7d1h 10.2.5.2 ip-172-*-*-* <none> <none>
coredns-7d8995c4cd-vxd9d 1/1 Running 1 6d3h 10.2.3.9 ip-172-*-*-* <none> <none>
Prometheus.yml file is configured with the below job.
- job_name: 'kubernetes-service-endpoints'
  kubernetes_sd_configs:
    - role: endpoints
      api_server: https://kubernetes-cluster:6443
      tls_config:
        insecure_skip_verify: true
      bearer_token: "TOKEN"
  bearer_token: "TOKEN"
  honor_labels: true
  relabel_configs:
    - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
      action: keep
      regex: true
    - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
      action: replace
      target_label: __scheme__
      regex: (https?)
    - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
      action: replace
      target_label: __metrics_path__
      regex: (.+)
    - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
      action: replace
      target_label: __address__
      regex: ([^:]+)(?::\d+)?;(\d+)
      replacement: $1:$2
    - action: labelmap
      regex: __meta_kubernetes_service_label_(.+)
    - source_labels: [__meta_kubernetes_namespace]
      action: replace
      target_label: namespace
    - source_labels: [__meta_kubernetes_pod_name]
      action: replace
      target_label: pod
    - source_labels: [__meta_kubernetes_service_name]
      action: replace
      target_label: job
  metric_relabel_configs:
    - source_labels: [__name__]
      action: drop
      regex: etcd_(debugging|disk|request|server).*
P.S: I'm using Flannel as my network CNI so that I get the pods created with the IP of the host network itself.
Updated Info:
I tried deploying Prometheus on Kubernetes and federating this data to my Prometheus running in Docker, as suggested by Yaron.
I'm trying the below config for the federation but not seeing any metrics loaded to my target prometheus.
- job_name: 'federate'
  scrape_interval: 10s
  honor_labels: true
  metrics_path: '/federate'
  params:
    'match[]':
      - '{job="prometheus"}'
      - '{job="kubernetes-nodes"}'
      - '{job="kubernetes-apiservers"}'
      - '{job="kubernetes-service-endpoints"}'
      - '{job="kubernetes-cadvisor"}'
      - '{job="kubelet"}'
      - '{job="etcd"}'
      - '{job="kubernetes-services"}'
      - '{job="kubernetes-pods"}'
  scheme: https
  static_configs:
    - targets:
        - prom.mycompany.com
The best practice for solving this issue is running a prometheus instance inside the cluster running Coredns, and federating the metrics scraped by that prometheus into your external prometheus running with docker-compose.
You can read more about federation here, to get an idea of how to start leveraging it.
A more advanced use case would be using Thanos to better distribute queries across your different prometheus servers, but the main point remains running an internal prometheus server within each of your clusters.
I'm trying to set up monitoring for our Kubernetes cluster, but it's not that easy. At first I tried to scrape all metrics from a dedicated VM, following configs I found on the Internet and on prometheus.io, but I read several times that this is not the best way to do it. I found a suggestion to use kube-state-metrics. That's done: the pod is running and its metrics are reachable from outside (Azure infra), so http://xxx.xxx.xxx.xxx:8080/metrics shows me a correct result.
When I add this to the config:
- job_name: 'Kubernetes-Nodes'
  scheme: http
  #tls_config:
    #insecure_skip_verify: true
  kubernetes_sd_configs:
    - api_server: 'http://xxx.xxx.xxx.xxx:8080'
      role: endpoints
      namespaces:
        names: [default]
      #tls_config:
        #insecure_skip_verify: true
      bearer_token: %VERYLONGLINE%
  relabel_configs:
    - action: labelmap
      regex: __meta_kubernetes_service_label_(.+)
    - source_labels: [__meta_kubernetes_namespace]
      action: replace
      target_label: kubernetes_namespace
    - source_labels: [__meta_kubernetes_service_name]
      action: replace
      target_label: kubernetes_name
The log I can find is:
Sep 25 06:53:59 monitoring001 prometheus[59005]: level=error ts=2018-09-25T06:53:59.636669498Z caller=main.go:234 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:288: Failed to list *v1.Pod: serializer for text/html; charset=utf-8 doesn't exist"
Does anyone have an idea?
Thank you,
Finally found the issue! My Prometheus is located on a dedicated VM outside the Kubernetes cluster.
kube-state-metrics exposes its metrics on an IP reachable from outside the cluster, so there is no need to discover it as a Kubernetes object; it only needs to be scraped as a simple static target.
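A minimal sketch of what such a plain target could look like, assuming the same address and port as in the question. The job name is a hypothetical label chosen for illustration, and no kubernetes_sd_configs block or bearer token is involved:

```yaml
scrape_configs:
  # Hypothetical job name; any descriptive name works.
  - job_name: 'kube-state-metrics'
    scheme: http
    metrics_path: /metrics
    static_configs:
      # Address from the question: the externally reachable
      # kube-state-metrics endpoint.
      - targets: ['xxx.xxx.xxx.xxx:8080']
```

This likely also explains the "serializer for text/html" error in the log: the api_server setting pointed Kubernetes discovery at the kube-state-metrics endpoint rather than at the Kubernetes API server.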
How can I have my Kubernetes service name set as the Prometheus job name automatically? In other words, is there a way for a new service created in K8s to become a target in the Prometheus configuration automatically? In Kubernetes, I would like to deploy my application as a set of services.
For every service there could be more than one pod associated.
The mapping could look like this:
Kubernetes services to Prometheus Jobs
K8s Pods to instances in Prometheus Job
But I really don't know if this is feasible with some configuration changes in Prometheus. Please correct me if I am wrong anywhere.
If this is not possible, do I need to explicitly create a Prometheus job in the Prometheus configuration file before every deployment?
You will typically want metrics per pod, as you would normally have when using regular nodes instead of containers/pods.
Using this Prometheus configuration, you will automatically get a target for every pod that's running on the cluster. This is the important part:
# Example scrape config for pods
#
# The relabeling allows the actual pod scrape endpoint to be configured via the
# following annotations:
#
# * `prometheus.io/scrape`: Only scrape pods that have a value of `true`
# * `prometheus.io/path`: If the metrics path is not `/metrics` override this.
# * `prometheus.io/port`: Scrape the pod on the indicated port instead of the
#   pod's declared ports (default is a port-free target if none are declared).
- job_name: 'kubernetes-pods'
  kubernetes_sd_configs:
    - role: pod
  relabel_configs:
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
      action: keep
      regex: true
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
      action: replace
      target_label: __metrics_path__
      regex: (.+)
    - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
      action: replace
      regex: ([^:]+)(?::\d+)?;(\d+)
      replacement: $1:$2
      target_label: __address__
    - action: labelmap
      regex: __meta_kubernetes_pod_label_(.+)
    - source_labels: [__meta_kubernetes_namespace]
      action: replace
      target_label: kubernetes_namespace
    - source_labels: [__meta_kubernetes_pod_name]
      action: replace
      target_label: kubernetes_pod_name
As explained in the comments above, pods that have the prometheus.io/scrape annotation set to true will be scraped by Prometheus and become targets. These pods then need a metrics endpoint that exposes metrics in the Prometheus format. You can use the prometheus.io/path and prometheus.io/port annotations to configure where Prometheus looks for the metrics on your pod.
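For illustration, a pod that opts in through these annotations might look like the sketch below. The pod name, image, and port 8080 are hypothetical placeholders, not values from the question; annotation values must be strings, hence the quotes.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app               # hypothetical name
  annotations:
    prometheus.io/scrape: "true"  # matched by the keep rule
    prometheus.io/path: "/metrics" # optional, /metrics is the default
    prometheus.io/port: "8080"    # port substituted into __address__
spec:
  containers:
    - name: example-app
      image: example/app:latest   # hypothetical image
      ports:
        - containerPort: 8080
```

In practice these annotations usually go into the pod template of a Deployment, so that every replica is picked up as its own target.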
I saw there is no sink configuration for Prometheus in this heapster document. Is there a simple way to combine the two for monitoring?
Prometheus uses a pull model to retrieve data, while Heapster is a tool that pushes its metrics to a certain endpoint (push model).
I assume you want to get Kubernetes metrics into Prometheus. You don't need Heapster for that, since cAdvisor has a Prometheus endpoint that can be scraped directly. The kubelet itself also provides some metrics.
The Prometheus config would look like this:
- job_name: 'kubernetes-nodes'
  kubernetes_sd_configs:
    - role: node
  relabel_configs:
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)
- job_name: 'kubernetes-cadvisor'
  kubernetes_sd_configs:
    - role: node
  relabel_configs:
    - source_labels: [__meta_kubernetes_node_address_InternalIP]
      target_label: __address__
      regex: (.*)
      replacement: $1:4194
This assumes you are using the default cAdvisor port 4194. Prometheus should also be able to detect the correct kubelet port.
Additional Note: The job for scraping cAdvisor is only required when using a Kubernetes version >= 1.7. Before that the cAdvisor metrics accidentally got exposed via the Kubelet.