I have a K8s cluster deployed with Kubespray on on-prem servers, with etcd colocated on the control-plane nodes. I don't see the etcd metrics getting scraped by the Prometheus operator, which was deployed using Helm v3.5.4.
K8s version 1.22, Helm chart prometheus-community/kube-prometheus-stack version 25.0.0, 3-node control plane on CentOS 7.
The Prometheus config shows a job for etcd: job_name: serviceMonitor/monitoring/kube-prometheus-kube-prome-kube-etcd/0.
But there is no etcd Service in Prometheus's list of Services, and no Endpoints are defined for etcd.
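For reference, a quick way to check whether the chart created the etcd Service, Endpoints, and ServiceMonitor that this scrape job refers to (a rough sketch; the exact object names depend on the Helm release name and namespace):

# Look for the kube-etcd Service/Endpoints the ServiceMonitor selects
kubectl -n kube-system get svc,endpoints | grep -i etcd
# And the ServiceMonitor itself
kubectl -n monitoring get servicemonitors | grep -i etcd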
values.yaml (updated with volumes) for the Helm deployment:
prometheus:
service:
type: NodePort
externalTrafficPolicy: Local
ingress:
enabled: true
annotations:
kubernetes.io/ingress.class: "custom"
hosts:
- prometheus.{{ cluster_domain }}.mydomain.com
paths:
- /
pathType: Prefix
tls:
- secretName:
prometheusSpec:
storageSpec:
volumeClaimTemplate:
spec:
accessModes: ["ReadWriteOnce"]
storageClassName: rook-ceph-block
resources:
requests:
storage: {{ monitoring.storage_size }}
volumeMounts:
- name: cert-vol
mountPath: "/etc/prometheus/secrets/etcd-certs"
readOnly: true
volumes:
- name: cert-vol
secret:
secretName: etcd-certs
kubeEtcd:
enabled: true
endpoints:
- 172.1.1.1
- 172.1.1.2
- 172.1.1.3
service:
port: 2379
targetPort: 2379
serviceMonitor:
scheme: https
insecureSkipVerify: true
caFile: /etc/prometheus/secrets/etcd-certs/ca.crt
certFile: /etc/prometheus/secrets/etcd-certs/client.crt
keyFile: /etc/prometheus/secrets/etcd-certs/client.key
I got it to work by adding the endpoints to the kubeEtcd section, as shown in the values.yaml above (IP addresses changed).
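For reference, a sketch of how the etcd-certs secret referenced in that values.yaml can be created, assuming Kubespray's default etcd certificate directory /etc/ssl/etcd/ssl on a control-plane node (the exact file names include the node hostname, so node1 below is a placeholder):

# The key names (ca.crt, client.crt, client.key) must match the serviceMonitor
# caFile/certFile/keyFile paths in the values above.
kubectl -n monitoring create secret generic etcd-certs \
  --from-file=ca.crt=/etc/ssl/etcd/ssl/ca.pem \
  --from-file=client.crt=/etc/ssl/etcd/ssl/admin-node1.pem \
  --from-file=client.key=/etc/ssl/etcd/ssl/admin-node1-key.pem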
Related
In the kube-prometheus-stack Helm chart's values.yaml, I made the following changes to expose the Alertmanager, Grafana, and Prometheus DNS records via ingress. Since we already use the HAProxy ingress controller for other services on our EKS cluster, I used HAProxy ingress here as well.
For Alertmanager:
alertmanager:
enabled: true
serviceAccount:
create: true
ingress:
enabled: true
annotations:
ingressClassName: haproxy
cert-manager.io/cluster-issuer: "letsencrypt"
haproxy.ingress.kubernetes.io/use-regex: "true"
haproxy.ingress.kubernetes.io/rewrite-target: /$1
labels:
app.kubernetes.io/instance: cert-manager
hosts:
- alertmanager.monitoring.new.io
tls:
- secretName: alertmanager-tls
hosts:
- alertmanager.monitoring.new.io
service:
port: 443
targetPort: 9093
For Grafana:
grafana:
enabled: true
service:
port: 443
targetPort: 3000
ingress:
enabled: true
annotations:
cert-manager.io/cluster-issuer: "letsencrypt"
kubernetes.io/ingress.class: haproxy
haproxy.ingress.kubernetes.io/use-regex: "true"
haproxy.ingress.kubernetes.io/rewrite-target: /$1
labels:
app.kubernetes.io/instance: cert-manager
hosts:
- grafana.monitoring.new.io
tls:
- secretName: test-grafana-tls
hosts:
- grafana.monitoring.new.io
For Prometheus:
prometheus:
enabled: true
serviceAccount:
create: true
ingress:
enabled: true
annotations:
ingressClassName: haproxy
cert-manager.io/cluster-issuer: "letsencrypt"
haproxy.ingress.kubernetes.io/use-regex: "true"
haproxy.ingress.kubernetes.io/rewrite-target: /$1
labels:
app.kubernetes.io/instance: cert-manager
hosts:
- prometheus.monitoring.new.io
tls:
- secretName: prometheus-tls
hosts:
- prometheus.monitoring.new.io
service:
port: 443
targetPort: 9090
I also defined the DNS records in AWS Route 53 and made the necessary settings, but I can only access the Grafana web interface from its URL. When I try to access the Alertmanager and Prometheus web interfaces, I get a "default backend - 404" error.
Although I did a lot of research, I could not reach a clear conclusion. Do I need to try a different method, or where am I going wrong?
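A quick first step when debugging this kind of 404 is to compare the Ingress objects the chart actually rendered for the three components; the working Grafana one and the failing Alertmanager/Prometheus ones should show the same ingress class, hosts, and backends (a sketch, assuming the chart is installed in the monitoring namespace):

# The CLASS column and the host/backend details should match across all three
kubectl -n monitoring get ingress
kubectl -n monitoring describe ingress | grep -iE 'class|host|backend'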
I have an app that already uses an internal ingress, but I want to expose a single endpoint to the internet. How can I do that?
My chart:
apis:
- name: api
image:
repositoryURI: URL
containerPort: 80
workload: general
service:
enabled: true
port: 80
ingress:
- enabled: true
type: internal-ms
hosts:
- hostname: example.qa
- hostname: qa.example.local
annotations:
nginx.ingress.kubernetes.io/proxy-connect-timeout: "300"
nginx.ingress.kubernetes.io/proxy-read-timeout: "300"
nginx.ingress.kubernetes.io/proxy-send-timeout: "300"
nginx.ingress.kubernetes.io/proxy-body-size: 50m
nginx.org/client-max-body-size: 50m
hpa:
...
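One common pattern for this is to leave the existing internal ingress untouched and add a second Ingress that uses the internet-facing ingress class and routes only the single path to be exposed. A minimal sketch applied as a standalone manifest rather than through the chart values above (whose schema I don't know); the class nginx-external, the host api.example.com, and the path /public-endpoint are all placeholders, and the backend is assumed to be the api Service from the chart:

kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-public
spec:
  ingressClassName: nginx-external    # internet-facing controller class (placeholder)
  rules:
    - host: api.example.com           # public hostname (placeholder)
      http:
        paths:
          - path: /public-endpoint    # the single endpoint to expose (placeholder)
            pathType: Prefix
            backend:
              service:
                name: api             # Service created by the chart above
                port:
                  number: 80
EOF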
I have an existing load balancer configured to listen on port 4443 in front of an mTLS, TCP-based backend. Now I am trying to convert this existing load balancer setup to work via the Kubernetes NGINX ingress controller. The backends are configured to listen on port 19000.
Here is an example of how my existing LB setup works with mTLS:
curl -v --cacert ./certs_test/ca.crt --key ./certs_test/server.key --cert ./certs_test/server.crt -k https://10.66.49.164:4443/api/testtools/private/v1/test/testGetTenants
With the NGINX ingress controller I created a private LB and overrode the default HTTPS port 443 to 4443 in the controller's args section, like below:
- /nginx-ingress-controller
- --publish-service=$(POD_NAMESPACE)/ingress-chart
- --election-id=ingress-controller-leader
- --ingress-class=nginx
- --configmap=$(POD_NAMESPACE)/ingress-chart
- --tcp-services-configmap=$(POD_NAMESPACE)/tcp-services
- --https-port=4443
- --default-ssl-certificate=$(POD_NAMESPACE)/tls-secret
- --enable-ssl-passthrough
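One thing worth confirming before debugging the Ingress itself is that the controller's Service/load balancer actually maps a port to the new 4443 container port; --https-port only changes where the controller listens inside the pod (a sketch; the Service name ingress-chart comes from the --publish-service flag above, and the namespace is a placeholder):

# A rough check: the https port entry should point at targetPort 4443 after the override
kubectl -n <controller-namespace> get svc ingress-chart -o yaml | grep -A4 'ports:'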
I want to terminate mTLS at the ingress level so that authentication works as expected. Below is the setup I currently have.
ClusterIP Service YAML:
apiVersion: v1
kind: Service
metadata:
name: testtools-service
spec:
type: ClusterIP
ports:
- port: 19000
protocol: TCP
targetPort: 19000
name: https-testtools
selector:
app: testtools
Deployment YAML spec:
apiVersion: apps/v1
kind: Deployment
metadata:
name: testtools-deployment
spec:
selector:
matchLabels:
app: testtools
replicas: 3
template:
metadata:
labels:
app: testtools
spec:
containers:
- name: testtools
image: tst-testtools-service-np:214
livenessProbe:
httpGet:
path: /api/testtools/private/health
port: 19001
scheme: HTTPS
initialDelaySeconds: 300
readinessProbe:
httpGet:
path: /api/testtools/private/health
port: 19001
scheme: HTTPS
initialDelaySeconds: 120
ports:
- containerPort: 19000
name: https-testtools
volumeMounts:
- mountPath: /test/logs
name: test-volume
- mountPath: /etc/pki
name: etc-service
- mountPath: /etc/environment
name: etc-env
- mountPath: /etc/availability-domain
name: etc-ad
- mountPath: /etc/identity-realm
name: etc-idr
- mountPath: /etc/region
name: etc-region
- mountPath: /etc/fault-domain
name: etc-fd
- mountPath: /etc/hostclass
name: etc-hc
- mountPath: /etc/physical-availability-domain
name: etc-pad
resources:
requests:
memory: "8000Mi"
volumes:
- name: test-volume
hostPath:
path: /test/logs
type: Directory
- name: etc-service
hostPath:
path: /etc/pki
type: Directory
- name: etc-env
hostPath:
path: /etc/environment
type: File
- name: etc-ad
hostPath:
path: /etc/availability-domain
type: File
- name: etc-idr
hostPath:
path: /etc/identity-realm
type: File
- name: etc-region
hostPath:
path: /etc/region
type: File
- name: etc-fd
hostPath:
path: /etc/fault-domain
type: File
- name: etc-hc
hostPath:
path: /etc/hostclass
type: File
- name: etc-pad
hostPath:
path: /etc/physical-availability-domain
type: File
dnsPolicy: Default
My ingress spec
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: testtools-ingress
annotations:
kubernetes.io/ingress.class: "nginx"
nginx.ingress.kubernetes.io/ssl-passthrough: "true"
#nginx.ingress.kubernetes.io/use-regex: "true"
#nginx.ingress.kubernetes.io/rewrite-target: /$1
nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
nginx.ingress.kubernetes.io/secure-backends: "true"
#nginx.ingress.kubernetes.io/auth-tls-verify-client: "on"
#nginx.ingress.kubernetes.io/auth-tls-secret: "default/tls-new-cert"
#nginx.ingress.kubernetes.io/auth-tls-verify-depth: "3"
#nginx.ingress.kubernetes.io/auth-tls-pass-certificate-to-upstream: "true"
nginx.ingress.kubernetes.io/auth-tls-secret: default/my-certs
nginx.ingress.kubernetes.io/auth-tls-verify-client: "on"
nginx.ingress.kubernetes.io/auth-tls-verify-depth: "3"
nginx.ingress.kubernetes.io/proxy-ssl-name: ingress-nginx-controller
nginx.ingress.kubernetes.io/proxy-ssl-secret: default/my-certs
nginx.ingress.kubernetes.io/proxy-ssl-verify: "on"
nginx.ingress.kubernetes.io/proxy-ssl-verify-depth: "5"
spec:
tls:
- secretName: my-certs
rules:
- http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: testtools-service
port:
number: 19000
What I am trying to achieve is passthrough at the load balancer level and termination at the ingress, so that mTLS works as expected and the API call succeeds. I want to know whether my ingress setup is correct and whether this is achievable with the Kubernetes ingress controller; with a LoadBalancer-type Service per backend this is straightforward, but to avoid creating too many load balancers I tried this setup instead. Termination at the ingress controller works fine, but further calls simply return no response. Your input on this would help.
$ curl -v --cacert ./certs_test/ca.crt --key ./certs_test/server.key --cert ./certs_test/server.crt -k https://10.66.48.120:4443/api/testtools/private/v1/test/testGetTenants
* About to connect() to 10.66.48.120 port 4443 (#0)
* Trying 10.66.48.120...
* Connected to 10.66.48.120 (10.66.48.120) port 4443 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
* CAfile: ./certs_test/ca.crt
CApath: none
* skipping SSL peer certificate verification
* SSL connection using TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
* Server certificate:
* subject: O=Acme Co,CN=Kubernetes Ingress Controller Fake Certificate
* start date: Feb 23 18:03:52 2022 GMT
* expire date: Feb 23 18:03:52 2023 GMT
* common name: Kubernetes Ingress Controller Fake Certificate
* issuer: O=Acme Co,CN=Kubernetes Ingress Controller Fake Certificate
> GET /api/prodtools/private/v1/common/getTenants HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 10.66.48.120:4443
> Accept: */*
>
* Connection #0 to host 10.66.48.120 left intact
P
$
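For comparison, below is a rough sketch of the same Ingress when mTLS is terminated at the controller instead of passed through. This is an illustration of my understanding, not a verified fix: with the ssl-passthrough annotation active, the connection is proxied at the TCP level and the auth-tls-* settings would not apply, so for termination the passthrough annotation is dropped, a host is set so the TLS secret can be matched, and the auth-tls secret is expected to contain a ca.crt key (the host name below is a placeholder):

kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: testtools-ingress
  annotations:
    kubernetes.io/ingress.class: "nginx"
    # verify client certificates (mTLS) at the ingress; the secret must contain ca.crt
    nginx.ingress.kubernetes.io/auth-tls-secret: default/my-certs
    nginx.ingress.kubernetes.io/auth-tls-verify-client: "on"
    nginx.ingress.kubernetes.io/auth-tls-verify-depth: "3"
    # re-encrypt traffic to the HTTPS backend on 19000
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
spec:
  tls:
    - hosts:
        - testtools.example.com       # placeholder; must match the server certificate and the request SNI
      secretName: my-certs
  rules:
    - host: testtools.example.com     # placeholder host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: testtools-service
                port:
                  number: 19000
EOF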
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
namespace: istio-system
name: istio
spec:
profile: default
values:
gateways:
istio-ingressgateway:
#sds:
enabled: true
components:
base:
enabled: true
ingressGateways:
- name: istio-ingressgateway
enabled: true
namespace: test-ingress
k8s:
overlays:
- apiVersion: v1
kind: Gateway
name: micro-ingress
patches:
- path: spec.servers
value:
- port:
number: 80
name: http
protocol: HTTP
hosts:
- "*" #host should be specify in VirtualService
tls:
httpsRedirect: true
- port:
number: 443
name: https
protocol: HTTPS
hosts:
- "*"
tls:
mode: SIMPLE
credentialName: secret
privateKey: sds
serverCertificate: sds
serviceAnnotations:
service.beta.kubernetes.io/azure-load-balancer-internal: "true"
service:
ports:
- name: status-port
port: 15020
targetPort: 15020
- name: http2
port: 80
targetPort: 8080
- name: https
port: 443
targetPort: 8443
loadBalancerIP: xx.xx.xx.xxx
resources:
limits:
cpu: 2000m
memory: 1024Mi
requests:
cpu: 100m
memory: 128Mi
istioctl manifest apply -f .\manifest.yml
This will install the Istio 1.9.0 default profile with ["Istio core" "Istiod" "Ingress gateways"] components into the cluster.
Proceed? (y/N) y
Error: failed to install manifests: errors occurred during operation: overlay for Gateway:micro-ingress does not match any object in output
manifest. Available objects are:
HorizontalPodAutoscaler:test-ingress:istio-ingressgateway
Deployment:test-ingress:istio-ingressgateway
PodDisruptionBudget:test-ingress:istio-ingressgateway
Role:test-ingress:istio-ingressgateway-sds
RoleBinding:test-ingress:istio-ingressgateway-sds
Service:test-ingress:istio-ingressgateway
ServiceAccount:test-ingress:istio-ingressgateway-service-account
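For context, the IstioOperator overlay mechanism can only patch objects that the operator itself renders, and the "Available objects" list in the error contains no Gateway named micro-ingress; the networking Gateway is normally created as a separate resource once the ingress gateway deployment is installed. A rough sketch under that assumption, reusing the ports and placeholder names from the overlay above:

kubectl apply -f - <<'EOF'
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: micro-ingress
  namespace: test-ingress
spec:
  selector:
    istio: ingressgateway            # default label on the istio-ingressgateway pods
  servers:
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - "*"                        # host should be specified in the VirtualService
      tls:
        httpsRedirect: true
    - port:
        number: 443
        name: https
        protocol: HTTPS
      hosts:
        - "*"
      tls:
        mode: SIMPLE
        credentialName: secret       # TLS secret name carried over from the overlay above
EOF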
I built a Kubernetes cluster with two Azure Ubuntu VMs and am trying to monitor the cluster. For that, I have deployed a node-exporter DaemonSet, Heapster, Prometheus, and Grafana. I configured node-exporter as a target in the Prometheus config file, but I am getting a Get http://master-ip:30002/metrics: context deadline exceeded error. I have also increased the scrape_interval and scrape_timeout values in the Prometheus config file.
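A "context deadline exceeded" error generally means the target could not be reached within the scrape timeout, so a plain connectivity check from inside the Prometheus pod helps separate network problems from configuration problems (a sketch; the pod name is a placeholder):

# If this also times out, the problem is reachability of <master-IP>:30002,
# not the Prometheus scrape configuration. If wget is missing from the image,
# run the same check from a temporary busybox pod instead.
kubectl exec -it <prometheus-pod> -- wget -qO- -T 10 http://<master-IP>:30002/metrics | head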
The following are the manifests for the node-exporter DaemonSet and Service and the Prometheus ConfigMap.
apiVersion: apps/v1
kind: DaemonSet
metadata:
labels:
app: node-exporter
name: node-exporter
namespace: kube-system
spec:
selector:
matchLabels:
app: node-exporter
template:
metadata:
labels:
app: node-exporter
spec:
containers:
- args:
- --web.listen-address=<master-IP>:30002
- --path.procfs=/host/proc
- --path.sysfs=/host/sys
- --path.rootfs=/host/root
- --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+)($|/)
- --collector.filesystem.ignored-fs-types=^(autofs|binfmt_misc|cgroup|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|mqueue|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|sysfs|tracefs)$
image: quay.io/prometheus/node-exporter:v0.18.1
name: node-exporter
resources:
limits:
cpu: 250m
memory: 180Mi
requests:
cpu: 102m
memory: 180Mi
volumeMounts:
- mountPath: /host/proc
name: proc
readOnly: false
- mountPath: /host/sys
name: sys
readOnly: false
- mountPath: /host/root
mountPropagation: HostToContainer
name: root
readOnly: true
- args:
- --logtostderr
- --secure-listen-address=[$(IP)]:9100
- --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
- --upstream=http://<master-IP>:30002/
env:
- name: IP
valueFrom:
fieldRef:
fieldPath: status.podIP
image: quay.io/coreos/kube-rbac-proxy:v0.4.1
name: kube-rbac-proxy
ports:
- containerPort: 9100
hostPort: 9100
name: https
resources:
limits:
cpu: 20m
memory: 40Mi
requests:
cpu: 10m
memory: 20Mi
hostNetwork: true
hostPID: true
nodeSelector:
kubernetes.io/os: linux
securityContext:
runAsNonRoot: true
runAsUser: 65534
serviceAccountName: node-exporter
tolerations:
- operator: Exists
volumes:
- hostPath:
path: /proc
name: proc
- hostPath:
path: /sys
name: sys
- hostPath:
path: /
name: root
---
apiVersion: v1
kind: Service
metadata:
labels:
k8s-app: node-exporter
name: node-exporter
namespace: kube-system
spec:
type: NodePort
ports:
- name: https
port: 9100
targetPort: https
nodePort: 30002
selector:
app: node-exporter
---prometheus-config-map.yaml-----
apiVersion: v1
kind: ConfigMap
metadata:
name: prometheus-server-conf
labels:
name: prometheus-server-conf
namespace: default
data:
prometheus.yml: |-
global:
scrape_interval: 5m
evaluation_interval: 3m
scrape_configs:
- job_name: 'node'
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
static_configs:
- targets: ['<master-IP>:30002']
- job_name: 'kubernetes-apiservers'
kubernetes_sd_configs:
- role: endpoints
scheme: https
tls_config:
ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
relabel_configs:
- source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
action: keep
regex: default;kubernetes;https
Can we use a NodePort Service for the node-exporter DaemonSet? If the answer is no, how should it be configured as a target in the Prometheus config file? Could anyone help me understand this scenario? Any suggested links are also welcome.
As @gayathri confirmed in the comments:
"it worked for me." – gayathri
If you have the same issue as mentioned in the topic, check out this GitHub issue, specifically this answer added by @simonpasquier:
We have debugged it offline and the problem was the network.
Running the Prometheus container with "--network=host" solved the issue.
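For completeness, a minimal sketch of what that looks like when Prometheus runs as a plain Docker container (the config path is a placeholder):

# Host networking lets the container reach targets bound to the node/master IP
docker run -d --name prometheus \
  --network=host \
  -v /path/to/prometheus.yml:/etc/prometheus/prometheus.yml \
  prom/prometheus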