Kubernetes Failed to create object: Namespace is required for v1.Endpoints - kubernetes

I'm currently trying to set up heketi on Kubernetes. I need to create an endpoint like so (I'm using Ansible):
- hosts: 'masters'
  remote_user: kube
  become: yes
  become_user: kube
  vars:
    ansible_python_interpreter: /usr/bin/python3
  tasks:
    - name: "Create gluster endpoints on kubernetes master"
      kubernetes.core.k8s:
        state: present
        definition:
          apiVersion: v1
          kind: Endpoints
          metadata:
            name: glusterfs-cluster
            labels:
              storage.k8s.io/name: glusterfs
              storage.k8s.io/part-of: mine
              storage.k8s.io/created-by: username
          subsets:
            - addresses:
                - ip: 10.0.0.4
                  hostname: gluster1
                - ip: 10.0.0.5
                  hostname: gluster2
                - ip: 10.0.0.6
                  hostname: gluster3
                - ip: 10.0.0.7
                  hostname: gluster4
              ports:
                - port: 1
When I run the Ansible playbook, I get this error:
Failed to create object: Namespace is required for v1.Endpoints
I can't find any information about what it's referring to. What is the namespace supposed to be?

An Endpoints resource (like a Pod, Service, Deployment, etc) is a namespaced resource: it cannot be created globally; it must be created inside a specific namespace.
We can't answer the question, "what is the namespace supposed to be?", because generally this will be something like "the same namespace as the resources that will rely on this Endpoints resource".
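As a sketch (assuming default as the namespace; substitute whichever namespace holds the Service that will consume these endpoints), adding a namespace field to the metadata resolves the error:

```yaml
apiVersion: v1
kind: Endpoints
metadata:
  name: glusterfs-cluster
  namespace: default   # assumption: use the namespace of the consuming Service
  labels:
    storage.k8s.io/name: glusterfs
subsets:
  - addresses:
      - ip: 10.0.0.4
        hostname: gluster1
    ports:
      - port: 1
```

Alternatively, the kubernetes.core.k8s module accepts a top-level namespace: parameter, which applies when the definition itself omits metadata.namespace.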

Related

How to expose a service to outside Kubernetes cluster via ingress?

I'm struggling to expose a service in an AWS cluster to the outside and access it via a browser. Since my previous question hasn't drawn any answers, I decided to simplify the issue in several respects.
First, I've created a deployment which should work without any configuration. Based on this article, I did
kubectl create namespace tests
created a file probe-service.yaml based on paulbouwer/hello-kubernetes:1.8 and deployed it with kubectl create -f probe-service.yaml -n tests:
apiVersion: v1
kind: Service
metadata:
  name: hello-kubernetes-first
spec:
  type: ClusterIP
  ports:
    - port: 80
      targetPort: 8080
  selector:
    app: hello-kubernetes-first
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-kubernetes-first
spec:
  replicas: 3
  selector:
    matchLabels:
      app: hello-kubernetes-first
  template:
    metadata:
      labels:
        app: hello-kubernetes-first
    spec:
      containers:
        - name: hello-kubernetes
          image: paulbouwer/hello-kubernetes:1.8
          ports:
            - containerPort: 8080
          env:
            - name: MESSAGE
              value: Hello from the first deployment!
created ingress.yaml and applied it (kubectl apply -f .\probes\ingress.yaml -n tests)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: hello-kubernetes-ingress
spec:
  rules:
    - host: test.projectname.org
      http:
        paths:
          - pathType: Prefix
            path: "/test"
            backend:
              service:
                name: hello-kubernetes-first
                port:
                  number: 80
    - host: test2.projectname.org
      http:
        paths:
          - pathType: Prefix
            path: "/test2"
            backend:
              service:
                name: hello-kubernetes-first
                port:
                  number: 80
  ingressClassName: nginx
Second, I can see that DNS actually points to the cluster and the ingress rules are applied:
if I open http://test.projectname.org/test or any irrelevant path (http://test.projectname.org/test3), I'm shown NET::ERR_CERT_AUTHORITY_INVALID, but
if I use "open anyway" in the browser, irrelevant paths give ERR_TOO_MANY_REDIRECTS while http://test.projectname.org/test gives Cannot GET /test
Now, TLS issues aside (those deserve a separate question), why do I get Cannot GET /test? It looks like the ingress controller (ingress-nginx) got the rules (otherwise it wouldn't discriminate between paths; that's why I don't show the DNS settings, although they are described in the previous question), but instead of showing the simple hello-kubernetes page at /test it returns this plain 404 message. Why is that? What could possibly be going wrong? How can I debug this?
Some debug info:
kubectl version --short tells Kubernetes Client Version is v1.21.5 and Server Version is v1.20.7-eks-d88609
kubectl get ingress -n tests shows that hello-kubernetes-ingress exists indeed, with nginx class, 2 expected hosts, address equal to that shown for load balancer in AWS console
kubectl get all -n tests shows
NAME READY STATUS RESTARTS AGE
pod/hello-kubernetes-first-6f77d8ff99-gjw5d 1/1 Running 0 5h4m
pod/hello-kubernetes-first-6f77d8ff99-ptwsn 1/1 Running 0 5h4m
pod/hello-kubernetes-first-6f77d8ff99-x8w87 1/1 Running 0 5h4m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/hello-kubernetes-first ClusterIP 10.100.18.189 <none> 80/TCP 5h4m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/hello-kubernetes-first 3/3 3 3 5h4m
NAME DESIRED CURRENT READY AGE
replicaset.apps/hello-kubernetes-first-6f77d8ff99 3 3 3 5h4m
ingress-nginx was installed before me via the following chart:
apiVersion: v2
name: nginx
description: A Helm chart for Kubernetes
type: application
version: 4.0.6
appVersion: "1.0.4"
dependencies:
  - name: ingress-nginx
    version: 4.0.6
    repository: https://kubernetes.github.io/ingress-nginx
and the values overrides applied with the chart differ from the original ones mostly (well, those got updated since the installation) in extraArgs: default-ssl-certificate: "nginx-ingress/dragon-family-com" is uncommented
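For context, that override would sit roughly here in the chart's values (a sketch based on the ingress-nginx chart's controller.extraArgs convention; the certificate reference is the one from the question):

```yaml
controller:
  extraArgs:
    # serves this certificate for hosts that have no matching TLS section
    default-ssl-certificate: "nginx-ingress/dragon-family-com"
```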
PS To answer Andrew, I indeed tried to setup HTTPS but it seemingly didn't help, so I haven't included what I tried into the initial question. Yet, here's what I did:
installed cert-manager, currently without a custom chart: kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.5.4/cert-manager.yaml
based on cert-manager's tutorial and SO question created a ClusterIssuer with the following config:
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-backoffice
spec:
  acme:
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    # use https://acme-v02.api.letsencrypt.org/directory after everything is fixed and works
    privateKeySecretRef: # this secret will be created in the namespace of cert-manager
      name: letsencrypt-backoffice-private-key
    # email: <will be used for urgent alerts about expiration etc>
    solvers:
      # TODO: add for each domain/second-level domain/*.projectname.org
      - selector:
          dnsZones:
            - test.projectname.org
            - test2.projectname.org
        # haven't made it to work yet, so switched to the simpler to configure http01 challenge
        # dns01:
        #   route53:
        #     region: ... # that of load balancer (but we also have ...)
        #     accessKeyID: <of IAM user with access to Route53>
        #     secretAccessKeySecretRef: # created that
        #       name: route53-credentials-secret
        #       key: secret-access-key
        #     role: arn:aws:iam::645730347045:role/cert-manager
        http01:
          ingress:
            class: nginx
and applied it via kubectl apply -f issuer.yaml
created 2 certificates in the same file and applied it again:
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: letsencrypt-certificate
spec:
  secretName: tls-secret
  issuerRef:
    kind: ClusterIssuer
    name: letsencrypt-backoffice
  commonName: test.projectname.org
  dnsNames:
    - test.projectname.org
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: letsencrypt-certificate-2
spec:
  secretName: tls-secret-2
  issuerRef:
    kind: ClusterIssuer
    name: letsencrypt-backoffice
  commonName: test2.projectname.org
  dnsNames:
    - test2.projectname.org
made sure that the certificates are issued correctly (skipping the pain part, the result is: kubectl get certificates shows that both certificates have READY = true and both tls secrets are created)
figured out that my ingress is in another namespace and TLS secrets referenced in an ingress spec must live in the same namespace (I haven't tried the wildcard certificate and --default-ssl-certificate option yet), so I copied each one to the tests namespace:
opened the existing secret, like kubectl edit secret tls-secret-2, and copied the data and annotations
created an empty (Opaque) secret in tests: kubectl create secret generic tls-secret-2-copy -n tests
opened it (kubectl edit secret tls-secret-2-copy -n tests) and inserted the data and annotations
in ingress spec, added the tls bit:
tls:
  - hosts:
      - test.projectname.org
    secretName: tls-secret-copy
  - hosts:
      - test2.projectname.org
    secretName: tls-secret-2-copy
I hoped this would help, but it actually made no difference (I get ERR_TOO_MANY_REDIRECTS for irrelevant paths, a redirect from http to https, NET::ERR_CERT_AUTHORITY_INVALID at https, and Cannot GET /test if I insist on getting to the page).
Since you've used your own answer to complement the question, I'll answer all the things you asked, while providing a divide-and-conquer strategy for troubleshooting Kubernetes networking.
At the end I'll give you some nginx and IP answers.
This is correct
- host: test3.projectname.org
  http:
    paths:
      - pathType: Prefix
        path: "/"
        backend:
          service:
            name: hello-kubernetes-first
            port:
              number: 80
Breaking down troubleshooting with Ingress
DNS
Ingress
Service
Pod
Certificate
1. DNS
You can use the command dig to query the DNS:
dig google.com
Ingress
The ingress controller doesn't look at the IP; it just looks at the headers.
You can force a host using any tool that lets you set headers, like curl:
curl --header 'Host: test3.projectname.org' http://123.123.123.123 (your public IP)
Service
You can be sure that your service is working by creating an ubuntu/centos pod, using kubectl exec -it podname -- bash, and trying to curl your service from within the cluster.
Pod
You're getting this
192.168.14.57 - - [14/Nov/2021:12:02:58 +0000] "GET /test2 HTTP/2.0" 404 144
"-" "<browser's user-agent header value>" 448 0.002
This part GET /test2 means that the request got the address from the DNS, went all the way from the internet, found your clusters, found your ingress controller, got through the service and reached your pod. Congratz! Your ingress is working!
But why is it returning 404?
The path that was passed to the service, and from the service to the pod, is /test2.
Do you have a file called test2 that nginx can serve? Do you have an upstream config in nginx with a test2 prefix?
That's why you're getting a 404 from nginx, not from the ingress controller.
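If the intent is for /test2 to be stripped before the request reaches the pod, a sketch using the ingress-nginx rewrite annotation may help (the host and service names are taken from the question; the regex/rewrite pattern follows ingress-nginx's documented capture-group style):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: hello-kubernetes-ingress
  annotations:
    # rewrite the matched path: /test2/foo is forwarded to the pod as /foo
    nginx.ingress.kubernetes.io/rewrite-target: /$2
spec:
  ingressClassName: nginx
  rules:
    - host: test2.projectname.org
      http:
        paths:
          - pathType: ImplementationSpecific
            path: /test2(/|$)(.*)
            backend:
              service:
                name: hello-kubernetes-first
                port:
                  number: 80
```

With this in place, the backend pod sees / instead of /test2, which is what a simple hello-world server typically expects.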
Those IPs are internal; remember, the internet traffic ended at the cluster border, and now you're in an internal network. Here's a rough sketch of what's happening.
Let's say that you're accessing it from your laptop. Your laptop has the IP 192.168.123.123, but your home has the address 7.8.9.1, so when your request hits the cluster, the cluster sees 7.8.9.1 requesting test3.projectname.org.
The cluster looks for the ingress controller, which finds a suitable configuration and passes the request down to the service, which passes the request down to the pod.
So,
your router can see your private IP (192.168.123.123)
Your cluster(ingress) can see your router's IP (7.8.9.1)
Your service can see the ingress's IP (192.168.?.?)
Your pod can see the service's IP (192.168.14.57)
It's a game of pass around.
If you want to see the public IP in your nginx logs, you need to customize it to get the X-Real-IP header, which is usually where load-balancers/ingresses/ambassador/proxies put the actual requester public IP.
Well, I haven't figured this out for ArgoCD yet (edit: figured it out, but the solution is ArgoCD-specific), but for this test service it seems that path resolution is the source of the issue. It may not be the only source (to be retested on the test2 subdomain), but when I created a new subdomain in the hosted zone (test3, not used anywhere before), pointed it via an A entry to the load balancer (as an "alias" in the AWS console), and then added to the ingress a new rule with the / path, like this:
- host: test3.projectname.org
  http:
    paths:
      - pathType: Prefix
        path: "/"
        backend:
          service:
            name: hello-kubernetes-first
            port:
              number: 80
I've finally got the hello kubernetes thing on http://test3.projectname.org. I have succeeded with TLS after a number of attempts/research and some help in a separate question.
But I haven't succeeded with actual debugging: looking at kubectl logs -n nginx <pod name, see kubectl get pod -n nginx> doesn't really help me understand what path was passed to the service, and the output is rather difficult to read (I can't even find where those IPs come from: they are not mine, the LB's, or the cluster IP of the service; nor do I understand what tests-hello-kubernetes-first-80 stands for – it's just a concatenation of namespace, service name, and port; no object has such a name, including the ingress):
192.168.14.57 - - [14/Nov/2021:12:02:58 +0000] "GET /test2 HTTP/2.0" 404 144
"-" "<browser's user-agent header value>" 448 0.002
[tests-hello-kubernetes-first-80] [] 192.168.49.95:8080 144 0.000 404 <some hash>
Any more pointers on debugging would be helpful; suggestions regarding correct path rewriting for nginx-ingress are also welcome.

Kubernetes/kustomize Service endpoints abnormal behavior

We're using kustomize with kubernetes on our project.
I'm trying to implement access to an external service using an IP, as mentioned in this link:
https://medium.com/@ManagedKube/kubernetes-access-external-services-e4fd643e5097
Here's my service
---
kind: Service
apiVersion: v1
metadata:
  name: pgsql
spec:
  ports:
    - protocol: TCP
      port: 5432
      targetPort: 5432
      name: "pg"
  selector: {}
---
apiVersion: v1
kind: Endpoints
metadata:
  name: pgsql
subsets:
  - addresses:
      - ip: 1.1.1.1
    ports:
      - port: 5432
        name: "pg"
When I apply this with the kubectl command (kubectl apply -k ...) I get a warning:
Warning: kubectl apply should be used on resource created by either
kubectl create --save-config or kubectl apply
However, this warning does not prevent the endpoints and service from being created.
kubectl get endpoints
NAME ENDPOINTS AGE
pgsql 172.12.xx.yy:5432 3m27s
Unfortunately, the IP address is different from the one I put in my yml (1.1.1.1).
If I apply a second time
kubectl apply -k ...
kubectl get endpoints
NAME ENDPOINTS AGE
pgsql 1.1.1.1:5432 10s
I no longer get the warning above.
The endpoint is the one expected.
I expect the endpoint address to be the exact one I specified (1.1.1.1:5432) from the first apply.
Any suggestions?
Thanks
It probably comes from the empty selector. Could you try removing it completely?
This is supposed to work only if your service doesn't have any selector.
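A sketch of the Service with the selector removed entirely (same names and ports as in the question):

```yaml
kind: Service
apiVersion: v1
metadata:
  name: pgsql
spec:
  ports:
    - protocol: TCP
      port: 5432
      targetPort: 5432
      name: "pg"
  # no selector at all: without one, Kubernetes does not manage the
  # Endpoints object, so the manually created 1.1.1.1 entry is kept
```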

How to Configure Kubernetes in Hairpin Mode

I'm trying to enable hairpin connections on my Kubernetes service, on GKE.
I've tried to follow the instructions here: https://kubernetes.io/docs/tasks/administer-cluster/reconfigure-kubelet/ to configure my kubelet config to enable hairpin mode, but it looks like my configs are never saved, even though the edit command returns without error.
Here is what I try to set when I edit node:
spec:
  podCIDR: 10.4.1.0/24
  providerID: gce://staging/us-east4-b/gke-cluster-staging-highmem-f36fb529-cfnv
  configSource:
    configMap:
      name: my-node-config-4kbd7d944d
      namespace: kube-system
      kubeletConfigKey: kubelet
Here is my node config when I describe it
Name: my-node-config-4kbd7d944d
Namespace: kube-system
Labels: <none>
Annotations: <none>
Data
====
kubelet_config:
----
{
  "kind": "KubeletConfiguration",
  "apiVersion": "kubelet.config.k8s.io/v1beta1",
  "hairpinMode": "hairpin-veth"
}
I've tried both using "edit node" and "patch". Same result in that nothing is saved. Patch returns "no changes made."
Here is the patch command from the tutorial:
kubectl patch node ${NODE_NAME} -p "{\"spec\":{\"configSource\":{\"configMap\":{\"name\":\"${CONFIG_MAP_NAME}\",\"namespace\":\"kube-system\",\"kubeletConfigKey\":\"kubelet\"}}}}"
I also can't find any resource on where the "hairpinMode" attribute is supposed to be set.
Any help is appreciated!
------------------- edit ----------------
Here is why I think hairpinning isn't working:
root@668cb9686f-dzcx8:/app# nslookup tasks-staging.[my-domain].com
Server: 10.0.32.10
Address: 10.0.32.10#53
Non-authoritative answer:
Name: tasks-staging.[my-domain].com
Address: 34.102.170.43
root@668cb9686f-dzcx8:/app# curl https://[my-domain].com/python/healthz
hello
root@668cb9686f-dzcx8:/app# nslookup my-service.default
Server: 10.0.32.10
Address: 10.0.32.10#53
Name: my-service.default.svc.cluster.local
Address: 10.0.38.76
root@668cb9686f-dzcx8:/app# curl https://my-service.default.svc.cluster.local/python/healthz
curl: (7) Failed to connect to my-service.default.svc.cluster.local port 443: Connection timed out
Also, if I issue a request to localhost from my service (not curl), it gets a "connection refused." Issuing requests to the external domain, which should be routed to the same pod, works fine though.
I only have one service, one node, one pod, and two listening ports at the moment.
--------------------- including deployment yaml -----------------
Deployment
spec:
  replicas: 1
  spec:
    containers:
      - name: my-app
        ports:
          - containerPort: 8080
          - containerPort: 50001
        readinessProbe:
          httpGet:
            path: /healthz
            port: 8080
            scheme: HTTPS
Ingress:
apiVersion: extensions/v1beta1
kind: Ingress
spec:
  backend:
    serviceName: my-service
    servicePort: 60000
  rules:
    - http:
        paths:
          - path: /*
            backend:
              serviceName: my-service
              servicePort: 60000
          - path: /python/*
            backend:
              serviceName: my-service
              servicePort: 60001
service
---
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  ports:
    - name: port
      port: 60000
      targetPort: 8080
    - name: python-port
      port: 60001
      targetPort: 50001
  type: NodePort
I'm trying to set up a multi-port application where the main program triggers a script by issuing a request to the local machine on a different port. (I need to run something in python but the main app is in golang.)
It's a simple script and I'd like to avoid exposing the python endpoints on the external domain, so I don't have to worry about authentication, etc.
-------------- requests sent from my-service in golang -------------
https://[my-domain]/health: success
https://[my-domain]/python/healthz: success
http://my-service.default:60000/healthz: dial tcp: lookup my-service.default on 169.254.169.254:53: no such host
http://my-service.default/python/healthz: dial tcp: lookup my-service.default on 169.254.169.254:53: no such host
http://my-service.default:60001/python/healthz: dial tcp: lookup my-service.default on 169.254.169.254:53: no such host
http://localhost:50001/healthz: dial tcp 127.0.0.1:50001: connect: connection refused
http://localhost:50001/python/healthz: dial tcp 127.0.0.1:50001: connect: connection refused
Kubelet reconfiguration in GKE
You should not reconfigure kubelet in cloud managed Kubernetes clusters like GKE. It's not supported and it can lead to errors and failures.
Hairpinning in GKE
Hairpinning is enabled by default in GKE-provided clusters. You can check whether it's enabled by invoking the command below on one of the GKE nodes:
ifconfig cbr0 | grep PROMISC
The output should look like this:
UP BROADCAST RUNNING PROMISC MULTICAST MTU:1460 Metric:1
where PROMISC indicates that hairpinning is enabled.
Please refer to official documentation about debugging services: Kubernetes.io: Debug service: a pod fails to reach itself via the service ip
Workload
Based only on the service definition you provided, you should have access to your python application on port 50001 of the pod hosting it via:
localhost:50001
ClusterIP:60001
my-service:60001
NodeIP:nodeport-port (check $ kubectl get svc my-service for this port)
I tried to run your Ingress resource and it failed to create. Please check what an Ingress definition should look like.
Please take a look at the official documentation, where the whole deployment process is explained with examples:
Kubernetes.io: Connect applications service
Cloud.google.com: Kubernetes engine: Ingress
Cloud.google.com: Kubernetes engine: Load balance ingress
Additionally please check other StackOverflow answers like:
Stackoverflow.com: Kubernetes how to access service if nodeport is random - it describes how you can access an application in your pod
Stackoverflow.com: What is the purpose of kubectl proxy - it describes what happens when you create your service object.
Please let me know if you have any questions to that.

Is there a way to do a load balancing between pod in multiple nodes?

I have a kubernetes cluster deployed with rke which is composed of 3 nodes on 3 different servers, and on each server there is 1 pod running yatsukino/healthereum, which is a personal modification of ethereum/client-go:stable.
The problem is that I don't understand how to add an external IP for sending requests to the pods.
My pods can be in 3 states:
they are syncing the ethereum blockchain
they restarted because of a sync problem
they are synced and everything is fine
I don't want my load balancer to forward requests to pods in the first 2 states; only the third state should be considered up to date.
I've been searching the kubernetes docs but (maybe because of a misunderstanding) I only find load balancing for pods inside a single node.
Here is my deployment file:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: goerli
  name: goerli-deploy
spec:
  replicas: 3
  selector:
    matchLabels:
      app: goerli
  template:
    metadata:
      labels:
        app: goerli
    spec:
      containers:
        - image: yatsukino/healthereum
          name: goerli-geth
          args: ["--goerli", "--datadir", "/app", "--ipcpath", "/root/.ethereum/geth.ipc"]
          env:
            - name: LASTBLOCK
              value: "0"
            - name: FAILCOUNTER
              value: "0"
          ports:
            - containerPort: 30303
              name: geth
            - containerPort: 8545
              name: console
          livenessProbe:
            exec:
              command:
                - /bin/sh
                - /app/health.sh
            initialDelaySeconds: 20
            periodSeconds: 60
          volumeMounts:
            - name: app
              mountPath: /app
      initContainers:
        - name: healthcheck
          image: ethereum/client-go:stable
          command: ["/bin/sh", "-c", "wget -O /app/health.sh http://my-bash-script && chmod 544 /app/health.sh"]
          volumeMounts:
            - name: app
              mountPath: "/app"
      restartPolicy: Always
      volumes:
        - name: app
          hostPath:
            path: /app/
The answers above explain the concepts, but regarding your questions about services and external IPs: you must declare the service, for example:
apiVersion: v1
kind: Service
metadata:
  name: goerli
spec:
  selector:
    app: goerli
  ports:
    - port: 8545
  type: LoadBalancer
The type: LoadBalancer will assign an external address in a public cloud, or if you use something like MetalLB. Check your address with kubectl get svc goerli. If the external address is "pending", you have a problem...
If this is your own setup, you can use externalIPs to assign your own external IP:
apiVersion: v1
kind: Service
metadata:
  name: goerli
spec:
  selector:
    app: goerli
  ports:
    - port: 8545
  externalIPs:
    - 222.0.0.30
The externalIPs can be used from outside the cluster, but you must route traffic to a node yourself, for example:
ip route add 222.0.0.30/32 \
nexthop via 192.168.0.1 \
nexthop via 192.168.0.2 \
nexthop via 192.168.0.3
Assuming your k8s nodes have IPs 192.168.0.x, this will set up ECMP routes to your nodes. When you make a request from outside the cluster to 222.0.0.30:8545, k8s will load-balance between your ready pods.
For load balancing and exposing your pods, you can use https://kubernetes.io/docs/concepts/services-networking/service/
and for checking when a pod is ready, you can tweak your liveness and readiness probes as explained in https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/
For probes you might want to consider exec actions, like executing a script that checks what is required and returns 0 or 1 depending on status.
When a container is started, Kubernetes can be configured to wait for a configurable
amount of time to pass before performing the first readiness check. After that, it
invokes the probe periodically and acts based on the result of the readiness probe. If a
pod reports that it’s not ready, it’s removed from the service. If the pod then becomes
ready again, it’s re-added.
Unlike liveness probes, if a container fails the readiness check, it won’t be killed or
restarted. This is an important distinction between liveness and readiness probes.
Liveness probes keep pods healthy by killing off unhealthy containers and replacing
them with new, healthy ones, whereas readiness probes make sure that only pods that
are ready to serve requests receive them. This is mostly necessary during container
start up, but it’s also useful after the container has been running for a while.
I think you can use probes for your goal.
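The exec-based readiness check described above can be sketched like this, assuming a hypothetical /app/ready.sh script (not in the question's deployment) that exits 0 only when the node is fully synced:

```yaml
# added to the goerli-geth container spec alongside the existing livenessProbe
readinessProbe:
  exec:
    command:
      - /bin/sh
      - /app/ready.sh   # hypothetical: exits 0 only when the chain is synced
  initialDelaySeconds: 30
  periodSeconds: 15
```

While the probe fails, the pod stays Running but is removed from the Service's endpoints, so the load balancer only sends traffic to synced pods.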

Unable to scrape metrics from pods

I am able to scrape Prometheus metrics from a Kubernetes service using this Prometheus job configuration:
- job_name: 'prometheus-potapi'
  static_configs:
    - targets: ['potapi-service.potapi:1234']
It uses Kubernetes DNS and it gives me the metrics from any of my three pods I use for my service.
I would like to see the result for each pod.
I am able to see the data I want using this configuration:
- job_name: 'prometheus-potapi-pod'
  static_configs:
    - targets: ['10.1.0.126:1234']
I have searched and experimented with the service discovery mechanism available in Prometheus. Unfortunately, I don't understand how it should be set up. The service discovery reference isn't really helpful if you don't know how it works.
I am looking for an example where the job using the IP number is replaced with some service discovery mechanism. Specifying the IP was enough for me to see that the data I'm looking for is exposed.
The pods I want to scrape metrics from all live in the same namespace, potapi.
The metrics are always exposed through the same port, 1234.
Finally, they are all named like this:
potapi-deployment-754d96f855-lkh4x
potapi-deployment-754d96f855-pslgg
potapi-deployment-754d96f855-z2zj2
When I do
kubectl describe pod potapi-deployment-754d96f855-pslgg -n potapi
I get this description:
Name: potapi-deployment-754d96f855-pslgg
Namespace: potapi
Node: docker-for-desktop/192.168.65.3
Start Time: Tue, 07 Aug 2018 14:18:55 +0200
Labels: app=potapi
pod-template-hash=3108529411
Annotations: <none>
Status: Running
IP: 10.1.0.127
Controlled By: ReplicaSet/potapi-deployment-754d96f855
Containers:
potapi:
Container ID: docker://72a0bafbda9b82ddfc580d79488a8e3c480d76a6d17c43d7f7d7ab18458c56ee
Image: potapi-service
Image ID: docker://sha256:d64e94c2dda43c40f641008c122e6664845d73cab109768efa0c3619cb0836bb
Ports: 4567/TCP, 4568/TCP, 1234/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP
State: Running
Started: Tue, 07 Aug 2018 14:18:57 +0200
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-4fttn (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
default-token-4fttn:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-4fttn
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events: <none>
How would you rewrite the job definition given these prerequisites?
Here they use example.io/scrape=true (and similar annotations for specifying the scrape port and the scrape path if it's not /metrics), which is how one achieves the "autodiscovery" part.
If you apply that annotation -- and the relevant config snippets in the Prom config -- to a Service, then Prom will scrape the port and path on the Service, meaning you will have stats for the Service itself, and not the individual Endpoints behind it. Similarly, if you label the Pods, you will gather metrics for the Pods but they would need to be rolled up to have a cross-Pod view of the state of affairs. There are multiple different resource types that can be autodiscovered, including node and ingress, also. They all behave similarly.
Unless you have grave CPU or storage concerns for your Prom instance, I absolutely wouldn't enumerate the scrape targets in the config like that: I would use the scrape annotations, meaning you can change who is scraped, what port, etc. without having to reconfigure Prom each time.
Be aware that if you want to use their example as-is, and you want to apply those annotations from within the kubernetes resource YAML, ensure that you quote the 'true' value; otherwise YAML will promote it to a boolean literal, and kubernetes annotations can only be string values.
Applying the annotations from the command line will work just fine:
kubectl annotate pod -l app=potapi example.io/scrape=true
(BTW, they use example.io/ in their example, but there is nothing special about that string except it namespaces the scrape part to keep it from colliding with something else named scrape. So feel free to use your organization's namespace if you wish to avoid having something weird named example.io/ in your cluster)
I ended up with this solution:
...
- job_name: 'kubernetes-pods'
  kubernetes_sd_configs:
    - role: pod
  relabel_configs:
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
      action: keep
      regex: true
    - source_labels: [__address__]
      action: replace
      regex: ([^:]+)(?::\d+)?
      replacement: $1:1234
      target_label: __address__
...
There are two parts.
Check for an annotation prometheus.io/scrape with the value 'true'. This is done in the first source_labels.
It may not be self-evident that prometheus_io_scrape translates to prometheus.io/scrape.
Get the address and add the desired port to it. This is done in the second source_labels. The __address__ source contains a host name or IP number. In this case, the IP number is extracted using the cryptic regex ([^:]+)(?::\d+)?. The port I want to use is 1234, so I hardcoded it in replacement:. The result is that __address__ will now contain the IP of the pod with port 1234 attached, in the format 10.1.0.172:1234, where 10.1.0.172 is the IP number found.
With this configuration in Prometheus I should be able to find pods with the proper annotation.
Where should the annotation be added then? I ended up adding it in my Kubernetes deployment template description.
The complete deployment description looks like this:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: potapi-deployment
  namespace: potapi
  labels:
    app: potapi
spec:
  replicas: 3
  selector:
    matchLabels:
      app: potapi
  template:
    metadata:
      annotations:
        prometheus.io/scrape: 'true'
      labels:
        app: potapi
    spec:
      containers:
        - name: potapi
          image: potapi-service
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 4567
              name: service
            - containerPort: 1234
              name: metrics
The interesting annotation is added in the template section.