certificate is valid for ingress.local, not gitlab.mydomain - kubernetes

I'm new to Kubernetes.
I have installed a fresh new Kubernetes cluster using RKE (Rancher's tool for creating k8s clusters).
I added the GitLab chart (https://charts.gitlab.io/) and launched it.
I ran into several issues with PersistentStorage, etc., which I managed to resolve.
But I'm now stuck on one last issue: the pod for gitlab-runner is failing with the following logs:
ERROR: Registering runner... failed runner=Mk5hMxa5 status=couldn't execute POST against https://gitlab.mydomain.com/api/v4/runners: Post https://gitlab.mydomain.com/api/v4/runners: x509: certificate is valid for ingress.local, not gitlab.mydomain.com
PANIC: Failed to register this runner. Perhaps you are having network problems
Description of the certificate using kubectl describe certificate gitlab-gitlab-tls -n gitlab:
Name: gitlab-gitlab-tls
Namespace: gitlab
Labels: app=unicorn
chart=unicorn-2.4.6
heritage=Tiller
io.cattle.field/appId=gitlab
release=gitlab
Annotations: <none>
API Version: certmanager.k8s.io/v1alpha1
Kind: Certificate
Metadata:
Creation Timestamp: 2019-11-13T13:49:10Z
Generation: 3
Owner References:
API Version: extensions/v1beta1
Block Owner Deletion: true
Controller: true
Kind: Ingress
Name: gitlab-unicorn
UID: 5640645f-550b-4073-bdf0-df8b089b0c94
Resource Version: 6824
Self Link: /apis/certmanager.k8s.io/v1alpha1/namespaces/gitlab/certificates/gitlab-gitlab-tls
UID: 30ac32bd-c7f3-4f9b-9e3b-966b6090e1a9
Spec:
Acme:
Config:
Domains:
gitlab.mydomain.com
http01:
Ingress Class: gitlab-nginx
Dns Names:
gitlab.mydomain.com
Issuer Ref:
Kind: Issuer
Name: gitlab-issuer
Secret Name: gitlab-gitlab-tls
Status:
Conditions:
Last Transition Time: 2019-11-13T13:49:10Z
Message: Certificate issuance in progress. Temporary certificate issued.
Reason: TemporaryCertificate
Status: False
Type: Ready
Events: <none>
Description of the issuer using kubectl describe issuer gitlab-issuer -n gitlab:
Name: gitlab-issuer
Namespace: gitlab
Labels: app=certmanager-issuer
chart=certmanager-issuer-0.1.0
heritage=Tiller
release=gitlab
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"certmanager.k8s.io/v1alpha1","kind":"Issuer","metadata":{"annotations":{},"creationTimestamp":"2019-11-13T13:49:10Z","gener...
API Version: certmanager.k8s.io/v1alpha1
Kind: Issuer
Metadata:
Creation Timestamp: 2019-11-13T13:49:10Z
Generation: 4
Resource Version: 24537
Self Link: /apis/certmanager.k8s.io/v1alpha1/namespaces/gitlab/issuers/gitlab-issuer
UID: b9971d7a-5220-47ca-a7f9-607aa3f9be4f
Spec:
Acme:
Email: mh@mydomain.com
http01:
Private Key Secret Ref:
Name: gitlab-acme-key
Server: https://acme-v02.api.letsencrypt.org/directory
Status:
Acme:
Last Registered Email: mh@mydomain.com
Uri: https://acme-v02.api.letsencrypt.org/acme/acct/71695690
Conditions:
Last Transition Time: 2019-11-13T13:49:12Z
Message: The ACME account was registered with the ACME server
Reason: ACMEAccountRegistered
Status: True
Type: Ready
Events: <none>
Description of the challenge using kubectl describe challenges.certmanager.k8s.io -n gitlab gitlab-gitlab-tls-3386074437-0:
Name: gitlab-gitlab-tls-3386074437-0
Namespace: gitlab
Labels: acme.cert-manager.io/order-name=gitlab-gitlab-tls-3386074437
Annotations: <none>
API Version: certmanager.k8s.io/v1alpha1
Kind: Challenge
Metadata:
Creation Timestamp: 2019-11-13T13:49:15Z
Finalizers:
finalizer.acme.cert-manager.io
Generation: 4
Owner References:
API Version: certmanager.k8s.io/v1alpha1
Block Owner Deletion: true
Controller: true
Kind: Order
Name: gitlab-gitlab-tls-3386074437
UID: 1f01771e-2e38-491f-9b2d-ab5f4fda60e2
Resource Version: 6915
Self Link: /apis/certmanager.k8s.io/v1alpha1/namespaces/gitlab/challenges/gitlab-gitlab-tls-3386074437-0
UID: 4c115a6f-a76f-4859-a5db-6acd9c039d71
Spec:
Authz URL: https://acme-v02.api.letsencrypt.org/acme/authz-v3/1220588820
Config:
http01:
Ingress Class: gitlab-nginx
Dns Name: gitlab.mydomain.com
Issuer Ref:
Kind: Issuer
Name: gitlab-issuer
Key: lSJdy9Os7BmI56EQCkcEl8t36pcR1hWNjri2Vvq0iv8.lPWns02SmS3zXwFzHdma_RyhwwlzWLRDkdlugFXDlZY
Token: lSJdy9Os7BmI56EQCkcEl8t36pcR1hWNjri2Vvq0iv8
Type: http-01
URL: https://acme-v02.api.letsencrypt.org/acme/chall-v3/1220588820/AwsnPw
Wildcard: false
Status:
Presented: true
Processing: true
Reason: Waiting for http-01 challenge propagation: wrong status code '404', expected '200'
State: pending
Events: <none>
Logs found in cert-manager pod:
I1113 14:20:21.857235 1 pod.go:58] cert-manager/controller/challenges/http01/selfCheck/http01/ensurePod "level"=0 "msg"="found one existing HTTP01 solver pod" "dnsName"="gitlab.mydomain.com" "related_resource_kind"="Pod" "related_resource_name"="cm-acme-http-solver-ttkmj" "related_resource_namespace"="gitlab" "resource_kind"="Challenge" "resource_name"="gitlab-gitlab-tls-3386074437-0" "resource_namespace"="gitlab" "type"="http-01"
I1113 14:20:21.857458 1 service.go:43] cert-manager/controller/challenges/http01/selfCheck/http01/ensureService "level"=0 "msg"="found one existing HTTP01 solver Service for challenge resource" "dnsName"="gitlab.mydomain.com" "related_resource_kind"="Service" "related_resource_name"="cm-acme-http-solver-sdlw7" "related_resource_namespace"="gitlab" "resource_kind"="Challenge" "resource_name"="gitlab-gitlab-tls-3386074437-0" "resource_namespace"="gitlab" "type"="http-01"
I1113 14:20:21.857592 1 ingress.go:91] cert-manager/controller/challenges/http01/selfCheck/http01/ensureIngress "level"=0 "msg"="found one existing HTTP01 solver ingress" "dnsName"="gitlab.mydomain.com" "related_resource_kind"="Ingress" "related_resource_name"="cm-acme-http-solver-7jzwk" "related_resource_namespace"="gitlab" "resource_kind"="Challenge" "resource_name"="gitlab-gitlab-tls-3386074437-0" "resource_namespace"="gitlab" "type"="http-01"
E1113 14:20:21.864785 1 sync.go:183] cert-manager/controller/challenges "msg"="propagation check failed" "error"="wrong status code '404', expected '200'" "dnsName"="gitlab.mydomain.com" "resource_kind"="Challenge" "resource_name"="gitlab-gitlab-tls-3386074437-0" "resource_namespace"="gitlab" "type"="http-01"
The DNS gitlab.mydomain.com is set to point to the IP of my LoadBalancer where NGINX is running.
If I go to https://gitlab.mydomain.com in the browser:
The browser says the connection is not secure
The result is "default backend - 404".
Edits
Description of the ingress-controller by using kubectl describe svc gitlab-nginx-ingress-controller -n gitlab:
Name: gitlab-nginx-ingress-controller
Namespace: gitlab
Labels: app=nginx-ingress
chart=nginx-ingress-0.30.0-1
component=controller
heritage=Tiller
io.cattle.field/appId=gitlab
release=gitlab
Annotations: field.cattle.io/ipAddresses: null
field.cattle.io/targetDnsRecordIds: null
field.cattle.io/targetWorkloadIds: null
Selector: <none>
Type: ExternalName
IP:
External Name: gitlab.mydomain.com
Port: http 80/TCP
TargetPort: http/TCP
NodePort: http 31487/TCP
Endpoints: 10.42.0.7:80,10.42.1.9:80,10.42.2.12:80
Port: https 443/TCP
TargetPort: https/TCP
NodePort: https 31560/TCP
Endpoints: 10.42.0.7:443,10.42.1.9:443,10.42.2.12:443
Port: gitlab-shell 22/TCP
TargetPort: gitlab-shell/TCP
NodePort: gitlab-shell 30539/TCP
Endpoints: 10.42.0.7:22,10.42.1.9:22,10.42.2.12:22
Session Affinity: None
Events: <none>
Running kubectl get ingress -n gitlab gives me a bunch of ingresses:
NAME HOSTS ADDRESS PORTS AGE
cm-acme-http-solver-5rjg4 minio.mydomain.com gitlab.mydomain.com 80 4d23h
cm-acme-http-solver-7jzwk gitlab.mydomain.com gitlab.mydomain.com 80 4d23h
cm-acme-http-solver-tzs25 registry.mydomain.com gitlab.mydomain.com 80 4d23h
gitlab-minio minio.mydomain.com gitlab.mydomain.com 80, 443 4d23h
gitlab-registry registry.mydomain.com gitlab.mydomain.com 80, 443 4d23h
gitlab-unicorn gitlab.mydomain.com gitlab.mydomain.com 80, 443 4d23h
Description of the gitlab-unicorn ingress by using kubectl describe ingress gitlab-unicorn -n gitlab:
Name: gitlab-unicorn
Namespace: gitlab
Address: gitlab.mydomain.com
Default backend: default-http-backend:80 (<none>)
TLS:
gitlab-gitlab-tls terminates gitlab.mydomain.com
Rules:
Host Path Backends
---- ---- --------
gitlab.mydomain.com
/ gitlab-unicorn:8181 (10.42.0.9:8181,10.42.1.8:8181)
/admin/sidekiq gitlab-unicorn:8080 (10.42.0.9:8080,10.42.1.8:8080)
Annotations:
certmanager.k8s.io/issuer: gitlab-issuer
field.cattle.io/publicEndpoints: [{"addresses":[""],"port":443,"protocol":"HTTPS","serviceName":"gitlab:gitlab-unicorn","ingressName":"gitlab:gitlab-unicorn","hostname":"gitlab.mydomain.com","path":"/","allNodes":false},{"addresses":[""],"port":443,"protocol":"HTTPS","serviceName":"gitlab:gitlab-unicorn","ingressName":"gitlab:gitlab-unicorn","hostname":"gitlab.mydomain.com","path":"/admin/sidekiq","allNodes":false}]
kubernetes.io/ingress.class: gitlab-nginx
kubernetes.io/ingress.provider: nginx
nginx.ingress.kubernetes.io/proxy-body-size: 512m
nginx.ingress.kubernetes.io/proxy-connect-timeout: 15
nginx.ingress.kubernetes.io/proxy-read-timeout: 600
Events: <none>
Description of cm-acme-http-solver-7jzwk by using kubectl describe ingress cm-acme-http-solver-7jzwk -n gitlab:
Name: cm-acme-http-solver-7jzwk
Namespace: gitlab
Address: gitlab.mydomain.com
Default backend: default-http-backend:80 (<none>)
Rules:
Host Path Backends
---- ---- --------
gitlab.mydomain.com
/.well-known/acme-challenge/lSJdy9Os7BmI56EQCkcEl8t36pcR1hWNjri2Vvq0iv8 cm-acme-http-solver-sdlw7:8089 (10.42.2.19:8089)
Annotations:
field.cattle.io/publicEndpoints: [{"addresses":[""],"port":80,"protocol":"HTTP","serviceName":"gitlab:cm-acme-http-solver-sdlw7","ingressName":"gitlab:cm-acme-http-solver-7jzwk","hostname":"gitlab.mydomain.com","path":"/.well-known/acme-challenge/lSJdy9Os7BmI56EQCkcEl8t36pcR1hWNjri2Vvq0iv8","allNodes":false}]
kubernetes.io/ingress.class: gitlab-nginx
nginx.ingress.kubernetes.io/whitelist-source-range: 0.0.0.0/0,::/0
Events: <none>
Ports open on my LoadBalancer and on every node of my cluster (I know I should close some of them, but first I want to get my GitLab setup working):
80/tcp ALLOW Anywhere
443/tcp ALLOW Anywhere
22/tcp ALLOW Anywhere
2376/tcp ALLOW Anywhere
2379/tcp ALLOW Anywhere
2380/tcp ALLOW Anywhere
6443/tcp ALLOW Anywhere
6783/tcp ALLOW Anywhere
6783:6784/udp ALLOW Anywhere
8472/udp ALLOW Anywhere
4789/udp ALLOW Anywhere
9099/tcp ALLOW Anywhere
10250/tcp ALLOW Anywhere
10254/tcp ALLOW Anywhere
30000:32767/tcp ALLOW Anywhere
30000:32767/udp ALLOW Anywhere
80/tcp (v6) ALLOW Anywhere (v6)
443/tcp (v6) ALLOW Anywhere (v6)
22/tcp (v6) ALLOW Anywhere (v6)
2376/tcp (v6) ALLOW Anywhere (v6)
2379/tcp (v6) ALLOW Anywhere (v6)
2380/tcp (v6) ALLOW Anywhere (v6)
6443/tcp (v6) ALLOW Anywhere (v6)
6783/tcp (v6) ALLOW Anywhere (v6)
6783:6784/udp (v6) ALLOW Anywhere (v6)
8472/udp (v6) ALLOW Anywhere (v6)
4789/udp (v6) ALLOW Anywhere (v6)
9099/tcp (v6) ALLOW Anywhere (v6)
10250/tcp (v6) ALLOW Anywhere (v6)
10254/tcp (v6) ALLOW Anywhere (v6)
30000:32767/tcp (v6) ALLOW Anywhere (v6)
30000:32767/udp (v6) ALLOW Anywhere (v6)
kubectl get pods -n gitlab
cm-acme-http-solver-4d8s5 1/1 Running 0 5d
cm-acme-http-solver-ttkmj 1/1 Running 0 5d
cm-acme-http-solver-ws7kv 1/1 Running 0 5d
gitlab-certmanager-57bc6fb4fd-6rfds 1/1 Running 0 5d
gitlab-gitaly-0 1/1 Running 0 5d
gitlab-gitlab-exporter-57b99467d4-knbgk 1/1 Running 0 5d
gitlab-gitlab-runner-64b74bcd59-mxwvm 0/1 CrashLoopBackOff 10 55m
gitlab-gitlab-shell-cff8b68f7-zng2c 1/1 Running 0 5d
gitlab-gitlab-shell-cff8b68f7-zqvfr 1/1 Running 0 5d
gitlab-issuer.1-lqs7c 0/1 Completed 0 5d
gitlab-migrations.1-c4njn 0/1 Completed 0 5d
gitlab-minio-75567fcbb6-jjxhw 1/1 Running 6 5d
gitlab-minio-create-buckets.1-6zljh 0/1 Completed 0 5d
gitlab-nginx-ingress-controller-698fbc4c64-4wt97 1/1 Running 0 5d
gitlab-nginx-ingress-controller-698fbc4c64-5kv2h 1/1 Running 0 5d
gitlab-nginx-ingress-controller-698fbc4c64-jxljq 1/1 Running 0 5d
gitlab-nginx-ingress-default-backend-6cd54c5f86-2jrkd 1/1 Running 0 5d
gitlab-nginx-ingress-default-backend-6cd54c5f86-cxlmx 1/1 Running 0 5d
gitlab-postgresql-66d8d9574b-hbx78 2/2 Running 0 5d
gitlab-prometheus-server-6fb685b9c7-c8bqj 2/2 Running 0 5d
gitlab-redis-7668c4d476-tcln5 2/2 Running 0 5d
gitlab-registry-7bb984c765-7ww6j 1/1 Running 0 5d
gitlab-registry-7bb984c765-t5jjq 1/1 Running 0 5d
gitlab-sidekiq-all-in-1-8fd95bf7b-hfnjz 1/1 Running 0 5d
gitlab-task-runner-5cd7bf5bb9-gnv8p 1/1 Running 0 5d
gitlab-unicorn-864bd864f5-47zxg 2/2 Running 0 5d
gitlab-unicorn-864bd864f5-gjms2 2/2 Running 0 5d
There are 3 acme-http-solvers:
One for registry.mydomain.com
One for minio.mydomain.com
One for gitlab.mydomain.com
The logs for the one pointing to gitlab.mydomain.com:
I1113 13:49:21.207782 1 solver.go:39] cert-manager/acmesolver "level"=0 "msg"="starting listener" "expected_domain"="gitlab.mydomain.com" "expected_key"="lSJdy9Os7BmI56EQCkcEl8t36pcR1hWNjri2Vvq0iv8.lPWns02SmS3zXwFzHdma_RyhwwlzWLRDkdlugFXDlZY" "expected_token"="lSJdy9Os7BmI56EQCkcEl8t36pcR1hWNjri2Vvq0iv8" "listen_port"=8089
Results of kubectl get svc -n gitlab:
cm-acme-http-solver-48b2j NodePort 10.43.58.52 <none> 8089:30090/TCP 5d23h
cm-acme-http-solver-h42mk NodePort 10.43.23.141 <none> 8089:30415/TCP 5d23h
cm-acme-http-solver-sdlw7 NodePort 10.43.86.27 <none> 8089:32309/TCP 5d23h
gitlab-gitaly ClusterIP None <none> 8075/TCP,9236/TCP 5d23h
gitlab-gitlab-exporter ClusterIP 10.43.187.247 <none> 9168/TCP 5d23h
gitlab-gitlab-shell ClusterIP 10.43.246.124 <none> 22/TCP 5d23h
gitlab-minio-svc ClusterIP 10.43.117.249 <none> 9000/TCP 5d23h
gitlab-nginx-ingress-controller ExternalName <none> gitlab.mydomain.com 80:31487/TCP,443:31560/TCP,22:30539/TCP 5d23h
gitlab-nginx-ingress-controller-metrics ClusterIP 10.43.152.252 <none> 9913/TCP 5d23h
gitlab-nginx-ingress-controller-stats ClusterIP 10.43.173.191 <none> 18080/TCP 5d23h
gitlab-nginx-ingress-default-backend ClusterIP 10.43.116.121 <none> 80/TCP 5d23h
gitlab-postgresql ClusterIP 10.43.97.139 <none> 5432/TCP 5d23h
gitlab-prometheus-server ClusterIP 10.43.67.220 <none> 80/TCP 5d23h
gitlab-redis ClusterIP 10.43.36.138 <none> 6379/TCP,9121/TCP 5d23h
gitlab-registry ClusterIP 10.43.54.244 <none> 5000/TCP 5d23h
gitlab-unicorn ClusterIP 10.43.76.61 <none> 8080/TCP,8181/TCP 5d23h
Logs of the pod gitlab-nginx-ingress-controller-698fbc4c64-jxljq (the other nginx-ingress-controller pods give the same logs): https://textuploader.com/1o9we
Any hint on what could be wrong in my configuration?
Feel free to ask for more information on my setup.
Many thanks.

Well, the issue is that GitLab requires a valid SSL certificate for the domain in question, which you do not seem to have according to this output:
E1113 14:20:21.864785 1 sync.go:183] cert-manager/controller/challenges "msg"="propagation check failed" "error"="wrong status code '404', expected '200'" "dnsName"="gitlab.mydomain.com" "resource_kind"="Challenge" "resource_name"="gitlab-gitlab-tls-3386074437-0" "resource_namespace"="gitlab" "type"="http-01"
Status:
Presented: true
Processing: true
Reason: Waiting for http-01 challenge propagation: wrong status code '404', expected '200'
State: pending
The http-01 challenge makes a web request to your domain, which must return a 200 HTTP response. You said yourself that https://gitlab.mydomain.com gives you a 404 response, hence it fails to issue a valid certificate. To further diagnose this, check the ingress responsible for the domain and follow it down the "chain" until you identify what is responding with the 404.
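One way to follow that chain is to request the ACME challenge path directly and see which component answers. A minimal sketch, using the token from the cm-acme-http-solver-7jzwk ingress described above and the HTTP NodePort 31487 of your ingress controller service (<node-ip> is a placeholder for any of your node IPs):
# what Let's Encrypt sees, via DNS and the load balancer
$ curl -v http://gitlab.mydomain.com/.well-known/acme-challenge/lSJdy9Os7BmI56EQCkcEl8t36pcR1hWNjri2Vvq0iv8
# the same request sent straight to a node's HTTP NodePort, bypassing the load balancer
$ curl -v -H 'Host: gitlab.mydomain.com' http://<node-ip>:31487/.well-known/acme-challenge/lSJdy9Os7BmI56EQCkcEl8t36pcR1hWNjri2Vvq0iv8
If the second request returns 200 but the first returns 404, the problem sits in front of the ingress controller (load balancer or DNS); if both return 404, the solver ingress is not being served by the controller at all.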

The http01 challenge relies on port 80 (http) being exposed to be able to answer the challenge. The option controller.service.enableHttp configures http, and is enabled by default (see here). But even if you've not touched this config, there might be an upstream component (e.g. a firewall) that blocks traffic on port 80.
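For reference, this is roughly where that option lives when the controller is deployed through the GitLab chart's bundled nginx-ingress sub-chart (a sketch only; the exact key path can differ between chart versions):
# values.yaml: keep plain HTTP enabled so the http01 solver can be reached
nginx-ingress:
  controller:
    service:
      enableHttp: true
      enableHttps: true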
Could you check if your ingress Service is listening on port 80, and reachable from the internet? You can try to go to your public IP on port 80 via a browser to check if you get a response from the ingress controller (or a backend).
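A quick way to check that from a machine outside the cluster (a minimal sketch; <load-balancer-ip> is a placeholder for your public IP):
# any HTTP answer, even "default backend - 404", means port 80 reaches the ingress controller
$ curl -v http://<load-balancer-ip>/
# same check, but with the Host header and path the solver ingress expects
$ curl -v -H 'Host: gitlab.mydomain.com' http://<load-balancer-ip>/.well-known/acme-challenge/test
A timeout or "connection refused" instead points at the load balancer or a firewall in front of the cluster.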

Related

Nginx Ingress not working on k3s running on Raspberry Pi

I have k3s installed on 4 Raspberry Pis with traefik disabled.
I'm trying to run Home Assistant on it using the Nginx Ingress controller, installed with kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.3.0/deploy/static/provider/baremetal/deploy.yaml.
But for some reason, I just cannot expose the service. The ingress was assigned 192.168.0.57, which is one of the nodes' IPs. Am I missing something?
root@rpi1:~# kubectl get ingress -n home-assistant home-assistant-ingress
NAME CLASS HOSTS ADDRESS PORTS AGE
home-assistant-ingress nginx smart.home 192.168.0.57 80 20h
root@rpi1:~# curl http://192.168.0.57/
curl: (7) Failed to connect to 192.168.0.57 port 80: Connection refused
root@rpi1:~# curl http://smart.home/
curl: (7) Failed to connect to smart.home port 80: Connection refused
Please see the following.
Pod:
root@rpi1:~# kubectl describe pod -n home-assistant home-assistant-deploy-7c4674b679-zbwn7
Name: home-assistant-deploy-7c4674b679-zbwn7
Namespace: home-assistant
Priority: 0
Node: rpi4/192.168.0.58
Start Time: Tue, 16 Aug 2022 20:31:28 +0100
Labels: app=home-assistant
pod-template-hash=7c4674b679
Annotations: <none>
Status: Running
IP: 10.42.3.7
IPs:
IP: 10.42.3.7
Controlled By: ReplicaSet/home-assistant-deploy-7c4674b679
Containers:
home-assistant:
Container ID: containerd://c7ec189112e9f2d085bd7f9cc7c8086d09b312e30771d7d1fef424685fcfbd07
Image: ghcr.io/home-assistant/home-assistant:stable
Image ID: ghcr.io/home-assistant/home-assistant@sha256:0555dc6a69293a1a700420224ce8d03048afd845465f836ef6ad60f5763b44f2
Port: <none>
Host Port: <none>
State: Running
Started: Wed, 17 Aug 2022 18:06:16 +0100
Last State: Terminated
Reason: Unknown
Exit Code: 255
Started: Tue, 16 Aug 2022 20:33:33 +0100
Finished: Wed, 17 Aug 2022 18:06:12 +0100
Ready: True
Restart Count: 1
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-n5tb7 (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
kube-api-access-n5tb7:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SandboxChanged 43m kubelet Pod sandbox changed, it will be killed and re-created.
Normal Pulled 43m kubelet Container image "ghcr.io/home-assistant/home-assistant:stable" already present on machine
Normal Created 43m kubelet Created container home-assistant
Normal Started 43m kubelet Started container home-assistant
The pod is listening on port 8123:
root@rpi1:~# kubectl exec -it -n home-assistant home-assistant-deploy-7c4674b679-zbwn7 -- netstat -plnt
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:8123 0.0.0.0:* LISTEN 60/python3
tcp6 0 0 :::8123 :::* LISTEN 60/python3
Deployment:
root@rpi1:~# kubectl describe deployments.apps -n home-assistant
Name: home-assistant-deploy
Namespace: home-assistant
CreationTimestamp: Tue, 16 Aug 2022 20:31:28 +0100
Labels: app=home-assistant
Annotations: deployment.kubernetes.io/revision: 1
Selector: app=home-assistant
Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Pod Template:
Labels: app=home-assistant
Containers:
home-assistant:
Image: ghcr.io/home-assistant/home-assistant:stable
Port: <none>
Host Port: <none>
Environment: <none>
Mounts: <none>
Volumes: <none>
Conditions:
Type Status Reason
---- ------ ------
Available True MinimumReplicasAvailable
Progressing True NewReplicaSetAvailable
OldReplicaSets: <none>
NewReplicaSet: home-assistant-deploy-7c4674b679 (1/1 replicas created)
Events: <none>
Service with port set to 8080 and target port to 8123:
root@rpi1:~# kubectl describe svc -n home-assistant home-assistant-service
Name: home-assistant-service
Namespace: home-assistant
Labels: app=home-assistant
Annotations: <none>
Selector: app=home-assistant
Type: LoadBalancer
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.43.248.75
IPs: 10.43.248.75
LoadBalancer Ingress: 192.168.0.53, 192.168.0.56, 192.168.0.57, 192.168.0.58
Port: <unset> 8080/TCP
TargetPort: 8123/TCP
NodePort: <unset> 31678/TCP
Endpoints: 10.42.3.7:8123
Session Affinity: None
External Traffic Policy: Cluster
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal UpdatedIngressIP 20h svccontroller LoadBalancer Ingress IP addresses updated: 192.168.0.53, 192.168.0.56, 192.168.0.58
Normal UpdatedIngressIP 20h (x2 over 22h) svccontroller LoadBalancer Ingress IP addresses updated: 192.168.0.53, 192.168.0.56, 192.168.0.57, 192.168.0.58
Normal AppliedDaemonSet 20h (x19 over 22h) svccontroller Applied LoadBalancer DaemonSet kube-system/svclb-home-assistant-service-f2675711
Normal UpdatedIngressIP 47m svccontroller LoadBalancer Ingress IP addresses updated: 192.168.0.53, 192.168.0.56
Normal UpdatedIngressIP 47m svccontroller LoadBalancer Ingress IP addresses updated: 192.168.0.53, 192.168.0.56, 192.168.0.57
Normal UpdatedIngressIP 47m svccontroller LoadBalancer Ingress IP addresses updated: 192.168.0.53, 192.168.0.56, 192.168.0.57, 192.168.0.58
Normal AppliedDaemonSet 47m (x8 over 47m) svccontroller Applied LoadBalancer DaemonSet kube-system/svclb-home-assistant-service-f2675711
My Ingress:
root@rpi1:~# kubectl describe ingress -n home-assistant home-assistant-ingress
Name: home-assistant-ingress
Labels: <none>
Namespace: home-assistant
Address: 192.168.0.57
Ingress Class: nginx
Default backend: <default>
Rules:
Host Path Backends
---- ---- --------
smart.home
/ home-assistant-service:8080 (10.42.3.7:8123)
Annotations: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Sync 19h (x2 over 19h) nginx-ingress-controller Scheduled for sync
Normal Sync 49m (x3 over 50m) nginx-ingress-controller Scheduled for sync
root@rpi1:~# kubectl get ingress -n home-assistant home-assistant-ingress
NAME CLASS HOSTS ADDRESS PORTS AGE
home-assistant-ingress nginx smart.home 192.168.0.57 80 19h
Can confirm I have Nginx ingress controller running:
root@rpi1:~# kubectl get pod -n ingress-nginx
NAME READY STATUS RESTARTS AGE
ingress-nginx-admission-create-2thj7 0/1 Completed 0 22h
ingress-nginx-admission-patch-kwm4m 0/1 Completed 1 22h
ingress-nginx-controller-6dc865cd86-9h8wt 1/1 Running 2 (52m ago) 22h
Ingress Nginx Controller log
root@rpi1:~# kubectl logs -n ingress-nginx ingress-nginx-controller-6dc865cd86-9h8wt
-------------------------------------------------------------------------------
NGINX Ingress controller
Release: v1.3.0
Build: 2b7b74854d90ad9b4b96a5011b9e8b67d20bfb8f
Repository: https://github.com/kubernetes/ingress-nginx
nginx version: nginx/1.19.10
-------------------------------------------------------------------------------
W0818 06:51:52.008386 7 client_config.go:617] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I0818 06:51:52.009962 7 main.go:230] "Creating API client" host="https://10.43.0.1:443"
I0818 06:51:52.123762 7 main.go:274] "Running in Kubernetes cluster" major="1" minor="24" git="v1.24.3+k3s1" state="clean" commit="990ba0e88c90f8ed8b50e0ccd375937b841b176e" platform="linux/arm64"
I0818 06:51:52.594773 7 main.go:104] "SSL fake certificate created" file="/etc/ingress-controller/ssl/default-fake-certificate.pem"
I0818 06:51:52.691571 7 ssl.go:531] "loading tls certificate" path="/usr/local/certificates/cert" key="/usr/local/certificates/key"
I0818 06:51:52.773089 7 nginx.go:258] "Starting NGINX Ingress controller"
I0818 06:51:52.807863 7 event.go:285] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"ingress-nginx", Name:"ingress-nginx-controller", UID:"21ae6485-bb0e-447e-b098-c510e43b171e", APIVersion:"v1", ResourceVersion:"934", FieldPath:""}): type: 'Normal' reason: 'CREATE' ConfigMap ingress-nginx/ingress-nginx-controller
I0818 06:51:53.912887 7 store.go:429] "Found valid IngressClass" ingress="home-assistant/home-assistant-ingress" ingressclass="nginx"
I0818 06:51:53.913414 7 event.go:285] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"home-assistant", Name:"home-assistant-ingress", UID:"eeb12441-9cd4-4571-b0da-5b2978ff3267", APIVersion:"networking.k8s.io/v1", ResourceVersion:"8719", FieldPath:""}): type: 'Normal' reason: 'Sync' Scheduled for sync
I0818 06:51:53.975141 7 nginx.go:301] "Starting NGINX process"
I0818 06:51:53.975663 7 leaderelection.go:248] attempting to acquire leader lease ingress-nginx/ingress-controller-leader...
I0818 06:51:53.976173 7 nginx.go:321] "Starting validation webhook" address=":8443" certPath="/usr/local/certificates/cert" keyPath="/usr/local/certificates/key"
I0818 06:51:53.980492 7 controller.go:167] "Configuration changes detected, backend reload required"
I0818 06:51:54.025524 7 leaderelection.go:258] successfully acquired lease ingress-nginx/ingress-controller-leader
I0818 06:51:54.025924 7 status.go:84] "New leader elected" identity="ingress-nginx-controller-6dc865cd86-9h8wt"
I0818 06:51:54.039912 7 status.go:214] "POD is not ready" pod="ingress-nginx/ingress-nginx-controller-6dc865cd86-9h8wt" node="rpi3"
I0818 06:51:54.051540 7 status.go:299] "updating Ingress status" namespace="home-assistant" ingress="home-assistant-ingress" currentValue=[{IP:192.168.0.57 Hostname: Ports:[]}] newValue=[]
I0818 06:51:54.071502 7 event.go:285] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"home-assistant", Name:"home-assistant-ingress", UID:"eeb12441-9cd4-4571-b0da-5b2978ff3267", APIVersion:"networking.k8s.io/v1", ResourceVersion:"14445", FieldPath:""}): type: 'Normal' reason: 'Sync' Scheduled for sync
I0818 06:51:54.823911 7 controller.go:184] "Backend successfully reloaded"
I0818 06:51:54.824200 7 controller.go:195] "Initial sync, sleeping for 1 second"
I0818 06:51:54.824334 7 event.go:285] Event(v1.ObjectReference{Kind:"Pod", Namespace:"ingress-nginx", Name:"ingress-nginx-controller-6dc865cd86-9h8wt", UID:"def1db3a-4766-4751-b611-ae3461911bc6", APIVersion:"v1", ResourceVersion:"14423", FieldPath:""}): type: 'Normal' reason: 'RELOAD' NGINX reload triggered due to a change in configuration
W0818 06:51:57.788759 7 controller.go:1111] Service "home-assistant/home-assistant-service" does not have any active Endpoint.
I0818 06:52:54.165805 7 status.go:299] "updating Ingress status" namespace="home-assistant" ingress="home-assistant-ingress" currentValue=[] newValue=[{IP:192.168.0.57 Hostname: Ports:[]}]
I0818 06:52:54.190556 7 event.go:285] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"home-assistant", Name:"home-assistant-ingress", UID:"eeb12441-9cd4-4571-b0da-5b2978ff3267", APIVersion:"networking.k8s.io/v1", ResourceVersion:"14590", FieldPath:""}): type: 'Normal' reason: 'Sync' Scheduled for sync
Endpoints
root@rpi1:~# kubectl get endpoints -A
NAMESPACE NAME ENDPOINTS AGE
default kubernetes 192.168.0.53:6443 35h
kube-system kube-dns 10.42.0.12:53,10.42.0.12:53,10.42.0.12:9153 35h
home-assistant home-assistant-service 10.42.3.9:8123 35h
kube-system metrics-server 10.42.0.14:4443 35h
ingress-nginx ingress-nginx-controller-admission 10.42.2.13:8443 35h
ingress-nginx ingress-nginx-controller 10.42.2.13:443,10.42.2.13:80 35h
Can also confirm the Traefik Ingress controller is disabled
root@rpi1:~# kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
ingress-nginx ingress-nginx-admission-create-2thj7 0/1 Completed 0 22h
ingress-nginx ingress-nginx-admission-patch-kwm4m 0/1 Completed 1 22h
kube-system local-path-provisioner-7b7dc8d6f5-jcm4p 1/1 Running 1 (59m ago) 22h
kube-system svclb-home-assistant-service-f2675711-w88fv 1/1 Running 1 (59m ago) 22h
kube-system coredns-b96499967-rml6k 1/1 Running 1 (59m ago) 22h
kube-system svclb-home-assistant-service-f2675711-rv8rf 1/1 Running 1 (59m ago) 22h
kube-system svclb-home-assistant-service-f2675711-9qk8m 1/1 Running 2 (59m ago) 22h
kube-system svclb-home-assistant-service-f2675711-m62sl 1/1 Running 1 (59m ago) 22h
home-assistant home-assistant-deploy-7c4674b679-zbwn7 1/1 Running 1 (59m ago) 22h
kube-system metrics-server-668d979685-rp2wm 1/1 Running 1 (59m ago) 22h
ingress-nginx ingress-nginx-controller-6dc865cd86-9h8wt 1/1 Running 2 (59m ago) 22h
Ingress Nginx Controller Service:
root@rpi1:~# kubectl get svc -n ingress-nginx
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
ingress-nginx-controller NodePort 10.43.254.114 <none> 80:32313/TCP,443:31543/TCP 23h
ingress-nginx-controller-admission ClusterIP 10.43.135.213 <none> 443/TCP 23h
root@rpi1:~# kubectl describe svc -n ingress-nginx ingress-nginx-controller
Name: ingress-nginx-controller
Namespace: ingress-nginx
Labels: app.kubernetes.io/component=controller
app.kubernetes.io/instance=ingress-nginx
app.kubernetes.io/name=ingress-nginx
app.kubernetes.io/part-of=ingress-nginx
app.kubernetes.io/version=1.3.0
Annotations: <none>
Selector: app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
Type: NodePort
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.43.254.114
IPs: 10.43.254.114
Port: http 80/TCP
TargetPort: http/TCP
NodePort: http 32313/TCP
Endpoints: 10.42.2.10:80
Port: https 443/TCP
TargetPort: https/TCP
NodePort: https 31543/TCP
Endpoints: 10.42.2.10:443
Session Affinity: None
External Traffic Policy: Cluster
Events: <none>
Updated: added Ingress Nginx Controller service
Updated2: added Ingress Nginx Controller log and endpoints

Getting prometheus/grafana and k3s to work together

To learn Kubernetes I've built myself a bare-metal cluster using 4 Raspberry Pis and set it up using k3s:
# curl -sfL https://get.k3s.io | sh -
I added nodes etc., everything comes up, I can see all the nodes, and almost everything is working as expected.
I wanted to monitor the Pis, so I installed the kube-prometheus-stack with helm:
$ kubectl create namespace monitoring
$ helm install prometheus --namespace monitoring prometheus-community/kube-prometheus-stack
And now everything looks fantastic:
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system helm-install-traefik-crd-s8zw5 0/1 Completed 0 5d21h
kube-system helm-install-traefik-rc9f2 0/1 Completed 1 5d21h
monitoring prometheus-prometheus-node-exporter-j85rw 1/1 Running 10 28h
kube-system metrics-server-86cbb8457f-mvbkl 1/1 Running 12 5d21h
kube-system coredns-7448499f4d-t7sp8 1/1 Running 13 5d21h
monitoring prometheus-prometheus-node-exporter-mmh2q 1/1 Running 9 28h
monitoring prometheus-prometheus-node-exporter-j4k4c 1/1 Running 10 28h
monitoring alertmanager-prometheus-kube-prometheus-alertmanager-0 2/2 Running 10 28h
kube-system svclb-traefik-zkqd6 2/2 Running 6 19h
monitoring prometheus-prometheus-node-exporter-bft5t 1/1 Running 10 28h
kube-system local-path-provisioner-5ff76fc89d-g8tm6 1/1 Running 12 5d21h
kube-system svclb-traefik-jcxd2 2/2 Running 28 5d21h
kube-system svclb-traefik-mpbjm 2/2 Running 22 5d21h
kube-system svclb-traefik-7kxtw 2/2 Running 20 5d21h
monitoring prometheus-grafana-864598fd54-9548l 2/2 Running 10 28h
kube-system traefik-65969d48c7-9lh9m 1/1 Running 3 19h
monitoring prometheus-prometheus-kube-prometheus-prometheus-0 2/2 Running 10 28h
monitoring prometheus-kube-state-metrics-76f66976cb-m8k2h 1/1 Running 6 28h
monitoring prometheus-kube-prometheus-operator-5c758db547-zsv4s 1/1 Running 6 28h
The services are all there:
$ kubectl get services --all-namespaces
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes ClusterIP 10.43.0.1 <none> 443/TCP 5d21h
kube-system kube-dns ClusterIP 10.43.0.10 <none> 53/UDP,53/TCP,9153/TCP 5d21h
kube-system metrics-server ClusterIP 10.43.80.65 <none> 443/TCP 5d21h
kube-system prometheus-kube-prometheus-kube-proxy ClusterIP None <none> 10249/TCP 28h
kube-system prometheus-kube-prometheus-kube-scheduler ClusterIP None <none> 10251/TCP 28h
monitoring prometheus-kube-prometheus-operator ClusterIP 10.43.180.73 <none> 443/TCP 28h
kube-system prometheus-kube-prometheus-coredns ClusterIP None <none> 9153/TCP 28h
kube-system prometheus-kube-prometheus-kube-etcd ClusterIP None <none> 2379/TCP 28h
kube-system prometheus-kube-prometheus-kube-controller-manager ClusterIP None <none> 10252/TCP 28h
monitoring prometheus-kube-prometheus-alertmanager ClusterIP 10.43.195.99 <none> 9093/TCP 28h
monitoring prometheus-prometheus-node-exporter ClusterIP 10.43.171.218 <none> 9100/TCP 28h
monitoring prometheus-grafana ClusterIP 10.43.20.165 <none> 80/TCP 28h
monitoring prometheus-kube-prometheus-prometheus ClusterIP 10.43.207.29 <none> 9090/TCP 28h
monitoring prometheus-kube-state-metrics ClusterIP 10.43.229.14 <none> 8080/TCP 28h
kube-system prometheus-kube-prometheus-kubelet ClusterIP None <none> 10250/TCP,10255/TCP,4194/TCP 28h
monitoring alertmanager-operated ClusterIP None <none> 9093/TCP,9094/TCP,9094/UDP 28h
monitoring prometheus-operated ClusterIP None <none> 9090/TCP 28h
kube-system traefik LoadBalancer 10.43.20.17 192.168.76.200,192.168.76.201,192.168.76.202,192.168.76.203 80:31131/TCP,443:31562/TCP 5d21h
Namespaces:
$ kubectl get namespaces
NAME STATUS AGE
kube-system Active 5d21h
default Active 5d21h
kube-public Active 5d21h
kube-node-lease Active 5d21h
monitoring Active 28h
But I couldn't reach the Grafana service.
Fair enough, I thought, let's define an Ingress. But it didn't work:
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: grafana-ingress
namespace: monitoring
annotations:
kubernetes.io/ingress.class: traefik
spec:
rules:
- http:
paths:
- pathType: Prefix
path: /
backend:
service:
name: prometheus-grafana
port:
number: 80
I have no idea why it isn't getting to the service, and I can't really see where the problem is. Although I understand containers etc. (I first had everything running on Docker Swarm), I don't really know where, if anywhere, this would show up in the logs.
I've spent the past couple of days trying all sorts of things, and I finally found a hint about namespaces and problems calling services, and something called "type: ExternalName".
I checked with curl from a pod inside the cluster, and the service is delivering the data inside the "monitoring" namespace, but traefik can't get there, or maybe can't even see it?
Having looked at the Traefik documentation I found this regarding namespaces, but I have no idea where I would even start to apply it:
providers:
kubernetesCRD:
namespaces:
I'm assuming that k3s has set this up correctly as an empty array because I can't find anything on their site that tells me what to do with their combination of "klipper-lb" and "traefik".
I finally tried to define another service with an external name:
---
apiVersion: v1
kind: Service
metadata:
name: grafana-named
namespace: kube-system
spec:
type: ExternalName
externalName: prometheus-grafana.monitoring.svc.cluster.local
ports:
- name: service
protocol: TCP
port: 80
targetPort: 80
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: grafana-ingress
namespace: kube-system
annotations:
kubernetes.io/ingress.class: traefik
spec:
rules:
- http:
paths:
- pathType: Prefix
path: /
backend:
service:
name: grafana-named
port:
number: 80
After 2-3 days I've tried everything I can think of, googled everything under the sun, and I still can't get to Grafana from outside of the internal cluster nodes.
I am at a loss as to how I can make anything work with k3s. I installed Lens on my main PC and can see almost everything there, but I think that the missing metrics information requires an Ingress or something like that too.
What do I have to do to get traefik to do what I think is basically its job: route incoming requests to the backend services?
I filed a bug report on GitHub and one of the people there (thanks again brandond) pointed me in the right direction.
The network layer uses flannel to handle the "in cluster" networking. The default backend for that is something called "vxlan", which is seemingly more complex and uses virtual ethernet adapters.
For my requirements (read: getting the cluster to even work), the solution was to change the backend to "host-gw".
This is done by adding "--flannel-backend=host-gw" to the k3s.service options on the controller.
$ sudo systemctl edit k3s.service
### Editing /etc/systemd/system/k3s.service.d/override.conf
### Anything between here and the comment below will become the new contents of the file
[Service]
ExecStart=
ExecStart=/usr/local/bin/k3s \
server \
'--flannel-backend=host-gw'
### Lines below this comment will be discarded
The first "ExecStart=" clears the existing default start command so that it can be replaced by the second one.
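After saving the override, the service has to be restarted for the new flag to take effect (a minimal sketch, assuming a standard systemd setup on the controller node):
# reload systemd's view of the unit and restart k3s with the new flannel backend
$ sudo systemctl daemon-reload
$ sudo systemctl restart k3s.service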
Now everything is working as I expected, and I can finally move forward with learning K8s.
I'll probably reactivate "vxlan" at some point and figure that out too.

Cannot access external IP of Load Balancer in a Kubernetes cluster

I created a load balancer service and the describe command returns the following:
Name: minio-service
Namespace: minio
Labels: app=minio
Annotations: <none>
Selector: app=minio
Type: LoadBalancer
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.43.150.49
IPs: 10.43.151.50
LoadBalancer Ingress: 192.168.31.12, 192.168.32.13, 192.168.33.14
Port: <unset> 9012/TCP
TargetPort: 9011/TCP
NodePort: <unset> 30777/TCP
Endpoints: 10.42.10.110:9011,10.42.10.111:9011
Session Affinity: None
External Traffic Policy: Cluster
Events: <none>
If I try curl http://192.168.31.12:9012 it returns:
curl: (7) Failed to connect to 192.168.31.12 port 9012: Connection timed out
Furthermore, I observed something strange.
kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
antonis-dell Ready control-plane,master 4h42m v1.21.2+k3s1 192.168.31.12 <none> Ubuntu 18.04.1 LTS 4.15.0-147-generic containerd://1.4.4-k3s2
knodea Ready <none> 4h9m v1.21.2+k3s1 192.168.32.13 <none> Raspbian GNU/Linux 10 (buster) 5.10.17-v7l+ containerd://1.4.4-k3s2
knodeb Ready <none> 4h2m v1.21.2+k3s1 192.168.33.14 <none> Raspbian GNU/Linux 10 (buster) 5.4.51-v7l+ containerd://1.4.4-k3s2
which means that the LoadBalancer Ingress IPs are the same as the internal IPs of the nodes in the cluster.
Does anyone know why I have three LoadBalancer Ingress IPs which are the same as the internal node IPs, and how to fix this?

Not able to connect to kafka brokers

I've deployed https://github.com/confluentinc/cp-helm-charts/tree/master/charts/cp-kafka on my on-prem k8s cluster.
I'm trying to expose it by using a TCP controller with nginx.
My TCP nginx configmap looks like
data:
"<zookeper-tcp-port>": <namespace>/cp-zookeeper:2181
"<kafka-tcp-port>": <namespace>/cp-kafka:9092
And I've made the corresponding entry in my nginx ingress controller:
- name: <zookeper-tcp-port>-tcp
port: <zookeper-tcp-port>
protocol: TCP
targetPort: <zookeper-tcp-port>-tcp
- name: <kafka-tcp-port>-tcp
port: <kafka-tcp-port>
protocol: TCP
targetPort: <kafka-tcp-port>-tcp
Now I'm trying to connect to my Kafka instance.
When I just try to connect to the IP and port using Kafka tools, I get the error message:
Unable to determine broker endpoints from Zookeeper.
One or more brokers have multiple endpoints for protocol PLAIN...
Please proved bootstrap.servers value in advanced settings
[<cp-broker-address-0>.cp-kafka-headless.<namespace>:<port>][<ip>]
When I enter what I assume are the correct broker addresses (I've tried them all...), I get a timeout. There are no logs coming from the nginx controller except:
[08/Apr/2020:15:51:12 +0000]TCP200000.000
[08/Apr/2020:15:51:12 +0000]TCP200000.000
[08/Apr/2020:15:51:14 +0000]TCP200000.001
From the pod kafka-zookeeper-0 I'm getting loads of:
[2020-04-08 15:52:02,415] INFO Accepted socket connection from /<ip:port> (org.apache.zookeeper.server.NIOServerCnxnFactory)
[2020-04-08 15:52:02,415] WARN Unable to read additional data from client sessionid 0x0, likely client has closed socket (org.apache.zookeeper.server.NIOServerCnxn)
[2020-04-08 15:52:02,415] INFO Closed socket connection for client /<ip:port> (no session established for client) (org.apache.zookeeper.server.NIOServerCnxn)
Though I'm not sure these have anything to do with it?
Any ideas on what I'm doing wrong?
Thanks in advance.
TL;DR:
Change the value nodeport.enabled to true inside cp-kafka/values.yaml before deploying.
Change the service name and ports in your TCP NGINX ConfigMap and Ingress object.
Set bootstrap-server on your kafka tools to <Cluster_External_IP>:31090
Explanation:
The Headless Service was created alongside the StatefulSet. The created service will not be given a clusterIP, but will instead simply include a list of Endpoints.
These Endpoints are then used to generate instance-specific DNS records in the form of:
<StatefulSet>-<Ordinal>.<Service>.<Namespace>.svc.cluster.local
It creates a DNS name for each pod, e.g.:
[ root@curl:/ ]$ nslookup my-confluent-cp-kafka-headless
Server: 10.0.0.10
Address 1: 10.0.0.10 kube-dns.kube-system.svc.cluster.local
Name: my-confluent-cp-kafka-headless
Address 1: 10.8.0.23 my-confluent-cp-kafka-1.my-confluent-cp-kafka-headless.default.svc.cluster.local
Address 2: 10.8.1.21 my-confluent-cp-kafka-0.my-confluent-cp-kafka-headless.default.svc.cluster.local
Address 3: 10.8.3.7 my-confluent-cp-kafka-2.my-confluent-cp-kafka-headless.default.svc.cluster.local
This is what lets these services connect to each other inside the cluster.
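For example, a client pod inside the cluster can point straight at those per-broker records as bootstrap servers (a sketch using the names from the nslookup above; the release name, namespace and topic are placeholders for your own deployment):
# consume from the first broker via its headless-service DNS record, from inside the cluster
$ kafka-console-consumer --bootstrap-server my-confluent-cp-kafka-0.my-confluent-cp-kafka-headless.default.svc.cluster.local:9092 --topic demo-topic --from-beginning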
I've gone through a lot of trial and error until I realized how it was supposed to be working. Based on your TCP Nginx ConfigMap, I believe you are facing the same issue.
The Nginx ConfigMap asks for: <PortToExpose>: "<Namespace>/<Service>:<InternallyExposedPort>".
I realized that you don't need to expose Zookeeper, since it's an internal service handled by the Kafka brokers.
I also realized that you are trying to expose cp-kafka:9092, which is the headless service, also only used internally, as I explained above.
In order to get outside access you have to set the parameter nodeport.enabled to true, as stated here: External Access Parameters.
It adds one service for each kafka-N pod during chart deployment.
Then you change your configmap to map to one of them:
data:
"31090": default/demo-cp-kafka-0-nodeport:31090
Note that the service created has the selector statefulset.kubernetes.io/pod-name: demo-cp-kafka-0; this is how the service identifies the pod it is intended to connect to.
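For reference, the per-broker service that the chart generates looks roughly like this (a sketch modeled on the Zookeeper service shown further down; the exact labels and port names may differ between chart versions):
apiVersion: v1
kind: Service
metadata:
  labels:
    app: cp-kafka
    pod: demo-cp-kafka-0
  name: demo-cp-kafka-0-nodeport
  namespace: default
spec:
  type: NodePort
  externalTrafficPolicy: Cluster
  ports:
  # servicePort 19092 from values.yaml, exposed on nodePort 31090 (firstListenerPort)
  - name: external-broker
    port: 19092
    targetPort: 31090
    nodePort: 31090
    protocol: TCP
  selector:
    app: cp-kafka
    # pins the service to the first broker pod of the StatefulSet
    statefulset.kubernetes.io/pod-name: demo-cp-kafka-0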
Edit the nginx-ingress-controller:
- containerPort: 31090
hostPort: 31090
protocol: TCP
Set your kafka tools to <Cluster_External_IP>:31090
Reproduction:
Snippet edited in cp-kafka/values.yaml:
nodeport:
enabled: true
servicePort: 19092
firstListenerPort: 31090
Deploy the chart:
$ helm install demo cp-helm-charts
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
demo-cp-control-center-6d79ddd776-ktggw 1/1 Running 3 113s
demo-cp-kafka-0 2/2 Running 1 113s
demo-cp-kafka-1 2/2 Running 0 94s
demo-cp-kafka-2 2/2 Running 0 84s
demo-cp-kafka-connect-79689c5c6c-947c4 2/2 Running 2 113s
demo-cp-kafka-rest-56dfdd8d94-79kpx 2/2 Running 1 113s
demo-cp-ksql-server-c498c9755-jc6bt 2/2 Running 2 113s
demo-cp-schema-registry-5f45c498c4-dh965 2/2 Running 3 113s
demo-cp-zookeeper-0 2/2 Running 0 112s
demo-cp-zookeeper-1 2/2 Running 0 93s
demo-cp-zookeeper-2 2/2 Running 0 74s
$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
demo-cp-control-center ClusterIP 10.0.13.134 <none> 9021/TCP 50m
demo-cp-kafka ClusterIP 10.0.15.71 <none> 9092/TCP 50m
demo-cp-kafka-0-nodeport NodePort 10.0.7.101 <none> 19092:31090/TCP 50m
demo-cp-kafka-1-nodeport NodePort 10.0.4.234 <none> 19092:31091/TCP 50m
demo-cp-kafka-2-nodeport NodePort 10.0.3.194 <none> 19092:31092/TCP 50m
demo-cp-kafka-connect ClusterIP 10.0.3.217 <none> 8083/TCP 50m
demo-cp-kafka-headless ClusterIP None <none> 9092/TCP 50m
demo-cp-kafka-rest ClusterIP 10.0.14.27 <none> 8082/TCP 50m
demo-cp-ksql-server ClusterIP 10.0.7.150 <none> 8088/TCP 50m
demo-cp-schema-registry ClusterIP 10.0.7.84 <none> 8081/TCP 50m
demo-cp-zookeeper ClusterIP 10.0.9.119 <none> 2181/TCP 50m
demo-cp-zookeeper-headless ClusterIP None <none> 2888/TCP,3888/TCP 50m
Create the TCP configmap:
$ cat nginx-tcp-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: tcp-services
namespace: kube-system
data:
"31090": "default/demo-cp-kafka-0-nodeport:31090"
$ kubectl apply -f nginx-tcp-configmap.yaml
configmap/tcp-services created
Edit the Nginx Ingress Controller:
$ kubectl edit deploy nginx-ingress-controller -n kube-system
$ kubectl get deploy nginx-ingress-controller -n kube-system -o yaml
{{{suppressed output}}}
ports:
- containerPort: 31090
hostPort: 31090
protocol: TCP
- containerPort: 80
name: http
protocol: TCP
- containerPort: 443
name: https
protocol: TCP
My ingress is on IP 35.226.189.123. Now let's try to connect from outside the cluster. For that I'll connect to another VM where I have a minikube, so I can use a kafka-client pod to test:
user@minikube:~$ kubectl get pods
NAME READY STATUS RESTARTS AGE
kafka-client 1/1 Running 0 17h
user@minikube:~$ kubectl exec kafka-client -it -- bin/bash
root@kafka-client:/# kafka-console-consumer --bootstrap-server 35.226.189.123:31090 --topic demo-topic --from-beginning --timeout-ms 8000 --max-messages 1
Wed Apr 15 18:19:48 UTC 2020
Processed a total of 1 messages
root@kafka-client:/#
As you can see, I was able to access Kafka from outside the cluster.
If you need external access to Zookeeper as well, I'll leave a service model for you:
zookeeper-external-0.yaml
apiVersion: v1
kind: Service
metadata:
labels:
app: cp-zookeeper
pod: demo-cp-zookeeper-0
name: demo-cp-zookeeper-0-nodeport
namespace: default
spec:
externalTrafficPolicy: Cluster
ports:
- name: external-broker
nodePort: 31181
port: 12181
protocol: TCP
targetPort: 31181
selector:
app: cp-zookeeper
statefulset.kubernetes.io/pod-name: demo-cp-zookeeper-0
sessionAffinity: None
type: NodePort
It will create a service for it:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
demo-cp-zookeeper-0-nodeport NodePort 10.0.5.67 <none> 12181:31181/TCP 2s
Patch your configmap:
data:
"31090": default/demo-cp-kafka-0-nodeport:31090
"31181": default/demo-cp-zookeeper-0-nodeport:31181
Add the Ingress rule:
ports:
- containerPort: 31181
hostPort: 31181
protocol: TCP
Test it with your external IP:
pod/zookeeper-client created
user@minikube:~$ kubectl exec -it zookeeper-client -- /bin/bash
root@zookeeper-client:/# zookeeper-shell 35.226.189.123:31181
Connecting to 35.226.189.123:31181
Welcome to ZooKeeper!
JLine support is disabled
If you have any doubts, let me know in the comments!

Kubernetes ExternalName service not visible in DNS

I'm trying to expose a single database instance as a service in two Kubernetes namespaces. Kubernetes version 1.11.3 running on Ubuntu 16.04.1. The database service is visible and working in the default namespace. I created an ExternalName service in a non-default namespace referencing the fully qualified domain name in the default namespace as follows:
kind: Service
apiVersion: v1
metadata:
name: ws-mysql
namespace: wittlesouth
spec:
type: ExternalName
externalName: mysql.default.svc.cluster.local
ports:
- port: 3306
The service is running:
eric$ kubectl describe service ws-mysql --namespace=wittlesouth
Name: ws-mysql
Namespace: wittlesouth
Labels: <none>
Annotations: <none>
Selector: <none>
Type: ExternalName
IP:
External Name: mysql.default.svc.cluster.local
Port: <unset> 3306/TCP
TargetPort: 3306/TCP
Endpoints: <none>
Session Affinity: None
Events: <none>
If I check whether the service can be found by name from a pod running in the wittlesouth namespace, this service name does not resolve, but other services in that namespace (e.g. Jira) do:
root@rs-ws-diags-8mgqq:/# nslookup mysql.default.svc.cluster.local
Server: 10.96.0.10
Address: 10.96.0.10#53
Name: mysql.default.svc.cluster.local
Address: 10.99.120.208
root@rs-ws-diags-8mgqq:/# nslookup ws-mysql.wittlesouth
Server: 10.96.0.10
Address: 10.96.0.10#53
*** Can't find ws-mysql.wittlesouth: No answer
root@rs-ws-diags-8mgqq:/# nslookup ws-mysql
Server: 10.96.0.10
Address: 10.96.0.10#53
*** Can't find ws-mysql: No answer
root@rs-ws-diags-8mgqq:/# nslookup ws-mysql.wittlesouth
Server: 10.96.0.10
Address: 10.96.0.10#53
*** Can't find ws-mysql.wittlesouth: No answer
root@rs-ws-diags-8mgqq:/# nslookup ws-mysql.wittlesouth.svc.cluster.local
Server: 10.96.0.10
Address: 10.96.0.10#53
*** Can't find ws-mysql.wittlesouth.svc.cluster.local: No answer
root@rs-ws-diags-8mgqq:/# nslookup ws-mysql.wittlesouth
Server: 10.96.0.10
Address: 10.96.0.10#53
*** Can't find ws-mysql.wittlesouth: No answer
root@rs-ws-diags-8mgqq:/# nslookup jira.wittlesouth
Server: 10.96.0.10
Address: 10.96.0.10#53
Name: jira.wittlesouth.svc.cluster.local
Address: 10.105.30.239
Any thoughts on what might be the issue here? For the moment I've worked around it by updating the applications that need the database to reference the fully qualified domain name of the service running in the default namespace, but I'd prefer to avoid that. My intent is eventually to have separate database instances per namespace, and I'd like to deploy apps configured to work that way now, in advance of actually standing up the second instance.
This doesn't work for me with Kubernetes 1.11.2 with coredns and calico. It only works if you reference the external service directly in whichever namespace it runs in:
$ kubectl get pods -n default
NAME READY STATUS RESTARTS AGE
mysql-0 2/2 Running 0 17m
mysql-1 2/2 Running 0 16m
$ kubectl get pods -n wittlesouth
NAME READY STATUS RESTARTS AGE
ricos-dummy-pod 1/1 Running 0 14s
$ kubectl exec -it ricos-dummy-pod -n wittlesouth bash
root@ricos-dummy-pod:/# ping mysql.default.svc.cluster.local
PING mysql.default.svc.cluster.local (192.168.1.40): 56 data bytes
64 bytes from 192.168.1.40: icmp_seq=0 ttl=62 time=0.578 ms
64 bytes from 192.168.1.40: icmp_seq=1 ttl=62 time=0.632 ms
64 bytes from 192.168.1.40: icmp_seq=2 ttl=62 time=0.628 ms
^C--- mysql.default.svc.cluster.local ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.578/0.613/0.632/0.025 ms
root@ricos-dummy-pod:/# ping ws-mysql
ping: unknown host
root@ricos-dummy-pod:/# exit
$ kubectl get svc mysql
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
mysql ClusterIP None <none> 3306/TCP 45d
$ kubectl describe svc mysql
Name: mysql
Namespace: default
Labels: app=mysql
Annotations: <none>
Selector: app=mysql
Type: ClusterIP
IP: None
Port: mysql 3306/TCP
TargetPort: 3306/TCP
Endpoints: 192.168.1.40:3306,192.168.2.25:3306
Session Affinity: None
Events: <none>
The ExternalName service feature is only supported using kube-dns, as per the docs, and Kubernetes 1.11.x defaults to CoreDNS. You might want to try changing from CoreDNS to kube-dns, or possibly changing the config of your CoreDNS deployment. I expect this to be available at some point using CoreDNS.
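To confirm which DNS add-on a cluster is actually running, one quick check (a minimal sketch; deployment names vary between distributions, and CoreDNS usually keeps the k8s-app=kube-dns label for compatibility):
# list the DNS deployment(s) in kube-system
$ kubectl get deployments -n kube-system -l k8s-app=kube-dns
# or just look for coredns/kube-dns pods by name
$ kubectl get pods -n kube-system -o name | grep -E 'coredns|kube-dns'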