Kubernetes metrics-server FailedDiscoveryCheck - kubernetes

I was hoping to get a little help; my Google-fu didn't get me much closer. I'm trying to install the metrics server on my four-node Fedora CoreOS Kubernetes cluster like so:
kubectl apply -f deploy/kubernetes/
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
serviceaccount/metrics-server created
deployment.apps/metrics-server created
service/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
The service never seems to become available:
kubectl describe apiservice v1beta1.metrics.k8s.io
Name: v1beta1.metrics.k8s.io
Namespace:
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"apiregistration.k8s.io/v1beta1","kind":"APIService","metadata":{"annotations":{},"name":"v1beta1.metrics.k8s.io"},"spec":{"...
API Version: apiregistration.k8s.io/v1
Kind: APIService
Metadata:
Creation Timestamp: 2020-03-04T16:53:33Z
Resource Version: 1611816
Self Link: /apis/apiregistration.k8s.io/v1/apiservices/v1beta1.metrics.k8s.io
UID: 65d9a56a-c548-4d7e-a647-8ce7a865a266
Spec:
Group: metrics.k8s.io
Group Priority Minimum: 100
Insecure Skip TLS Verify: true
Service:
Name: metrics-server
Namespace: kube-system
Port: 443
Version: v1beta1
Version Priority: 100
Status:
Conditions:
Last Transition Time: 2020-03-04T16:53:33Z
Message: failing or missing response from https://10.3.230.59:443/apis/metrics.k8s.io/v1beta1: bad status from https://10.3.230.59:443/apis/metrics.k8s.io/v1beta1: 403
Reason: FailedDiscoveryCheck
Status: False
Type: Available
Events: <none>
Diagnostics I have gathered while googling around:
kubectl get deploy,svc -n kube-system |egrep metrics-server
deployment.apps/metrics-server 1/1 1 1 8m7s
service/metrics-server ClusterIP 10.3.230.59 <none> 443/TCP 8m7s
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"
Error from server (ServiceUnavailable): the server is currently unable to handle the request
kubectl get all --all-namespaces | grep -i metrics-server
kube-system pod/metrics-server-75b5d446cd-zj4jm 1/1 Running 0 9m11s
kube-system service/metrics-server ClusterIP 10.3.230.59 <none> 443/TCP 9m11s
kube-system deployment.apps/metrics-server 1/1 1 1 9m11s
kube-system replicaset.apps/metrics-server-75b5d446cd 1 1 1 9m11s
kubectl logs -f metrics-server-75b5d446cd-zj4jm -n kube-system
I0304 16:53:36.475657 1 serving.go:312] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
W0304 16:53:38.229267 1 authentication.go:296] Cluster doesn't provide requestheader-client-ca-file in configmap/extension-apiserver-authentication in kube-system, so request-header client certificate authentication won't work.
I0304 16:53:38.267760 1 secure_serving.go:116] Serving securely on [::]:4443
kubectl get -n kube-system deployment metrics-server -o yaml | grep -i args -A 10
{"apiVersion":"apps/v1","kind":"Deployment","metadata":{"annotations":{},"labels":{"k8s-app":"metrics-server"},"name":"metrics-server","namespace":"kube-system"},"spec":{"selector":{"matchLabels":{"k8s-app":"metrics-server"}},"template":{"metadata":{"labels":{"k8s-app":"metrics-server"},"name":"metrics-server"},"spec":{"containers":[{"args":["--cert-dir=/tmp","--secure-port=4443","--kubelet-insecure-tls","--kubelet-preferred-address-types=InternalIP"],"image":"k8s.gcr.io/metrics-server-amd64:v0.3.6","imagePullPolicy":"IfNotPresent","name":"metrics-server","ports":[{"containerPort":4443,"name":"main-port","protocol":"TCP"}],"securityContext":{"readOnlyRootFilesystem":true,"runAsNonRoot":true,"runAsUser":1000},"volumeMounts":[{"mountPath":"/tmp","name":"tmp-dir"}]}],"nodeSelector":{"beta.kubernetes.io/os":"linux","kubernetes.io/arch":"amd64"},"serviceAccountName":"metrics-server","volumes":[{"emptyDir":{},"name":"tmp-dir"}]}}}}
creationTimestamp: "2020-03-04T16:53:33Z"
generation: 1
labels:
k8s-app: metrics-server
name: metrics-server
namespace: kube-system
resourceVersion: "1611810"
selfLink: /apis/apps/v1/namespaces/kube-system/deployments/metrics-server
uid: 006e758e-bd33-47d7-8378-d3a8081ee8a8
spec:
--
- args:
- --cert-dir=/tmp
- --secure-port=4443
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP
image: k8s.gcr.io/metrics-server-amd64:v0.3.6
imagePullPolicy: IfNotPresent
name: metrics-server
ports:
- containerPort: 4443
name: main-port
Finally, my deployment config:
spec:
selector:
matchLabels:
k8s-app: metrics-server
template:
metadata:
name: metrics-server
labels:
k8s-app: metrics-server
spec:
serviceAccountName: metrics-server
volumes:
# mount in tmp so we can safely use from-scratch images and/or read-only containers
- name: tmp-dir
emptyDir: {}
containers:
- name: metrics-server
image: k8s.gcr.io/metrics-server-amd64:v0.3.6
command:
- /metrics-server
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP
args:
- --cert-dir=/tmp
- --secure-port=4443
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP
ports:
- name: main-port
containerPort: 4443
protocol: TCP
securityContext:
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 1000
imagePullPolicy: IfNotPresent
volumeMounts:
- name: tmp-dir
mountPath: /tmp
hostNetwork: true
nodeSelector:
beta.kubernetes.io/os: linux
kubernetes.io/arch: "amd64"
I'm at a loss as to what it could be. All I want is to get the metrics service started so a basic kubectl top node displays some info, but all I get is:
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods.metrics.k8s.io)
I have searched the internet and tried adding the args: and command: lines below, but no luck:
command:
- /metrics-server
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP
args:
- --cert-dir=/tmp
- --secure-port=4443
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP
Can anyone shed light on how to fix this? Thanks
Pastebin log file: Log File

I've reproduced your issue. I have used Calico as CNI.
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
fedora-master Ready master 6m27s v1.17.3
fedora-worker-1 Ready <none> 4m48s v1.17.3
fedora-worker-2 Ready <none> 4m46s v1.17.3
fedora-master:~/metrics-server$ kubectl describe apiservice v1beta1.metrics.k8s.io
Status:
Conditions:
Last Transition Time: 2020-03-12T16:04:59Z
Message: failing or missing response from https://10.99.122.196:443/apis/metrics.k8s.io/v
1beta1: Get https://10.99.122.196:443/apis/metrics.k8s.io/v1beta1: net/http: request canceled while waiting
for connection (Client.Timeout exceeded while awaiting headers)
fedora-master:~/metrics-server$ kubectl top pod
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods.metrics.k8s.io)
When you have only one node in the cluster, the default settings in the metrics-server repo work correctly. The issue occurs when you have more than two nodes; I used one master and two workers to reproduce it. Below is an example deployment that works correctly (it has all the required args). First remove your current metrics-server YAMLs (kubectl delete -f deploy/kubernetes), then execute:
$ git clone https://github.com/kubernetes-sigs/metrics-server
$ cd metrics-server/deploy/kubernetes/
$ vi metrics-server-deployment.yaml
Paste below YAML:
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: metrics-server
namespace: kube-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: metrics-server
namespace: kube-system
labels:
k8s-app: metrics-server
spec:
selector:
matchLabels:
k8s-app: metrics-server
template:
metadata:
name: metrics-server
labels:
k8s-app: metrics-server
spec:
serviceAccountName: metrics-server
volumes:
# mount in tmp so we can safely use from-scratch images and/or read-only containers
- name: tmp-dir
emptyDir: {}
hostNetwork: true
containers:
- name: metrics-server
image: k8s.gcr.io/metrics-server-amd64:v0.3.6
imagePullPolicy: IfNotPresent
args:
- /metrics-server
- --kubelet-preferred-address-types=InternalIP
- --kubelet-insecure-tls
- --cert-dir=/tmp
- --secure-port=4443
ports:
- name: main-port
containerPort: 4443
protocol: TCP
securityContext:
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 1000
volumeMounts:
- name: tmp-dir
mountPath: /tmp
nodeSelector:
kubernetes.io/os: linux
kubernetes.io/arch: "amd64"
Save and quit using :wq.
$ cd ~/metrics-server
$ kubectl apply -f deploy/kubernetes/
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
serviceaccount/metrics-server created
deployment.apps/metrics-server created
service/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
Wait a while for metrics-server to gather a few metrics from nodes.
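If you want to watch it come up in the meantime, one option (just a sketch using standard kubectl commands) is:
$ kubectl -n kube-system rollout status deployment/metrics-server
$ kubectl get apiservice v1beta1.metrics.k8s.io -w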
$ kubectl describe apiservice v1beta1.metrics.k8s.io
Name: v1beta1.metrics.k8s.io
Namespace:
...
Metadata:
Creation Timestamp: 2020-03-12T16:57:58Z
...
Spec:
Group: metrics.k8s.io
Group Priority Minimum: 100
Insecure Skip TLS Verify: true
Service:
Name: metrics-server
Namespace: kube-system
Port: 443
Version: v1beta1
Version Priority: 100
Status:
Conditions:
Last Transition Time: 2020-03-12T16:58:01Z
Message: all checks passed
Reason: Passed
Status: True
Type: Available
Events: <none>
After a few minutes you can use top:
$ kubectl top nodes
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
fedora-master 188m 9% 1315Mi 17%
fedora-worker-1 109m 5% 982Mi 13%
fedora-worker-2 84m 4% 969Mi 13%
If you still encounter issues, please add --v=6 to the deployment (as in the snippet below) and provide the logs from the metrics-server pod.
containers:
- name: metrics-server
image: k8s.gcr.io/metrics-server-amd64:v0.3.1
args:
- /metrics-server
- --v=6
- --kubelet-preferred-address-types=InternalIP
- --kubelet-insecure-tls
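To collect those logs, something like the following should work (a sketch; the k8s-app=metrics-server label selector matches the label used by the deployment above):
$ kubectl -n kube-system logs -l k8s-app=metrics-server --tail=100
$ kubectl -n kube-system logs deployment/metrics-server -f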

You need to check the logs of the calico-node pods carefully. In my case I have some other network interfaces, and Calico's autodetection mechanism was picking the wrong interface (IP address). Consult this documentation: https://projectcalico.docs.tigera.io/reference/node/configuration.
What I did in my case was simply:
kubectl set env daemonset/calico-node -n kube-system IP_AUTODETECTION_METHOD=cidr=172.16.8.0/24
cidr selects my "working network". After this, all calico-node pods restarted and suddenly everything was fine.
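For reference, Calico also supports other autodetection methods. For example, if you know the interface name on your nodes (eth0 below is only an assumption; adjust to your environment), you can select it by interface instead of CIDR, then check what the calico-node pods log afterwards:
$ kubectl set env daemonset/calico-node -n kube-system IP_AUTODETECTION_METHOD=interface=eth0
$ kubectl logs -n kube-system -l k8s-app=calico-node -c calico-node --tail=50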

Related

Some pods not resolving via CoreDNS in k3s

I am running a k3s cluster on RPis with Ubuntu Server 20.04 and have a drive mounted on the master node to serve as shared NFS storage.
For some reason I can only access the server using the ClusterIP of the service, but not the name. CoreDNS is running and its logs suggest that it's working fine, but it doesn't seem to be used by the test pod that I try to mount the NFS share onto.
There is a similar question for the Raspbian case, but I am running Ubuntu, and I'm not sure whether switching to legacy iptables as suggested there will work or break things further.
Any help is appreciated! Details below.
NFS server manifests:
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: nfs-server-pod
namespace: storage
labels:
nfs-role: server
spec:
replicas: 1
strategy:
type: Recreate
selector:
matchLabels:
nfs-role: server
template:
metadata:
labels:
nfs-role: server
spec:
nodeSelector:
storage: enabled
containers:
- name: nfs-server-container
image: ghcr.io/two-tauers/nfs-server:0.1.14
securityContext:
privileged: true
args:
- /storage
volumeMounts:
- name: storage-mount
mountPath: /storage
volumes:
- name: storage-mount
hostPath:
path: /storage
---
kind: Service
apiVersion: v1
metadata:
name: nfs-server
namespace: storage
spec:
selector:
nfs-role: server
type: ClusterIP
ports:
- name: tcp-2049
port: 2049
protocol: TCP
- name: udp-111
port: 111
❯ kubectl logs pod/nfs-server-pod-ccd9c4877-lgd4x -n storage
* Exporting directories for NFS kernel daemon...
...done.
* Starting NFS kernel daemon
...done.
Setting up watches.
Watches established.
❯ kubectl get service/nfs-server -n storage
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
nfs-server ClusterIP 10.43.223.27 <none> 2049/TCP,111/TCP 102m
Test pod, not in a running state:
---
kind: Pod
apiVersion: v1
metadata:
name: test-nfs
namespace: storage
spec:
volumes:
- name: nfs-volume
nfs:
server: nfs-server # doesn't work
### server: nfs-server.storage.svc.cluster.local # doesn't work
### server: 10.43.223.27 # works
path: /
containers:
- name: test
image: alpine
volumeMounts:
- name: nfs-volume
mountPath: /var/nfs
command: ["/bin/sh"]
args: ["-c", "while true; do date >> /var/nfs/dates.txt; sleep 5; done"]
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 51m default-scheduler Successfully assigned storage/test-nfs to sauron
Warning FailedMount 9m26s (x14 over 48m) kubelet MountVolume.SetUp failed for volume "nfs-volume" : mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t nfs nfs-server:/ /var/lib/kubelet/pods/d1523d5f-60ec-4990-ae61-369a26b7a6f4/volumes/kubernetes.io~nfs/nfs-volume
Output: mount.nfs: Connection timed out
Warning FailedMount 8m12s (x15 over 49m) kubelet Unable to attach or mount volumes: unmounted volumes=[nfs-volume], unattached volumes=[nfs-volume kube-api-access-pc9bk]: timed out waiting for the condition
Warning FailedMount 3m44s (x5 over 46m) kubelet Unable to attach or mount volumes: unmounted volumes=[nfs-volume], unattached volumes=[kube-api-access-pc9bk nfs-volume]: timed out waiting for the condition
At the same time, nslookup works and coredns does produce a lookup log:
---
apiVersion: v1
kind: Pod
metadata:
name: dnsutils
namespace: storage
spec:
containers:
- name: dnsutils
image: k8s.gcr.io/e2e-test-images/jessie-dnsutils:1.3
command:
- sleep
- "3600"
imagePullPolicy: IfNotPresent
restartPolicy: Always
❯ kubectl exec -i -t dnsutils -n storage -- nslookup nfs-server && kubectl logs pod/coredns-84c56f7bfb-66s84 -n kube-system | tail -1
Server: 10.43.0.10
Address: 10.43.0.10#53
Name: nfs-server.storage.svc.cluster.local
Address: 10.43.223.27
[INFO] 10.42.1.77:37937 - 21822 "A IN nfs-server.storage.svc.cluster.local. udp 54 false 512" NOERROR qr,aa,rd 106 0.000980753s
❯ kubectl exec -i -t dnsutils -n storage -- nslookup kubernetes.default.svc.cluster.local && kubectl logs pod/coredns-84c56f7bfb-66s84 -n kube-system | tail -1
Server: 10.43.0.10
Address: 10.43.0.10#53
Name: kubernetes.default.svc.cluster.local
Address: 10.43.0.1
[INFO] 10.42.1.77:59077 - 58999 "A IN kubernetes.default.svc.cluster.local. udp 54 false 512" NOERROR qr,aa,rd 106 0.000496515s

Error: failed to prepare subPath for volumeMount "postgres-storage" of container "postgres"

I am trying to use persistent volume claims and am facing this issue.
This is my postgres-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: postgres-deployment
spec:
replicas: 1
selector:
matchLabels:
component: postgres
template:
metadata:
labels:
component: postgres
spec:
volumes:
- name: postgres-storage
persistentVolumeClaim:
claimName: database-persistent-volume-claim
containers:
- name: postgres
image: postgres
ports:
- containerPort: 5432
volumeMounts:
- mountPath: /var/lib/postgresql/data
name: postgres-storage
subPath: postgres
When I debug the pod using describe:
kubectl describe pod postgres-deployment-8576df7bfc-8mp5t
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 3m4s default-scheduler Successfully assigned default/postgres-deployment-8576df7bfc-8mp5t to docker-desktop
Normal Pulled 67s (x8 over 2m58s) kubelet, docker-desktop Successfully pulled image "postgres"
Warning Failed 67s (x8 over 2m58s) kubelet, docker-desktop Error: failed to prepare subPath for volumeMount "postgres-storage" of container "postgres"
Normal Pulling 53s (x9 over 3m3s) kubelet, docker-desktop Pulling image "postgres"
My pod is showing me this error
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
postgres-deployment-8576df7bfc-8mp5t 0/1 CreateContainerConfigError 0 5m5
I am not sure where the problem is in the config, but I am sure it is related to volumes, because the problem appeared after I added the volumes.
Remove the subPath. Can you try the YAML below?
apiVersion: apps/v1
kind: Deployment
metadata:
name: postgres-deployment
spec:
replicas: 1
selector:
matchLabels:
component: postgres
template:
metadata:
labels:
component: postgres
spec:
volumes:
- name: postgres-storage
persistentVolumeClaim:
claimName: database-persistent-volume-claim
containers:
- name: postgres
image: postgres
ports:
- containerPort: 5432
volumeMounts:
- mountPath: /var/lib/postgresql/data
name: postgres-storage
I just deployed it and it works:
master $ kubectl get deploy
NAME READY UP-TO-DATE AVAILABLE AGE
postgres-deployment 1/1 1 1 4m13s
master $ kubectl get po
NAME READY STATUS RESTARTS AGE
postgres-deployment-6b66bdd748-5q76h 1/1 Running 0 4m13s
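If it still fails after removing subPath, it may be worth confirming that the claim itself is bound and looking at the pod events again (a hedged check; database-persistent-volume-claim is the claimName from your manifest):
$ kubectl get pvc database-persistent-volume-claim
$ kubectl describe pod -l component=postgres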

Specifying a role in a kubernetes cronjob

We have a kubernetes cluster with istio proxy running.
At first I created a cronjob which reads from the database and updates a value if found. It worked fine.
Then it turned out we already had a service that does the database update, so I changed the database code into a service call.
conn := dial.Service("service:3550", grpc.WithInsecure())
client := protobuf.NewServiceClient(conn)
client.Update(ctx)
But Istio rejects the calls with an RBAC error. It just rejects them and doesn't say why.
Is it possible to add a role to a cronjob? How can we do that?
The mTLS meshpolicy is PERMISSIVE.
Kubernetes version is 1.17 and istio version is 1.3
API Version: authentication.istio.io/v1alpha1
Kind: MeshPolicy
Metadata:
Creation Timestamp: 2019-12-05T16:06:08Z
Generation: 1
Resource Version: 6578
Self Link: /apis/authentication.istio.io/v1alpha1/meshpolicies/default
UID: 25f36b0f-1779-11ea-be8c-42010a84006d
Spec:
Peers:
Mtls:
Mode: PERMISSIVE
The cronjob yaml
Name: cronjob
Namespace: serve
Labels: <none>
Annotations: <none>
Schedule: */30 * * * *
Concurrency Policy: Allow
Suspend: False
Successful Job History Limit: 1
Failed Job History Limit: 3
Pod Template:
Labels: <none>
Containers:
service:
Image: service:latest
Port: <none>
Host Port: <none>
Environment:
JOB_NAME: (v1:metadata.name)
Mounts: <none>
Volumes: <none>
Last Schedule Time: Tue, 17 Dec 2019 09:00:00 +0100
Active Jobs: <none>
Events:
Edit: I have turned off RBAC for my namespace in ClusterRbacConfig and now it works. So my conclusion is that cronjobs are affected by roles, and it should be possible to add a role and call the other services.
The cronjob needs proper permissions in order to run if RBAC is enabled.
One of the solutions in this case would be to add a ServiceAccount to the cronjob configuration file that has enough privileges to execute what it needs to.
Since you already have existing services in the namespace, you can check whether there is an existing ServiceAccount for that namespace by using:
$ kubectl get serviceaccounts -n serve
If there is an existing ServiceAccount, you can add it to your cronjob manifest YAML file.
Like in this example:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
name: adwords-api-scale-up-cron-job
spec:
schedule: "*/2 * * * *"
jobTemplate:
spec:
activeDeadlineSeconds: 100
template:
spec:
serviceAccountName: scheduled-autoscaler-service-account
containers:
- name: adwords-api-scale-up-container
image: bitnami/kubectl:1.15-debian-9
command:
- bash
args:
- "-xc"
- |
kubectl scale --replicas=2 --v=7 deployment/adwords-api-deployment
volumeMounts:
- name: kubectl-config
mountPath: /.kube/
readOnly: true
volumes:
- name: kubectl-config
hostPath:
path: $HOME/.kube # Replace $HOME with an evident path location
restartPolicy: OnFailure
Then, under Pod Template, the Service Account should be visible:
$ kubectl describe cronjob adwords-api-scale-up-cron-job
Name: adwords-api-scale-up-cron-job
Namespace: default
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"batch/v1beta1","kind":"CronJob","metadata":{"annotations":{},"name":"adwords-api-scale-up-cron-job","namespace":"default"},...
Schedule: */2 * * * *
Concurrency Policy: Allow
Suspend: False
Successful Job History Limit: 3
Failed Job History Limit: 1
Starting Deadline Seconds: <unset>
Selector: <unset>
Parallelism: <unset>
Completions: <unset>
Active Deadline Seconds: 100s
Pod Template:
Labels: <none>
Service Account: scheduled-autoscaler-service-account
Containers:
adwords-api-scale-up-container:
Image: bitnami/kubectl:1.15-debian-9
Port: <none>
Host Port: <none>
Command:
bash
Args:
-xc
kubectl scale --replicas=2 --v=7 deployment/adwords-api-deployment
Environment: <none>
Mounts:
/.kube/ from kubectl-config (ro)
Volumes:
kubectl-config:
Type: HostPath (bare host directory volume)
Path: $HOME/.kube
HostPathType:
Last Schedule Time: <unset>
Active Jobs: <none>
Events: <none>
In case of a custom RBAC configuration I suggest referring to the Kubernetes documentation.
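As a rough sketch only (not taken from your cluster), a minimal Role and RoleBinding for a hypothetical ServiceAccount named cronjob-sa in the serve namespace could look like this; adjust the resources and verbs to whatever the job actually needs, and reference it from the CronJob with serviceAccountName: cronjob-sa:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cronjob-sa
  namespace: serve
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: cronjob-role
  namespace: serve
rules:
- apiGroups: [""]
  resources: ["services", "endpoints"]
  verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: cronjob-rolebinding
  namespace: serve
subjects:
- kind: ServiceAccount
  name: cronjob-sa
  namespace: serve
roleRef:
  kind: Role
  name: cronjob-role
  apiGroup: rbac.authorization.k8s.io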
Hope this helps.

rabbitmq kubernetes with NFS mount

I tried to set up a RabbitMQ cluster in a Kubernetes environment that has NFS PVs, with the help of this tutorial. Unfortunately it seems like RabbitMQ wants to change the owner of /var/lib/rabbitmq, but when I have an NFS directory mounted there, I get an error:
$ kubectl logs rabbitmq-0 -f
chown: /var/lib/rabbitmq: Operation not permitted
chown: /var/lib/rabbitmq: Operation not permitted
I guess I have two options: fork RabbitMQ, remove the chown, and build my own images, or make Kubernetes/NFS play nicely. I would not like to maintain my own fork, and getting Kubernetes/NFS to work nicely does not sound like it should be my problem. Any other ideas?
This is what I did to reproduce the issue.
I installed a Kubernetes cluster using kubeadm on Red Hat 7; the cluster and node details are below.
ENVIRONMENT DETAILS:
[root@master tmp]# kubectl cluster-info
Kubernetes master is running at https://192.168.56.4:6443
KubeDNS is running at https://192.168.56.4:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
[root@master tmp]#
[root@master tmp]# kubectl get no
NAME STATUS ROLES AGE VERSION
master.k8s Ready master 8d v1.16.2
node1.k8s Ready <none> 7d22h v1.16.3
node2.k8s Ready <none> 7d21h v1.16.3
[root@master tmp]#
First I set up the NFS configuration by running the steps below on both the master and the worker nodes. Here the master node is the NFS server and both worker nodes are NFS clients.
NFS SETUP:
yum install nfs-utils nfs-utils-lib        # on nfs server and clients
yum install portmap                        # on nfs server and clients
mkdir /nfsroot                             # on nfs server
[root@master ~]# cat /etc/exports          # on nfs server
/nfsroot 192.168.56.5/255.255.255.0(rw,sync,no_root_squash)
/nfsroot 192.168.56.6/255.255.255.0(rw,sync,no_root_squash)
exportfs -r                                # on nfs server
service nfs start                          # on nfs server and clients
showmount -e                               # on nfs server and clients
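To sanity-check the export from a worker node before moving on (a quick sketch; 192.168.56.4 is the NFS server address used above):
showmount -e 192.168.56.4                                   # on a worker (nfs client)
mount -t nfs 192.168.56.4:/nfsroot /mnt && ls /mnt && umount /mnt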
Now the NFS setup is ready; next we will apply the RabbitMQ Kubernetes setup.
RABBITMQ K8S SETUP:
The first step is to create persistent volumes using the NFS mount which we created in the step above.
[root@master tmp]# cat /root/rabbitmq-pv.yaml
kind: PersistentVolume
apiVersion: v1
metadata:
name: rabbitmq-pv-1
spec:
accessModes:
- ReadWriteOnce
- ReadOnlyMany
nfs:
server: 192.168.56.4
path: /nfsroot
capacity:
storage: 1Mi
persistentVolumeReclaimPolicy: Recycle
---
kind: PersistentVolume
apiVersion: v1
metadata:
name: rabbitmq-pv-2
spec:
accessModes:
- ReadWriteOnce
- ReadOnlyMany
nfs:
server: 192.168.56.4
path: /nfsroot
capacity:
storage: 1Mi
persistentVolumeReclaimPolicy: Recycle
---
kind: PersistentVolume
apiVersion: v1
metadata:
name: rabbitmq-pv-3
spec:
accessModes:
- ReadWriteOnce
- ReadOnlyMany
nfs:
server: 192.168.56.4
path: /nfsroot
capacity:
storage: 1Mi
persistentVolumeReclaimPolicy: Recycle
---
kind: PersistentVolume
apiVersion: v1
metadata:
name: rabbitmq-pv-4
spec:
accessModes:
- ReadWriteOnce
- ReadOnlyMany
nfs:
server: 192.168.56.4
path: /nfsroot
capacity:
storage: 1Mi
persistentVolumeReclaimPolicy: Recycle
After applying the above manifest, the PVs were created as below:
[root@master ~]# kubectl apply -f rabbitmq-pv.yaml
persistentvolume/rabbitmq-pv-1 created
persistentvolume/rabbitmq-pv-2 created
persistentvolume/rabbitmq-pv-3 created
persistentvolume/rabbitmq-pv-4 created
[root@master ~]# kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
rabbitmq-pv-1 1Mi RWO,ROX Recycle Available 5s
rabbitmq-pv-2 1Mi RWO,ROX Recycle Available 5s
rabbitmq-pv-3 1Mi RWO,ROX Recycle Available 5s
rabbitmq-pv-4 1Mi RWO,ROX Recycle Available 5s
[root@master ~]#
There is no need to create a PersistentVolumeClaim, since that is taken care of automatically by the volumeClaimTemplates option when the StatefulSet manifest runs.
Now let's create the secret which you mentioned, as below:
[root@master tmp]# kubectl create secret generic rabbitmq-config --from-literal=erlang-cookie=c-is-for-cookie-thats-good-enough-for-me
secret/rabbitmq-config created
[root@master tmp]#
[root@master tmp]# kubectl get secrets
NAME TYPE DATA AGE
default-token-vjsmd kubernetes.io/service-account-token 3 8d
jp-token-cfdzx kubernetes.io/service-account-token 3 5d2h
rabbitmq-config Opaque 1 39m
[root@master tmp]#
Now let's submit your RabbitMQ manifest, with a few changes: replace all LoadBalancer service types with NodePort services, since we are not using any cloud-provider environment; rename the volumes to rabbitmq-pv, which we created in the PV step; and reduce the size from 1Gi to 1Mi, since this is just a test demo.
apiVersion: v1
kind: Service
metadata:
# Expose the management HTTP port on each node
name: rabbitmq-management
labels:
app: rabbitmq
spec:
ports:
- port: 15672
name: http
selector:
app: rabbitmq
sessionAffinity: ClientIP
type: NodePort
---
apiVersion: v1
kind: Service
metadata:
# The required headless service for StatefulSets
name: rabbitmq
labels:
app: rabbitmq
spec:
ports:
- port: 5672
name: amqp
- port: 4369
name: epmd
- port: 25672
name: rabbitmq-dist
clusterIP: None
selector:
app: rabbitmq
---
apiVersion: v1
kind: Service
metadata:
# The required headless service for StatefulSets
name: rabbitmq-cluster
labels:
app: rabbitmq
spec:
ports:
- port: 5672
name: amqp
- port: 4369
name: epmd
- port: 25672
name: rabbitmq-dist
type: NodePort
selector:
app: rabbitmq
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: rabbitmq
spec:
serviceName: "rabbitmq"
selector:
matchLabels:
app: rabbitmq
replicas: 4
template:
metadata:
labels:
app: rabbitmq
spec:
terminationGracePeriodSeconds: 10
containers:
- name: rabbitmq
image: rabbitmq:3.6.6-management-alpine
lifecycle:
postStart:
exec:
command:
- /bin/sh
- -c
- >
if [ -z "$(grep rabbitmq /etc/resolv.conf)" ]; then
sed "s/^search \([^ ]\+\)/search rabbitmq.\1 \1/" /etc/resolv.conf > /etc/resolv.conf.new;
cat /etc/resolv.conf.new > /etc/resolv.conf;
rm /etc/resolv.conf.new;
fi;
until rabbitmqctl node_health_check; do sleep 1; done;
if [[ "$HOSTNAME" != "rabbitmq-0" && -z "$(rabbitmqctl cluster_status | grep rabbitmq-0)" ]]; then
rabbitmqctl stop_app;
rabbitmqctl join_cluster rabbit@rabbitmq-0;
rabbitmqctl start_app;
fi;
rabbitmqctl set_policy ha-all "." '{"ha-mode":"exactly","ha-params":3,"ha-sync-mode":"automatic"}'
env:
- name: RABBITMQ_ERLANG_COOKIE
valueFrom:
secretKeyRef:
name: rabbitmq-config
key: erlang-cookie
ports:
- containerPort: 5672
name: amqp
- containerPort: 25672
name: rabbitmq-dist
volumeMounts:
- name: rabbitmq-pv
mountPath: /var/lib/rabbitmq
volumeClaimTemplates:
- metadata:
name: rabbitmq-pv
annotations:
volume.alpha.kubernetes.io/storage-class: default
spec:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: 1Mi # make this bigger in production
After submitting the pod manifest, I can see that the StatefulSet and pods are created:
[root@master tmp]# kubectl apply -f rabbitmq.yaml
service/rabbitmq-management created
service/rabbitmq created
service/rabbitmq-cluster created
statefulset.apps/rabbitmq created
[root@master tmp]#
NAME READY STATUS RESTARTS AGE
rabbitmq-0 1/1 Running 0 18m
rabbitmq-1 1/1 Running 0 17m
rabbitmq-2 1/1 Running 0 13m
rabbitmq-3 1/1 Running 0 13m
[root@master ~]# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
rabbitmq-pv-rabbitmq-0 Bound rabbitmq-pv-1 1Mi RWO,ROX 49m
rabbitmq-pv-rabbitmq-1 Bound rabbitmq-pv-3 1Mi RWO,ROX 48m
rabbitmq-pv-rabbitmq-2 Bound rabbitmq-pv-2 1Mi RWO,ROX 44m
rabbitmq-pv-rabbitmq-3 Bound rabbitmq-pv-4 1Mi RWO,ROX 43m
[root@master ~]# kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
rabbitmq ClusterIP None <none> 5672/TCP,4369/TCP,25672/TCP 49m
rabbitmq-cluster NodePort 10.102.250.172 <none> 5672:30574/TCP,4369:31757/TCP,25672:31854/TCP 49m
rabbitmq-management NodePort 10.108.131.46 <none> 15672:31716/TCP 49m
[root@master ~]#
Now I tried to hit the RabbitMQ management page through the NodePort service at http://192.168.56.6:31716 and was able to get the login page.
So please let me know if you still face the chown issue after trying the above, so that we can dig further by checking whether any PodSecurityPolicies are applied.
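For example, a quick check for PodSecurityPolicies and for the securityContext on the pod (just a sketch) would be:
[root@master ~]# kubectl get podsecuritypolicies
[root@master ~]# kubectl get pod rabbitmq-0 -o yaml | grep -A 5 securityContext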

No resources found when installing kubernetes dashboard

I am installing the Kubernetes dashboard using this command:
[root@iZuf63refzweg1d9dh94t8Z ~]# kubectl create -f kubernetes-dashboard.yaml
secret/kubernetes-dashboard-certs created
serviceaccount/kubernetes-dashboard created
role.rbac.authorization.k8s.io/kubernetes-dashboard-minimal created
rolebinding.rbac.authorization.k8s.io/kubernetes-dashboard-minimal created
deployment.apps/kubernetes-dashboard created
service/kubernetes-dashboard created
This is my Kubernetes YAML config:
kind: Deployment
apiVersion: apps/v1
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kube-system
spec:
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
k8s-app: kubernetes-dashboard
template:
metadata:
labels:
k8s-app: kubernetes-dashboard
spec:
containers:
- name: kubernetes-dashboard
image: registry.cn-beijing.aliyuncs.com/minminmsn/kubernetes-dashboard:v1.10.1
ports:
- containerPort: 8443
protocol: TCP
args:
- --auto-generate-certificates
# Uncomment the following line to manually specify Kubernetes API server Host
# If not specified, Dashboard will attempt to auto discover the API server and connect
# to it. Uncomment only if the default does not work.
# - --apiserver-host=http://my-address:port
volumeMounts:
- name: kubernetes-dashboard-certs
mountPath: /certs
# Create on-disk volume to store exec logs
- mountPath: /tmp
name: tmp-volume
livenessProbe:
httpGet:
scheme: HTTPS
path: /
port: 8443
initialDelaySeconds: 30
timeoutSeconds: 30
volumes:
- name: kubernetes-dashboard-certs
secret:
secretName: kubernetes-dashboard-certs
- name: tmp-volume
emptyDir: {}
serviceAccountName: kubernetes-dashboard
# Comment the following tolerations if Dashboard must not be deployed on master
tolerations:
- key: node-role.kubernetes.io/master
effect: NoSchedule
---
# ------------------- Dashboard Service ------------------- #
kind: Service
apiVersion: v1
metadata:
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kube-system
spec:
type: NodePort
ports:
- port: 443
targetPort: 8443
selector:
k8s-app: kubernetes-dashboard
I get the following result:
[root@iZuf63refzweg1d9dh94t8Z ~]# kubectl get svc -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kube-dns ClusterIP 10.43.0.10 <none> 53/UDP,53/TCP 102d
kubernetes-dashboard NodePort 10.254.180.117 <none> 443:31720/TCP 58s
metrics-server ClusterIP 10.43.96.112 <none> 443/TCP 102d
[root@iZuf63refzweg1d9dh94t8Z ~]# kubectl get pods -n kube-system
No resources found.
But when I check port 31720:
lsof -i:31720
the output is empty. Did the service deploy successfully? How can I check the deployment log? Why wasn't the port bound?
It is under its own namespace, "kubernetes-dashboard". So just use kubectl get all -n kubernetes-dashboard to see everything.
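Note that the manifest in your question deploys into kube-system, so if nothing shows up under kubernetes-dashboard, checking both namespaces by label is a reasonable fallback (the k8s-app=kubernetes-dashboard label comes from your Deployment above):
kubectl get all -n kubernetes-dashboard
kubectl get pods -n kube-system -l k8s-app=kubernetes-dashboard
kubectl describe deployment kubernetes-dashboard -n kube-system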