I'm new to Kubernetes and I'm trying to create a cluster. But after I configure the master with the kubeadm command, I see errors on some of the pods, and as a result the master is always in a NotReady state.
Everything seems to originate from the fact that kube-proxy cannot list the endpoints and the services... and for this reason (or so I understand) cannot update the iptables rules.
Here is my kubectl version:
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.1", GitCommit:"b7394102d6ef778017f2ca4046abbaa23b88c290", GitTreeState:"clean", BuildDate:"2019-04-08T17:11:31Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.1", GitCommit:"b7394102d6ef778017f2ca4046abbaa23b88c290", GitTreeState:"clean", BuildDate:"2019-04-08T17:02:58Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
And here are the logs from the kube-proxy pod:
$ kubectl logs -n kube-system kube-proxy-xjxck
W0430 12:33:28.887260 1 server_others.go:267] Flag proxy-mode="" unknown, assuming iptables proxy
W0430 12:33:28.913671 1 node.go:113] Failed to retrieve node info: Unauthorized
I0430 12:33:28.915780 1 server_others.go:147] Using iptables Proxier.
W0430 12:33:28.916065 1 proxier.go:314] invalid nodeIP, initializing kube-proxy with 127.0.0.1 as nodeIP
W0430 12:33:28.916089 1 proxier.go:319] clusterCIDR not specified, unable to distinguish between internal and external traffic
I0430 12:33:28.917555 1 server.go:555] Version: v1.14.1
I0430 12:33:28.959345 1 conntrack.go:52] Setting nf_conntrack_max to 131072
I0430 12:33:28.960392 1 config.go:202] Starting service config controller
I0430 12:33:28.960444 1 controller_utils.go:1027] Waiting for caches to sync for service config controller
I0430 12:33:28.960572 1 config.go:102] Starting endpoints config controller
I0430 12:33:28.960609 1 controller_utils.go:1027] Waiting for caches to sync for endpoints config controller
E0430 12:33:28.970720 1 event.go:191] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"fh-ubuntu01.159a40901fa85264", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"Node", Namespace:"", Name:"fh-ubuntu01", UID:"fh-ubuntu01", APIVersion:"", ResourceVersion:"", FieldPath:""}, Reason:"Starting", Message:"Starting kube-proxy.", Source:v1.EventSource{Component:"kube-proxy", Host:"fh-ubuntu01"}, FirstTimestamp:v1.Time{Time:time.Time{wall:0xbf2a2e0639406264, ext:334442672, loc:(*time.Location)(0x2703080)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xbf2a2e0639406264, ext:334442672, loc:(*time.Location)(0x2703080)}}, Count:1, Type:"Normal", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'Unauthorized' (will not retry!)
E0430 12:33:28.970939 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Unauthorized
E0430 12:33:28.971106 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Unauthorized
E0430 12:33:29.977038 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Unauthorized
E0430 12:33:29.979890 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Service: Unauthorized
E0430 12:33:30.980098 1 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1.Endpoints: Unauthorized
Now, I've created a new ClusterRoleBinding this way:
$ kubectl create clusterrolebinding kube-proxy-binding --clusterrole=system:node-proxier --user=system:kube-proxy
and if I describe the ClusterRole, I can see this:
$ kubectl describe clusterrole system:node-proxier
Name: system:node-proxier
Labels: kubernetes.io/bootstrapping=rbac-defaults
Annotations: rbac.authorization.kubernetes.io/autoupdate: true
PolicyRule:
Resources Non-Resource URLs Resource Names Verbs
--------- ----------------- -------------- -----
events [] [] [create patch update]
nodes [] [] [get]
endpoints [] [] [list watch]
services [] [] [list watch]
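As a sanity check, RBAC for a given user can also be evaluated directly with impersonation (note this only tests the user name, not necessarily the identity the pod actually authenticates as):
kubectl auth can-i list endpoints --as system:kube-proxy
kubectl auth can-i list services --as system:kube-proxy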
So the user "system:kube-proxy" should be able to list the endpoints and the services, right? Now, if I print the YAML of the kube-proxy ConfigMap, I get this:
$ kubectl get configmap kube-proxy -n kube-system -o yaml
apiVersion: v1
data:
config.conf: |-
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 0.0.0.0
clientConnection:
acceptContentTypes: ""
burst: 10
contentType: application/vnd.kubernetes.protobuf
kubeconfig: /var/lib/kube-proxy/kubeconfig.conf
qps: 5
clusterCIDR: ""
configSyncPeriod: 15m0s
conntrack:
max: null
maxPerCore: 32768
min: 131072
tcpCloseWaitTimeout: 1h0m0s
tcpEstablishedTimeout: 24h0m0s
enableProfiling: false
healthzBindAddress: 0.0.0.0:10256
hostnameOverride: ""
iptables:
masqueradeAll: false
masqueradeBit: 14
minSyncPeriod: 0s
syncPeriod: 30s
ipvs:
excludeCIDRs: null
minSyncPeriod: 0s
scheduler: ""
syncPeriod: 30s
kind: KubeProxyConfiguration
metricsBindAddress: 127.0.0.1:10249
mode: ""
nodePortAddresses: null
oomScoreAdj: -999
portRange: ""
resourceContainer: /kube-proxy
udpIdleTimeout: 250ms
winkernel:
enableDSR: false
networkName: ""
sourceVip: ""
kubeconfig.conf: |-
apiVersion: v1
kind: Config
clusters:
- cluster:
certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
server: https://10.0.1.1:6443
name: default
contexts:
- context:
cluster: default
namespace: default
user: default
name: default
current-context: default
users:
- name: default
user:
tokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
kind: ConfigMap
metadata:
creationTimestamp: "2019-03-21T10:34:03Z"
labels:
app: kube-proxy
name: kube-proxy
namespace: kube-system
resourceVersion: "4458115"
selfLink: /api/v1/namespaces/kube-system/configmaps/kube-proxy
uid: d8a454fb-4bc4-11e9-b0b4-00155d044109
I can see that "user: default", which confuses me... Which user is it trying to authenticate as? Is there an actual user named "default"?
Thank you very much!
Output from kubectl get po -n kube-system:
$ kubectl get po -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-fb8b8dccf-27qck 0/1 Pending 0 7d15h
coredns-fb8b8dccf-dd6bh 0/1 Pending 0 7d15h
kube-apiserver-fh-ubuntu01 1/1 Running 1 7d15h
kube-controller-manager-fh-ubuntu01 1/1 Running 0 7d15h
kube-proxy-xjxck 1/1 Running 0 43h
kube-scheduler-fh-ubuntu01 1/1 Running 1 7d15h
weave-net-psqh5 1/2 CrashLoopBackOff 2144 7d15h
Cluster health looks healthy:
$ kubectl get cs
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-2 Healthy {"health": "true"}
etcd-3 Healthy {"health": "true"}
etcd-0 Healthy {"health": "true"}
etcd-1 Healthy {"health": "true"}
Run the command below to check cluster health:
kubectl get cs
Then check the status of the control-plane services:
kubectl get po -n kube-system
The issue seems to be with the weave-net-psqh5 pod; find out why it keeps going into CrashLoopBackOff.
Please share the logs from the weave-net-psqh5 pod.
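Since the weave-net pod has two containers, the container name has to be given explicitly; the names below are the ones the Weave Net DaemonSet normally uses:
kubectl logs -n kube-system weave-net-psqh5 -c weave
kubectl logs -n kube-system weave-net-psqh5 -c weave-npc
kubectl describe pod -n kube-system weave-net-psqh5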
Related
I'm trying to expose a service in k8s.
When I run the kubectl expose command kubectl expose deployment demo-api-app -n default --type=NodePort --name=my-service, everything is OK.
I have endpoints and a NodePort assigned:
[root@mys-servver test_connect]# kubectl get endpoints my-service
NAME ENDPOINTS AGE
my-service 10.233.93.1:8081 4m31s
[root@ppfxasmsnd1005 test_connect]# kubectl get svc my-service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
my-service NodePort 10.233.11.26 <none> 8081:31594/TCP 5m12s
[root@ppfxasmsnd1005 test_connect]#
The NodePort should be listening on 31594, but nothing is listening (I tried the following command on all worker and control-plane nodes):
[root@ys-servver test_connect]# netstat -tulpn | grep 31594
[root@ys-servver test_connect]#
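(Since kube-proxy in iptables mode programs NodePorts as nat rules rather than listening sockets, a check of the iptables chains may be more telling; a sketch, run as root on a node:)
iptables -t nat -L KUBE-NODEPORTS -n | grep 31594
iptables-save | grep 31594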
I have nothing useful in the kube-proxy logs (I tried to increase the log verbosity, but it didn't help):
[root@ys-servver test_connect]# kubectl logs -n kube-system -l "k8s-app=kube-proxy"
I0216 08:15:40.569350 1 conntrack.go:52] "Setting nf_conntrack_max" nf_conntrack_max=524288
I0216 08:15:40.569546 1 config.go:317] "Starting service config controller"
I0216 08:15:40.569556 1 shared_informer.go:255] Waiting for caches to sync for service config
I0216 08:15:40.569599 1 config.go:226] "Starting endpoint slice config controller"
I0216 08:15:40.569614 1 shared_informer.go:255] Waiting for caches to sync for endpoint slice config
I0216 08:15:40.569614 1 config.go:444] "Starting node config controller"
I0216 08:15:40.569623 1 shared_informer.go:255] Waiting for caches to sync for node config
I0216 08:15:40.670408 1 shared_informer.go:262] Caches are synced for service config
I0216 08:15:40.670446 1 shared_informer.go:262] Caches are synced for endpoint slice config
I0216 08:15:40.670476 1 shared_informer.go:262] Caches are synced for node config
I0216 08:15:40.721944 1 conntrack.go:52] "Setting nf_conntrack_max" nf_conntrack_max=131072
I0216 08:15:40.722143 1 config.go:226] "Starting endpoint slice config controller"
...
I deployed with Kubespray, and kube-proxy uses iptables mode.
System: RedHat 8, SELinux disabled on all nodes.
What's wrong?
PS:
[root@servver test_connect]# kubectl version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.0", GitCommit:"ab69524f795c42094a6630298ff53f3c3ebab7f4", GitTreeState:"clean", BuildDate:"2021-12-07T18:16:20Z", GoVersion:"go1.17.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.5", GitCommit:"804d6167111f6858541cef440ccc53887fbbc96a", GitTreeState:"clean", BuildDate:"2022-12-08T10:08:09Z", GoVersion:"go1.19.4", Compiler:"gc", Platform:"linux/amd64"}
WARNING: version difference between client (1.23) and server (1.25) exceeds the supported minor version skew of +/-1
[root@servver test_connect]# kubectl cluster-info
Kubernetes control plane is running at https://127.0.0.1:6443
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
[root@servver test_connect]#
[root@servver test_connect]# sysctl -p
net.ipv4.ip_forward = 1
kernel.keys.root_maxbytes = 25000000
kernel.keys.root_maxkeys = 1000000
kernel.panic = 10
kernel.panic_on_oops = 1
vm.overcommit_memory = 1
vm.panic_on_oom = 0
net.ipv4.ip_local_reserved_ports = 30000-32767
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
[root@servver test_connect]#
I need the NodePort to be open on all nodes.
I am new to Kubernetes and was trying to apply horizontal pod autoscaling to my existing application. After following other Stack Overflow answers, I learned that I need to install metrics-server, which I did, but somehow it's not working and the server is unable to handle the request.
I then followed a few more suggestions but was unable to resolve the issue. I would really appreciate any help here.
Please let me know of any further details you need to help me :) Thanks in advance.
Steps followed:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
serviceaccount/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
service/metrics-server created
deployment.apps/metrics-server created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
kubectl get deploy,svc -n kube-system | egrep metrics-server
deployment.apps/metrics-server 1/1 1 1 2m6s
service/metrics-server ClusterIP 10.32.0.32 <none> 443/TCP 2m6s
kubectl get pods -n kube-system | grep metrics-server
metrics-server-64cf6869bd-6gx88 1/1 Running 0 2m39s
vi ana_hpa.yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
name: ana-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: StatefulSet
name: common-services-auth
minReplicas: 1
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 80
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 160
k apply -f ana_hpa.yaml
horizontalpodautoscaler.autoscaling/ana-hpa created
k get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
ana-hpa StatefulSet/common-services-auth <unknown>/160%, <unknown>/80% 1 10 0 4s
k describe hpa ana-hpa
Name: ana-hpa
Namespace: default
Labels: <none>
Annotations: <none>
CreationTimestamp: Tue, 12 Apr 2022 17:01:25 +0530
Reference: StatefulSet/common-services-auth
Metrics: ( current / target )
resource memory on pods (as a percentage of request): <unknown> / 160%
resource cpu on pods (as a percentage of request): <unknown> / 80%
Min replicas: 1
Max replicas: 10
StatefulSet pods: 3 current / 0 desired
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True SucceededGetScale the HPA controller was able to get the target's current scale
ScalingActive False FailedGetResourceMetric the HPA was unable to compute the replica count: failed to get memory utilization: unable to get metrics for resource memory: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedGetResourceMetric 38s (x8 over 2m23s) horizontal-pod-autoscaler failed to get cpu utilization: unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)
Warning FailedComputeMetricsReplicas 38s (x8 over 2m23s) horizontal-pod-autoscaler invalid metrics (2 invalid out of 2), first error is: failed to get memory utilization: unable to get metrics for resource memory: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)
Warning FailedGetResourceMetric 23s (x9 over 2m23s) horizontal-pod-autoscaler failed to get memory utilization: unable to get metrics for resource memory: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)
kubectl get --raw /apis/metrics.k8s.io/v1beta1
Error from server (ServiceUnavailable): the server is currently unable to handle the request
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"
Error from server (ServiceUnavailable): the server is currently unable to handle the request
kubectl edit deployments.apps -n kube-system metrics-server
Add hostNetwork: true
deployment.apps/metrics-server edited
kubectl get pods -n kube-system | grep metrics-server
metrics-server-5dc6dbdb8-42hw9 1/1 Running 0 10m
k describe pod metrics-server-5dc6dbdb8-42hw9 -n kube-system
Name: metrics-server-5dc6dbdb8-42hw9
Namespace: kube-system
Priority: 2000000000
Priority Class Name: system-cluster-critical
Node: pusntyn196.apac.avaya.com/10.133.85.196
Start Time: Tue, 12 Apr 2022 17:08:25 +0530
Labels: k8s-app=metrics-server
pod-template-hash=5dc6dbdb8
Annotations: <none>
Status: Running
IP: 10.133.85.196
IPs:
IP: 10.133.85.196
Controlled By: ReplicaSet/metrics-server-5dc6dbdb8
Containers:
metrics-server:
Container ID: containerd://024afb1998dce4c0bd5f4e58f996068ea37982bd501b54fda2ef8d5c1098b4f4
Image: k8s.gcr.io/metrics-server/metrics-server:v0.6.1
Image ID: k8s.gcr.io/metrics-server/metrics-server@sha256:5ddc6458eb95f5c70bd13fdab90cbd7d6ad1066e5b528ad1dcb28b76c5fb2f00
Port: 4443/TCP
Host Port: 4443/TCP
Args:
--cert-dir=/tmp
--secure-port=4443
--kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
--kubelet-use-node-status-port
--metric-resolution=15s
State: Running
Started: Tue, 12 Apr 2022 17:08:26 +0530
Ready: True
Restart Count: 0
Requests:
cpu: 100m
memory: 200Mi
Liveness: http-get https://:https/livez delay=0s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get https://:https/readyz delay=20s timeout=1s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/tmp from tmp-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-g6p4g (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
tmp-dir:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
kube-api-access-g6p4g:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 2s
node.kubernetes.io/unreachable:NoExecute op=Exists for 2s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m31s default-scheduler Successfully assigned kube-system/metrics-server-5dc6dbdb8-42hw9 to pusntyn196.apac.avaya.com
Normal Pulled 2m32s kubelet Container image "k8s.gcr.io/metrics-server/metrics-server:v0.6.1" already present on machine
Normal Created 2m31s kubelet Created container metrics-server
Normal Started 2m31s kubelet Started container metrics-server
kubectl get --raw /apis/metrics.k8s.io/v1beta1
Error from server (ServiceUnavailable): the server is currently unable to handle the request
kubectl get pods -n kube-system | grep metrics-server
metrics-server-5dc6dbdb8-42hw9 1/1 Running 0 10m
kubectl logs -f metrics-server-5dc6dbdb8-42hw9 -n kube-system
E0412 11:43:54.684784 1 configmap_cafile_content.go:242] kube-system/extension-apiserver-authentication failed with : missing content for CA bundle "client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
E0412 11:44:27.001010 1 configmap_cafile_content.go:242] key failed with : missing content for CA bundle "client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
k logs -f metrics-server-5dc6dbdb8-42hw9 -n kube-system
I0412 11:38:26.447305 1 serving.go:342] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
I0412 11:38:26.899459 1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0412 11:38:26.899477 1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I0412 11:38:26.899518 1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I0412 11:38:26.899545 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0412 11:38:26.899546 1 configmap_cafile_content.go:201] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I0412 11:38:26.899567 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0412 11:38:26.900480 1 dynamic_serving_content.go:131] "Starting controller" name="serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key"
I0412 11:38:26.900811 1 secure_serving.go:266] Serving securely on [::]:4443
I0412 11:38:26.900854 1 tlsconfig.go:240] "Starting DynamicServingCertificateController"
W0412 11:38:26.900965 1 shared_informer.go:372] The sharedIndexInformer has started, run more than once is not allowed
I0412 11:38:26.999960 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0412 11:38:26.999989 1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController
I0412 11:38:26.999970 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
E0412 11:38:27.000087 1 configmap_cafile_content.go:242] kube-system/extension-apiserver-authentication failed with : missing content for CA bundle "client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
E0412 11:38:27.000118 1 configmap_cafile_content.go:242] key failed with : missing content for CA bundle "client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
kubectl top nodes
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)
kubectl top pods
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods.metrics.k8s.io)
Edit the metrics-server deployment YAML
Add - --kubelet-insecure-tls
k apply -f metric-server-deployment.yaml
serviceaccount/metrics-server unchanged
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader unchanged
clusterrole.rbac.authorization.k8s.io/system:metrics-server unchanged
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader unchanged
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator unchanged
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server unchanged
service/metrics-server unchanged
deployment.apps/metrics-server configured
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io unchanged
kubectl get pods -n kube-system | grep metrics-server
metrics-server-5dc6dbdb8-42hw9 1/1 Running 0 10m
kubectl top pods
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods.metrics.k8s.io)
I also tried adding the following to the metrics-server deployment:
command:
- /metrics-server
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP
This can easily be resolved by editing the deployment YAML and adding hostNetwork: true after dnsPolicy: ClusterFirst.
kubectl edit deployments.apps -n kube-system metrics-server
insert:
hostNetwork: true
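The relevant part of the pod template then looks roughly like this (only the touched fields are shown; a sketch, not the full manifest):
spec:
  template:
    spec:
      dnsPolicy: ClusterFirst
      hostNetwork: true
      containers:
      - name: metrics-server
        ...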
I hope this helps somebody with a bare-metal cluster:
$ helm --repo https://kubernetes-sigs.github.io/metrics-server/ --kubeconfig=$HOME/.kube/loc-cluster.config -n kube-system --set args='{--kubelet-insecure-tls}' upgrade --install metrics-server metrics-server
$ helm --kubeconfig=$HOME/.kube/loc-cluster.config -n kube-system uninstall metrics-server
Update: I deployed the metrics-server using the same command. Perhaps you can start fresh by removing existing resources and running:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
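To remove the existing resources first, the delete can reference the same manifest, for example:
$ kubectl delete -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml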
=======================================================================
It appears the --kubelet-insecure-tls flag was not configured correctly for the pod template in the deployment. The following should fix this:
Edit the existing deployment in the cluster with kubectl edit deployment/metrics-server -nkube-system.
Add the flag to the spec.containers[].args list, so that the deployment looks like this:
...
spec:
containers:
- args:
- --cert-dir=/tmp
- --secure-port=4443
- --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
- --kubelet-use-node-status-port
- --metric-resolution=15s
- --kubelet-insecure-tls <=======ADD IT HERE.
image: k8s.gcr.io/metrics-server/metrics-server:v0.6.1
...
Simply save your changes and let the deployment roll out the updated pods. You can use watch -n1 kubectl get deployment/metrics-server -n kube-system and wait for the UP-TO-DATE column to show 1.
Like this:
NAME READY UP-TO-DATE AVAILABLE AGE
metrics-server 1/1 1 1 16m
Verify with kubectl top nodes. It will show something like
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
docker-desktop 222m 5% 1600Mi 41%
I've just verified this to work on a local setup. Let me know if this helps :)
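If you prefer a one-liner over kubectl edit, a JSON patch can append the flag as well (a sketch; double-check the container index in your deployment first):
kubectl -n kube-system patch deployment metrics-server --type=json \
  -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'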
Please configure the aggregation layer correctly and carefully; you can use this link for help: https://kubernetes.io/docs/tasks/extend-kubernetes/configure-aggregation-layer/. The APIService registration object looks like this:
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
name: <name of the registration object>
spec:
group: <API group name this extension apiserver hosts>
version: <API version this extension apiserver hosts>
groupPriorityMinimum: <priority this APIService for this group, see API documentation>
versionPriority: <prioritizes ordering of this version within a group, see API documentation>
service:
namespace: <namespace of the extension apiserver service>
name: <name of the extension apiserver service>
caBundle: <pem encoded ca cert that signs the server cert used by the webhook>
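For reference, the registration that the metrics-server manifest creates looks roughly like this (values taken from a typical components.yaml; your release may differ):
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  version: v1beta1
  groupPriorityMinimum: 100
  versionPriority: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system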
It would also be helpful to provide the output of kubectl version.
For me, on EKS with helmfile, I had to set the following in values.yaml for the metrics-server chart:
containerPort: 10250
The value was forced to 4443 by default for an unknown reason when I first deployed the chart.
See the docs:
https://github.com/kubernetes-sigs/metrics-server/blob/master/charts/metrics-server/values.yaml#L62
https://aws.amazon.com/premiumsupport/knowledge-center/eks-metrics-server/#:~:text=confirm%20that%20your%20security%20groups
Then kubectl top nodes and kubectl describe apiservice v1beta1.metrics.k8s.io were working.
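A sketch of setting the same value from the command line instead of values.yaml (assuming the chart exposes containerPort as a top-level value, as in the linked values.yaml):
helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server/
helm upgrade --install metrics-server metrics-server/metrics-server -n kube-system --set containerPort=10250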
First of all, execute the following command:
kubectl get apiservices
And check the availability (status) of the kube-system/metrics-server service.
In case the availability is True:
Add hostNetwork: true to the spec of your metrics-server deployment by executing the following command:
kubectl edit deployment -n kube-system metrics-server
It should look like the following:
...
spec:
hostNetwork: true
...
Setting hostNetwork to true means that the Pod will use the network namespace of the host where it's running.
In case the availability is False (MissingEndpoints):
Download metrics-server:
wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.5.0/components.yaml
Remove (legacy) metrics server:
kubectl delete -f components.yaml
Edit the downloaded file and add - --kubelet-insecure-tls to the args list:
...
labels:
k8s-app: metrics-server
spec:
containers:
- args:
- --cert-dir=/tmp
- --secure-port=443
- --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
- --kubelet-use-node-status-port
- --metric-resolution=15s
- --kubelet-insecure-tls # add this line
...
Create the resources once again:
kubectl apply -f components.yaml
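Afterwards, verify that the metrics API becomes available:
kubectl get apiservices v1beta1.metrics.k8s.io
kubectl top nodes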
My configuration is 3 master nodes:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
srv-dev-k8s-master-01 Ready control-plane,master 101d v1.22.2
srv-dev-k8s-master-02 Ready control-plane,master 101d v1.22.2
srv-dev-k8s-master-03 Ready control-plane,master 101d v1.22.2
srv-dev-k8s-worker-01 Ready <none> 101d v1.22.2
srv-dev-k8s-worker-02 Ready <none> 101d v1.22.2
srv-dev-k8s-worker-03 Ready <none> 101d v1.22.2
srv-dev-k8s-worker-04 Ready <none> 101d v1.22.2
srv-dev-k8s-worker-05 Ready <none> 101d v1.22.2
srv-dev-k8s-worker-06 Ready <none> 101d v1.22.2
srv-dev-k8s-worker-07 Ready <none> 101d v1.22.2
srv-dev-k8s-worker-08 Ready <none> 101d v1.22.2
srv-dev-k8s-worker-09 Ready <none> 75d v1.22.2
When I try to check the logs I get an error:
$ kubectl logs kube-apiserver-srv-dev-k8s-master-01
error: You must be logged in to the server (the server has asked for the client to provide credentials ( pods/log kube-apiserver-srv-dev-k8s-master-01))
But I can access the logs on the other master nodes:
$ kubectl logs kube-apiserver-srv-dev-k8s-master-02
I1121 19:51:52.741596 1 server.go:553] external host was not specified, using 172.24.27.14
I1121 19:51:52.749073 1 server.go:161] Version: v1.22.2
I1121 19:51:53.179532 1 plugins.go:158] Loaded 13 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultTolerationSeconds,PodTolerationRestriction,DefaultStorageClass,StorageObjectInUseProtection,RuntimeClass,DefaultIngressClass,MutatingAdmissionWebhook.
I1121 19:51:53.179548 1 plugins.go:161] Loaded 12 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,PodSecurity,Priority,PodTolerationRestriction,PersistentVolumeClaimResize,RuntimeClass,CertificateApproval,CertificateSigning,CertificateSubjectRestriction,ValidatingAdmissionWebhook,ResourceQuota.
I1121 19:51:53.180881 1 shared_informer.go:240] Waiting for caches to sync for node_authorizer
I1121 19:51:53.185693 1 plugins.go:158] Loaded 13 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultTolerationSeconds,PodTolerationRestriction,DefaultStorageClass,StorageObjectInUseProtection,RuntimeClass,DefaultIngressClass,MutatingAdmissionWebhook.
I1121 19:51:53.185807 1 plugins.go:161] Loaded 12 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,PodSecurity,Priority,PodTolerationRestriction,PersistentVolumeClaimResize,RuntimeClass,CertificateApproval,CertificateSigning,CertificateSubjectRestriction,ValidatingAdmissionWebhook,ResourceQuota.
...
Please tell me what the cause could be and how to fix this error. Thank you.
Additional information:
I compared /etc/kubernetes/kubelet.conf configurations on all master nodes and this is what I saw:
Master1:
$ sudo cat /etc/kubernetes/kubelet.conf
apiVersion: v1
clusters:
- cluster:
certificate-authority-data:
server: https://172.24.16.6:6443
name: kubernetes
contexts:
- context:
cluster: kubernetes
user: system:node:srv-dev-k8s-master-01
name: system:node:srv-dev-k8s-master-01@kubernetes
current-context: system:node:srv-dev-k8s-master-01@kubernetes
kind: Config
preferences: {}
users:
- name: system:node:srv-dev-k8s-master-01
user:
client-certificate: /var/lib/kubelet/pki/kubelet-client-current.pem
client-key: /var/lib/kubelet/pki/kubelet-client-current.pem
Master2 and Master3:
$ sudo cat /etc/kubernetes/kubelet.conf
apiVersion: v1
clusters:
- cluster:
certificate-authority-data:
server: https://172.24.16.6:6443
name: default-cluster
contexts:
- context:
cluster: default-cluster
namespace: default
user: default-auth
name: default-context
current-context: default-context
kind: Config
preferences: {}
users:
- name: default-auth
user:
client-certificate: /var/lib/kubelet/pki/kubelet-client-current.pem
client-key: /var/lib/kubelet/pki/kubelet-client-current.pem
kubectl config current-context on Master1, Master2, Master3:
$ kubectl config current-context
kubernetes-admin@kubernetes
Based on this information, I don't understand what to do next.
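(One further check that may help when comparing the masters: inspect the subject and expiry of the client certificate each kubelet presents, using the path referenced in the kubelet.conf files above.)
$ sudo openssl x509 -noout -subject -enddate -in /var/lib/kubelet/pki/kubelet-client-current.pem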
I'm new to the Kubernetes world, so forgive me if I make mistakes. I'm trying to deploy the Kubernetes dashboard.
My cluster contains three masters and three workers; the workers are drained and not schedulable, in order to install the dashboard on the master nodes:
[root@pp-tmp-test20 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
pp-tmp-test20 Ready master 2d2h v1.15.2
pp-tmp-test21 Ready master 37h v1.15.2
pp-tmp-test22 Ready master 37h v1.15.2
pp-tmp-test23 Ready,SchedulingDisabled worker 36h v1.15.2
pp-tmp-test24 Ready,SchedulingDisabled worker 36h v1.15.2
pp-tmp-test25 Ready,SchedulingDisabled worker 36h v1.15.2
I'm deploying the Kubernetes dashboard via this URL:
[root@pp-tmp-test20 ~]# kubectl create -f https://raw.githubusercontent.com/kubernetes/dashboard/v1.10.1/src/deploy/recommended/kubernetes-dashboard.yaml
After this, a pod kubernetes-dashboard-5698d5bc9-ql6q8 is scheduled on my master node pp-tmp-test20/172.31.68.220
The pod:
kube-system kubernetes-dashboard-5698d5bc9-ql6q8 1/1 Running 1 7m11s 10.244.0.7 pp-tmp-test20 <none> <none>
The pod's logs:
[root@pp-tmp-test20 ~]# kubectl logs kubernetes-dashboard-5698d5bc9-ql6q8 -n kube-system
2019/08/14 10:14:57 Starting overwatch
2019/08/14 10:14:57 Using in-cluster config to connect to apiserver
2019/08/14 10:14:57 Using service account token for csrf signing
2019/08/14 10:14:58 Successful initial request to the apiserver, version: v1.15.2
2019/08/14 10:14:58 Generating JWE encryption key
2019/08/14 10:14:58 New synchronizer has been registered: kubernetes-dashboard-key-holder-kube-system. Starting
2019/08/14 10:14:58 Starting secret synchronizer for kubernetes-dashboard-key-holder in namespace kube-system
2019/08/14 10:14:59 Initializing JWE encryption key from synchronized object
2019/08/14 10:14:59 Creating in-cluster Heapster client
2019/08/14 10:14:59 Metric client health check failed: the server could not find the requested resource (get services heapster). Retrying in 30 seconds.
2019/08/14 10:14:59 Auto-generating certificates
2019/08/14 10:14:59 Successfully created certificates
2019/08/14 10:14:59 Serving securely on HTTPS port: 8443
2019/08/14 10:15:29 Metric client health check failed: the server could not find the requested resource (get services heapster). Retrying in 30 seconds.
2019/08/14 10:15:59 Metric client health check failed: the server could not find the requested resource (get services heapster). Retrying in 30 seconds.
The describe output of the pod:
[root@pp-tmp-test20 ~]# kubectl describe pod kubernetes-dashboard-5698d5bc9-ql6q8 -n kube-system
Name: kubernetes-dashboard-5698d5bc9-ql6q8
Namespace: kube-system
Priority: 0
Node: pp-tmp-test20/172.31.68.220
Start Time: Wed, 14 Aug 2019 16:58:39 +0200
Labels: k8s-app=kubernetes-dashboard
pod-template-hash=5698d5bc9
Annotations: <none>
Status: Running
IP: 10.244.0.7
Controlled By: ReplicaSet/kubernetes-dashboard-5698d5bc9
Containers:
kubernetes-dashboard:
Container ID: docker://40edddf7a9102d15e3b22f4bc6f08b3a07a19e4841f09360daefbce0486baf0e
Image: k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1
Image ID: docker-pullable://k8s.gcr.io/kubernetes-dashboard-amd64@sha256:0ae6b69432e78069c5ce2bcde0fe409c5c4d6f0f4d9cd50a17974fea38898747
Port: 8443/TCP
Host Port: 0/TCP
Args:
--auto-generate-certificates
State: Running
Started: Wed, 14 Aug 2019 16:58:43 +0200
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Wed, 14 Aug 2019 16:58:41 +0200
Finished: Wed, 14 Aug 2019 16:58:42 +0200
Ready: True
Restart Count: 1
Liveness: http-get https://:8443/ delay=30s timeout=30s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/certs from kubernetes-dashboard-certs (rw)
/tmp from tmp-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kubernetes-dashboard-token-ptw78 (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
kubernetes-dashboard-certs:
Type: Secret (a volume populated by a Secret)
SecretName: kubernetes-dashboard-certs
Optional: false
tmp-volume:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
kubernetes-dashboard-token-ptw78:
Type: Secret (a volume populated by a Secret)
SecretName: kubernetes-dashboard-token-ptw78
Optional: false
QoS Class: BestEffort
Node-Selectors: dashboard=true
Tolerations: node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m41s default-scheduler Successfully assigned kube-system/kubernetes-dashboard-5698d5bc9-ql6q8 to pp-tmp-test20.tec.prj.in.phm.education.gouv.fr
Normal Pulled 2m38s (x2 over 2m40s) kubelet, pp-tmp-test20 Container image "k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.1" already present on machine
Normal Created 2m37s (x2 over 2m39s) kubelet, pp-tmp-test20 Created container kubernetes-dashboard
Normal Started 2m37s (x2 over 2m39s) kubelet, pp-tmp-test20 Started container kubernetes-dashboard
The describe output of the dashboard service:
[root@pp-tmp-test20 ~]# kubectl describe svc/kubernetes-dashboard -n kube-system
Name: kubernetes-dashboard
Namespace: kube-system
Labels: k8s-app=kubernetes-dashboard
Annotations: <none>
Selector: k8s-app=kubernetes-dashboard
Type: ClusterIP
IP: 10.110.236.88
Port: <unset> 443/TCP
TargetPort: 8443/TCP
Endpoints: 10.244.0.7:8443
Session Affinity: None
Events: <none>
The docker ps output on the master running the pod:
[root@pp-tmp-test20 ~]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
40edddf7a910 f9aed6605b81 "/dashboard --inse..." 7 minutes ago Up 7 minutes k8s_kubernetes-dashboard_kubernetes-dashboard-5698d5bc9-ql6q8_kube-system_f785d4bd-2e67-4daa-9f6c-19f98582fccb_1
e7f3820f1cf2 k8s.gcr.io/pause:3.1 "/pause" 7 minutes ago Up 7 minutes k8s_POD_kubernetes-dashboard-5698d5bc9-ql6q8_kube-system_f785d4bd-2e67-4daa-9f6c-19f98582fccb_0
[root@pp-tmp-test20 ~]# docker logs 40edddf7a910
2019/08/14 14:58:43 Starting overwatch
2019/08/14 14:58:43 Using in-cluster config to connect to apiserver
2019/08/14 14:58:43 Using service account token for csrf signing
2019/08/14 14:58:44 Successful initial request to the apiserver, version: v1.15.2
2019/08/14 14:58:44 Generating JWE encryption key
2019/08/14 14:58:44 New synchronizer has been registered: kubernetes-dashboard-key-holder-kube-system. Starting
2019/08/14 14:58:44 Starting secret synchronizer for kubernetes-dashboard-key-holder in namespace kube-system
2019/08/14 14:58:44 Initializing JWE encryption key from synchronized object
2019/08/14 14:58:44 Creating in-cluster Heapster client
2019/08/14 14:58:44 Metric client health check failed: the server could not find the requested resource (get services heapster). Retrying in 30 seconds.
2019/08/14 14:58:44 Auto-generating certificates
2019/08/14 14:58:44 Successfully created certificates
2019/08/14 14:58:44 Serving securely on HTTPS port: 8443
2019/08/14 14:59:14 Metric client health check failed: the server could not find the requested resource (get services heapster). Retrying in 30 seconds.
2019/08/14 14:59:44 Metric client health check failed: the server could not find the requested resource (get services heapster). Retrying in 30 seconds.
2019/08/14 15:00:14 Metric client health check failed: the server could not find the requested resource (get services heapster). Retrying in 30 seconds.
1/ On my master I start the proxy:
[root@pp-tmp-test20 ~]# kubectl proxy
Starting to serve on 127.0.0.1:8001
2/ I launch Firefox with X11 redirection from my master and hit this URL:
http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/#!/login
This is the error message I get in the browser:
Error: 'dial tcp 10.244.0.7:8443: connect: no route to host'
Trying to reach: 'https://10.244.0.7:8443/'
At the same time I got these errors in the console where I launched the proxy:
I0814 16:10:05.836114 20240 log.go:172] http: proxy error: context canceled
I0814 16:10:06.198701 20240 log.go:172] http: proxy error: context canceled
I0814 16:13:21.708190 20240 log.go:172] http: proxy error: unexpected EOF
I0814 16:13:21.708229 20240 log.go:172] http: proxy error: unexpected EOF
I0814 16:13:21.708270 20240 log.go:172] http: proxy error: unexpected EOF
I0814 16:13:39.335483 20240 log.go:172] http: proxy error: context canceled
I0814 16:13:39.716360 20240 log.go:172] http: proxy error: context canceled
But after refreshing the browser n times (randomly), I'm able to reach the login interface and enter the token (created beforehand).
(screenshot: Dashboard_login)
But... the same error occurs again.
(screenshot: Dashboard_login_error)
After hitting the 'Sign in' button n times, I'm able to get the dashboard... for a few seconds.
(screenshot: dashboard_interface_1)
(screenshot: dashboard_interface_2)
After that, the dashboard starts to produce the same errors while I'm exploring the interface:
(screenshot: dashboard_interface_error_1)
(screenshot: dashboard_interface_error_2)
I looked at the pod logs; we can see some traffic:
[root@pp-tmp-test20 ~]# kubectl logs kubernetes-dashboard-5698d5bc9-ql6q8 -n kube-system
2019/08/14 14:16:56 Getting list of all services in the cluster
2019/08/14 14:16:56 [2019-08-14T14:16:56Z] Outcoming response to 10.244.0.1:56140 with 200 status code
2019/08/14 14:17:01 Metric client health check failed: the server could not find the requested resource (get services heapster). Retrying in 30 seconds.
2019/08/14 14:17:22 [2019-08-14T14:17:22Z] Incoming HTTP/2.0 GET /api/v1/login/status request from 10.244.0.1:56140: {}
2019/08/14 14:17:22 [2019-08-14T14:17:22Z] Outcoming response to 10.244.0.1:56140 with 200 status code
2019/08/14 14:17:22 [2019-08-14T14:17:22Z] Incoming HTTP/2.0 GET /api/v1/csrftoken/token request from 10.244.0.1:56140: {}
2019/08/14 14:17:22 [2019-08-14T14:17:22Z] Outcoming response to 10.244.0.1:56140 with 200 status code
2019/08/14 14:17:22 [2019-08-14T14:17:22Z] Incoming HTTP/2.0 POST /api/v1/token/refresh request from 10.244.0.1:56140: { contents hidden }
2019/08/14 14:17:22 [2019-08-14T14:17:22Z] Outcoming response to 10.244.0.1:56140 with 200 status code
2019/08/14 14:17:22 [2019-08-14T14:17:22Z] Incoming HTTP/2.0 GET /api/v1/settings/global/cani request from 10.244.0.1:56140: {}
2019/08/14 14:17:22 [2019-08-14T14:17:22Z] Outcoming response to 10.244.0.1:56140 with 200 status code
2019/08/14 14:17:22 [2019-08-14T14:17:22Z] Incoming HTTP/2.0 GET /api/v1/settings/global request from 10.244.0.1:56140: {}
2019/08/14 14:17:22 Cannot find settings config map: configmaps "kubernetes-dashboard-settings" not found
And again the pod logs:
[root@pp-tmp-test20 ~]# kubectl logs kubernetes-dashboard-5698d5bc9-ql6q8 -n kube-system
Error from server: Get https://172.31.68.220:10250/containerLogs/kube-system/kubernetes-dashboard-5698d5bc9-ql6q8/kubernetes-dashboard: Forbidden
What am I doing wrong? Could you please suggest some ways to investigate?
EDIT:
My service account that I used:
# cat dashboard-adminuser.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: admin-user
namespace: kube-system
# cat dashboard-adminuser-ClusterRoleBinding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: admin-user
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: cluster-admin
subjects:
- kind: ServiceAccount
name: admin-user
namespace: kube-system
It seems Heapster is deprecated in Kubernetes in favor of metrics-server: Support metrics API #2986 & Heapster Deprecation Timeline.
You have deployed a dashboard that uses Heapster, and that dashboard version is not compatible with your Kubernetes version (1.15). So a possible way to resolve the issue is to install dashboard v2.0.0-beta3:
# kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-beta3/aio/deploy/recommended.yaml
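Note that the v2 manifests create their own namespace, so check the pods there afterwards:
# kubectl get pods -n kubernetes-dashboard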
It seems that the serviceaccount kubernetes-dashboard doesn't have access to all Kubernetes resources, because it is bound to the kubernetes-dashboard-minimal role. If you bind the service account to the cluster-admin role, you won't get such issues. The YAML below can be used to achieve this.
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
name: kubernetes-dashboard
labels:
k8s-app: kubernetes-dashboard
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: cluster-admin
subjects:
- kind: ServiceAccount
name: kubernetes-dashboard
namespace: kube-system
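Save it to a file (the name is arbitrary, e.g. dashboard-admin.yaml) and apply it:
kubectl apply -f dashboard-admin.yaml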
kube-apiserver.service is running with --authorization-mode=Node,RBAC
$ kubectl api-versions | grep rbac
rbac.authorization.k8s.io/v1
rbac.authorization.k8s.io/v1beta1
I believe this is enough to enable RBAC.
However, any new user I create can view all resources without any role bindings.
Steps to create new user:
$ cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes nonadmin-csr.json | cfssljson -bare nonadmin
$ kubectl config set-cluster nonadmin --certificate-authority ca.pem --server https://127.0.0.1:6443
$ kubectl config set-credentials nonadmin --client-certificate nonadmin.pem --client-key nonadmin-key.pem
$ kubectl config set-context nonadmin --cluster nonadmin --user nonadmin
$ kubectl config use-context nonadmin
User nonadmin can view pods and services without any role bindings:
$ kubectl get svc --all-namespaces
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes ClusterIP 10.32.0.1 <none> 443/TCP 5d4h
ingress-nginx ingress-nginx NodePort 10.32.0.129 <none> 80:30989/TCP,443:30686/TCP 5d3h
kube-system calico-typha ClusterIP 10.32.0.225 <none> 5473/TCP 5d3h
kube-system kube-dns ClusterIP 10.32.0.10 <none> 53/UDP,53/TCP 5d3h
rook-ceph rook-ceph-mgr ClusterIP 10.32.0.2 <none> 9283/TCP 4d22h
rook-ceph rook-ceph-mgr-dashboard ClusterIP 10.32.0.156 <none> 8443/TCP 4d22h
rook-ceph rook-ceph-mon-a ClusterIP 10.32.0.55 <none> 6790/TCP 4d22h
rook-ceph rook-ceph-mon-b ClusterIP 10.32.0.187 <none> 6790/TCP 4d22h
rook-ceph rook-ceph-mon-c ClusterIP 10.32.0.128 <none> 6790/TCP 4d22h
Version:
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.2", GitCommit:"cff46ab41ff0bb44d8584413b598ad8360ec1def", GitTreeState:"clean", BuildDate:"2019-01-10T23:35:51Z", GoVersion:"go1.11.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.2", GitCommit:"cff46ab41ff0bb44d8584413b598ad8360ec1def", GitTreeState:"clean", BuildDate:"2019-01-10T23:28:14Z", GoVersion:"go1.11.4", Compiler:"gc", Platform:"linux/amd64"}
This is an unmanaged Kubernetes setup on Ubuntu 18 VMs.
Where am I going wrong?
Edit 1: Adding kubectl config view output:
$ kubectl config view
apiVersion: v1
clusters:
- cluster:
certificate-authority: /home/dadmin/ca.pem
server: https://192.168.1.111:6443
name: gabbar
- cluster:
certificate-authority: /home/dadmin/ca.pem
server: https://127.0.0.1:6443
name: nonadmin
- cluster:
certificate-authority: /home/dadmin/ca.pem
server: https://192.168.1.111:6443
name: kubernetes
contexts:
- context:
cluster: gabbar
namespace: testing
user: gabbar
name: gabbar
- context:
cluster: nonadmin
user: nonadmin
name: nonadmin
- context:
cluster: kubernetes
user: admin
name: kubernetes
current-context: nonadmin
kind: Config
preferences: {}
users:
- name: admin
user:
client-certificate: /home/dadmin/admin.pem
client-key: /home/dadmin/admin-key.pem
- name: gabbar
user:
client-certificate: /home/dadmin/gabbar.pem
client-key: /home/dadmin/gabbar-key.pem
- name: nonadmin
user:
client-certificate: /home/dadmin/nonadmin.pem
client-key: /home/dadmin/nonadmin-key.pem
Edit 2:
Solution as suggested by @VKR:
cat > operator-csr.json <<EOF
{
"CN": "operator",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "IN",
"L": "BGLR",
"O": "system:view", <==== HERE
"OU": "CKA"
}
]
}
EOF
cfssl gencert \
-ca=ca.pem \
-ca-key=ca-key.pem \
-config=ca-config.json \
-profile=kubernetes \
operator-csr.json | cfssljson -bare operator
MasterNode~$ kubectl config set-cluster operator --certificate-authority ca.pem --server $SERVER
Cluster "operator" set.
MasterNode~$ kubectl config set-credentials operator --client-certificate operator.pem --client-key operator-key.pem
User "operator" set.
MasterNode~$ kubectl config set-context operator --cluster operator --user operator
Context "operator" created.
MasterNode~$ kubectl auth can-i get pods --as operator
no
MasterNode~$ kubectl create rolebinding operator --clusterrole view --user operator -n default --save-config
rolebinding.rbac.authorization.k8s.io/operator created
MasterNode~$ cat crb-view.yml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: view
subjects:
- kind: User
name: operator
apiGroup: rbac.authorization.k8s.io
roleRef:
kind: ClusterRole
name: view
apiGroup: rbac.authorization.k8s.io
MasterNode~$ kubectl create -f crb-view.yml --record --save-config
clusterrolebinding.rbac.authorization.k8s.io/view created
MasterNode~$ kubectl auth can-i get pods --as operator --all-namespaces
yes
MasterNode~$ kubectl auth can-i create pods --as operator --all-namespaces
no
MasterNode~$ kubectl config use-context operator
Switched to context "operator".
MasterNode~$ kubectl auth can-i "*" "*"
no
MasterNode~$ kubectl run db --image mongo
kubectl run --generator=deployment/apps.v1 is DEPRECATED and will be removed in a future version. Use kubectl run --generator=run-pod/v1 or kubectl create instead.
Error from server (Forbidden): deployments.apps is forbidden: User "operator" cannot create resource "deployments" in API group "apps" in the namespace "default"
Most probably the root cause of this behavior is that you set the "O": "system:masters" group while generating nonadmin-csr.json.
The system:masters group is bound to the cluster-admin super-user role by default, and as a result every newly created user in that group has full access.
Here is a good article that provides step-by-step instructions on how to create users with limited namespace access.
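You can see this mapping in the default binding, whose only subject is the system:masters group:
$ kubectl describe clusterrolebinding cluster-admin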
A quick test shows that similar users, but with different groups, have huge differences in access:
-subj "/CN=employee/O=testgroup" :
kubectl --context=employee-context get pods --all-namespaces
Error from server (Forbidden): pods is forbidden: User "employee" cannot list resource "pods" in API group "" at the cluster scope
-subj "/CN=newemployee/O=system:masters" :
kubectl --context=newemployee-context get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
ingress-nginx nginx-ingress-controller-797b884cbc-pckj6 1/1 Running 0 85d
ingress-nginx prometheus-server-8658d8cdbb-92629 1/1 Running 0 36d
kube-system coredns-86c58d9df4-gwk28 1/1 Running 0 92d
kube-system coredns-86c58d9df4-jxl84 1/1 Running 0 92d
kube-system etcd-kube-master-1 1/1 Running 0 92d
kube-system kube-apiserver-kube-master-1 1/1 Running 0 92d
kube-system kube-controller-manager-kube-master-1 1/1 Running 4 92d
kube-system kube-flannel-ds-amd64-k6sgd 1/1 Running 0 92d
kube-system kube-flannel-ds-amd64-mtrnc 1/1 Running 0 92d
kube-system kube-flannel-ds-amd64-zdzjl 1/1 Running 1 92d
kube-system kube-proxy-4pm27 1/1 Running 1 92d
kube-system kube-proxy-ghc7w 1/1 Running 0 92d
kube-system kube-proxy-wsq4h 1/1 Running 0 92d
kube-system kube-scheduler-kube-master-1 1/1 Running 4 92d
kube-system tiller-deploy-5b7c66d59c-6wx89 1/1 Running 0 36d