MountVolume.SetUp failed for volume "<volume-name>-token-m4rtn" : failed to sync secret cache: timed out waiting for the condition - kubernetes

I am having an issue on GKE where this error is being spewed from all namespaces. I am not sure what the cause might be or how to troubleshoot it.
message: "MountVolume.SetUp failed for volume "volume-name-token-m4rtn" : failed to sync secret cache: timed out waiting for the condition"
It occurs for almost all pods in all namespaces. Has anyone come across this or have any ideas on how I can troubleshoot?

The error you are receiving points to a problem with RBAC (role-based access control) permissions: it looks like the service account used by the pod does not have enough permissions.
Hence, the default service account within the namespace you are deploying to is not authorized to mount the secret that you are trying to mount into your Pod.
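As a first check, you can ask the API server whether that service account is allowed to read secrets in the namespace. A minimal sketch, assuming the pod runs as the default service account in a hypothetical namespace my-namespace:
# Can the pod's service account read secrets in its own namespace?
kubectl auth can-i get secrets \
  --as=system:serviceaccount:my-namespace:default \
  -n my-namespace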
You can get further information in the following link: Using RBAC Authorization.
You can also take a look at Google's documentation.
For example, the following Role grants read access (get, watch, and list) to all pods in the accounting Namespace:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: accounting
  name: pod-reader
rules:
- apiGroups: [""] # "" indicates the core API group
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
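Note that a Role grants nothing on its own; it must be referenced by a RoleBinding. A minimal sketch of such a binding, assuming you want to grant the role to the namespace's default service account (the binding name and subject are hypothetical):
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-reader-binding   # hypothetical name
  namespace: accounting
subjects:
- kind: ServiceAccount
  name: default              # use the account your pod actually runs as
  namespace: accounting
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io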
You can also take a look at the following similar cases: Case in Reddit, StackOverflow case.

Related

Forbidden after enabling Google Cloud Groups RBAC in GKE

We are enabling Google Cloud Groups RBAC in our existing GKE clusters.
For that, we first created all the groups in Workspace, and also the required "gke-security-groups@ourdomain.com" according to the documentation.
Those groups are created in Workspace with an integration with Active Directory for Single Sign-On.
All groups are members of "gke-security-groups@ourdomain" as stated by the documentation, and all groups can view members.
The cluster was updated to enable the flag for Google Cloud Groups RBAC, and we specified the value to be "gke-security-groups@ourdomain.com".
We then added one of the groups (let's call it group_a@ourdomain.com) to IAM and assigned it a custom role which only gives access to:
"container.apiServices.get",
"container.apiServices.list",
"container.clusters.getCredentials",
"container.clusters.get",
"container.clusters.list",
This is just the minimum needed for the user to be able to log into the Kubernetes cluster, so that Kubernetes RBAC can then be applied from there.
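A minimal sketch of creating such a custom role with gcloud, assuming a hypothetical project my-project and role ID gkeLogin:
# Create a custom IAM role with only the login permissions listed above
gcloud iam roles create gkeLogin --project=my-project \
  --title="GKE login only" \
  --permissions=container.apiServices.get,container.apiServices.list,container.clusters.get,container.clusters.getCredentials,container.clusters.list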
In Kubernetes, we applied a Role, which allows listing pods in a specific namespace, and a RoleBinding that specifies the group we just added to IAM.
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: test-role
  namespace: custom-namespace
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: test-rolebinding
  namespace: custom-namespace
roleRef:
  kind: Role
  name: test-role
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: Group
  name: group_a@ourdomain.com
Everything looks good up to this point. But when trying to list the pods of this namespace with a user that belongs to the group "group_a@ourdomain.com", we get:
Error from server (Forbidden): pods is forbidden: User
"my-user@ourdomain.com" cannot list resource "pods" in API group ""
in the namespace "custom-namespace": requires one of ["container.pods.list"]
permission(s).
Of course, if I give container.pods.list to the role assigned to group_a@ourdomain, I can list pods, but it opens access to all namespaces, as this permission in GCloud is global.
What am I missing here?
Not sure if this is relevant, but our organisation in GCloud is called, for example, "my-company.io", while the groups for SSO are named "...@groups.my-company.io", and the gke-security-groups group was also created with the "groups.my-company.io" domain.
Also, if instead of a Group in the RoleBinding, I specify the user directly, it works.
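One way to narrow down issues like this is impersonation: ask the API server the same question as the user alone and as the user plus the group. A sketch using the hypothetical values from above:
# Does the binding work when the group is supplied explicitly?
kubectl auth can-i list pods -n custom-namespace \
  --as=my-user@ourdomain.com \
  --as-group=group_a@ourdomain.com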
It turned out to be an issue with case-sensitive strings, and nothing related to the actual rules defined in the RBAC resources, which were working as expected.
The group names were created in Azure AD in camel case. These group names were then shown in Google Workspace all in lowercase.
Example in Azure AD:
thisIsOneGroup@groups.mycompany.com
Example configured in the RBACs as shown in Google Workspace:
thisisonegroup@groups.mycompany.com
We copied the all-lowercase names from the Google Workspace UI and put them in the bindings, and that caused the issue. Kubernetes in GKE is case sensitive, so it didn't match the name configured in the binding against the email configured in Google Workspace.
After changing the RBAC bindings to have the same format, everything worked as expected.
It looks like you are trying to grant access to Deployments in the extensions and apps API groups. That requires specifying the extensions and apps API groups in your Role rules:
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - '*'
- apiGroups:
  - extensions
  - apps
  resources:
  - deployments
  - replicasets
  verbs:
  - '*'
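After applying the updated Role, you can check what the user can effectively do via impersonation, for example (user and namespace values are the hypothetical ones from this question):
kubectl auth can-i --list --as=my-user@ourdomain.com -n custom-namespace
kubectl auth can-i create deployments --as=my-user@ourdomain.com -n custom-namespace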
I can also recommend recreating the Role and RoleBindings. You can use the following thread as a reference as well: RBAC issue : Error from server (Forbidden).
Edited 01/26/22:
Can you please confirm that you provided the credentials or configuration file (manifest, YAML)? As you may know, this information is provided by Kubernetes and the default service account. You can verify it by running:
$ kubectl auth can-i get pods
Note that the account type you need to use here is a service account. To create a new service account with a wider set of permissions, the following is a YAML example:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: pod-read-role
rules:
- apiGroups: [""] # "" indicates the core API group
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: pod-read-sa
  namespace: default
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: pod-read-rolebinding
  namespace: default
subjects:
- kind: ServiceAccount
  name: pod-read-sa
  namespace: default   # subjects of kind ServiceAccount require a namespace
roleRef:
  kind: Role
  name: pod-read-role
  apiGroup: rbac.authorization.k8s.io   # roleRef must use the RBAC API group
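To use the new account, reference it from the pod spec. A minimal sketch (the pod name and image are hypothetical):
apiVersion: v1
kind: Pod
metadata:
  name: pod-read-demo   # hypothetical pod name
  namespace: default
spec:
  serviceAccountName: pod-read-sa   # run the pod as the account bound above
  containers:
  - name: main
    image: nginx   # hypothetical image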
Please use the following thread as a reference.

Serviceaccount is unable to create persistentvolumeclaim(PVC)

We're testing Shiny-proxy Kubernetes containers, where each application spins up its own container; this works fine up to this point. We have made some changes to create a PVC/PV to persist user-specific data for each container, and noticed that the service account is unable to create the PVC even though I have the following roles configured for the account. In general, are there any other steps for making sure the SA is able to access/create a PVC?
The PV/PVC are accessible when testing from a normal container, but seem to be an issue with the serviceaccount roles/permissions that's used to create new containers.
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: sp-ns
  name: sp-sa
rules:
- apiGroups: [""]
  resources: ["pods", "pods/log", "persistentvolumeclaims"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
I have verified that the service account roles are set correctly, as the command below returns 'yes':
kubectl auth can-i create pvc --as=system:serviceaccount:sp-ns:sp-sa -n sp-ns
Error during container creation from the application:
at io.undertow.server.HttpServerExchange$1.run(HttpServerExchange.java:830)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: http://localhost:8001/api/v1/namespaces/sp-ns/pods. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. pods "sp-pod-92e1efc0-0859-4a87-8b9b-04d6adaa11f5" is forbidden: user "system:serviceaccount:sp-ns:sp-sa" is not an admin and does not have permissions to use host bind mounts for resource .
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:503)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:440)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:406)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:365)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:234)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:735)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:325)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:321)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.lambda$createNew$0(BaseOperation.java:336)
at io.fabric8.kubernetes.api.model.DoneablePod.done(DoneablePod.java:26)
at eu.openanalytics.containerproxy.backend.kubernetes.KubernetesBackend.startContainer(KubernetesBackend.java:223)
at eu.openanalytics.containerproxy.backend.AbstractContainerBackend.doStartProxy(AbstractContainerBackend.java:129)
at eu.openanalytics.containerproxy.backend.AbstractContainerBackend.startProxy(AbstractContainerBackend.java:110)
... 95 more
The container is not running as privileged. Use privileged: true in the pod spec.
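A minimal sketch of that setting in the pod spec (the pod, container, and image names are hypothetical):
apiVersion: v1
kind: Pod
metadata:
  name: sp-pod   # hypothetical
  namespace: sp-ns
spec:
  serviceAccountName: sp-sa
  containers:
  - name: app                # hypothetical
    image: my-shiny-app      # hypothetical
    securityContext:
      privileged: true       # the setting this answer suggests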
The service account doesn't have the cluster-admin role. Use the command below to grant it:
kubectl create clusterrolebinding add-on-cluster-admin --clusterrole=cluster-admin --serviceaccount=sp-ns:sp-sa

Automatically create Kubernetes resources after namespace creation

I have 2 teams:
- devs: they create a new Kubernetes namespace each time they deploy a branch/tag of their app
- ops: they manage access control to the cluster with (cluster)roles and (cluster)rolebindings
The problem is that 'devs' cannot kubectl their namespaces until 'ops' have created RBAC resources. And 'devs' cannot create RBAC resources themselves as they don't have the list of subjects to put in the rolebinding resource (sharing the list is not an option).
I have read the official documentation about admission webhooks, but as I understand it, they only act on the resource that triggered the webhook.
Is there a native and/or simple way in Kubernetes to apply resources whenever a new namespace is created?
I've come up with a solution by writing a custom controller.
With the following custom resource deployed, the controller injects the role and rolebinding in namespaces matching dev-.* and fix-.*:
kind: NamespaceResourcesInjector
apiVersion: blakelead.com/v1alpha1
metadata:
  name: nri-test
spec:
  namespaces:
  - dev-.*
  - fix-.*
  resources:
  - |
    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      name: dev-role
    rules:
    - apiGroups: [""]
      resources: ["pods", "pods/portforward", "services", "deployments", "ingresses"]
      verbs: ["list", "get"]
    - apiGroups: [""]
      resources: ["pods/portforward"]
      verbs: ["create"]
    - apiGroups: [""]
      resources: ["namespaces"]
      verbs: ["list", "get"]
  - |
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: dev-rolebinding
    subjects:
    - kind: User
      name: dev
    roleRef:
      kind: Role
      name: dev-role
      apiGroup: rbac.authorization.k8s.io
The controller is still in early stages of development but I'm using it successfully in more and more clusters.
Here it is for those interested: https://github.com/blakelead/nsinjector
Yes, there is a native way, but it is not an out-of-the-box feature.
You can do what you have described by using/creating an operator, essentially extending the Kubernetes APIs for your need.
As an operator is just an open pattern that can implement things in many ways, in the scenario you gave, one way the control flow could look is the following (a namespace sketch follows the list):
1. An operator with privileges to create RBAC resources is deployed and subscribed to changes to the Kubernetes Namespace object kind.
2. Devs create a namespace containing an agreed label.
3. The operator is notified about changes to the cluster.
4. The operator checks namespace validation (this can also be done by a separate admission webhook).
5. The operator creates RBAC resources in the newly created namespace.
6. If the RBAC resources are cluster-wide, the same operator can do the RBAC cleanup once the namespace is deleted.
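A minimal sketch of step 2, assuming the agreed label is team: dev (the label and namespace names are hypothetical):
apiVersion: v1
kind: Namespace
metadata:
  name: dev-feature-x   # hypothetical branch namespace
  labels:
    team: dev           # the agreed label the operator watches for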
It's also related to how the user is authenticated to the cluster and how they get a kubeconfig file. You can put a group in the client certificate or in the bearer token that kubectl uses from the kubeconfig. Ahead of time, you can define a ClusterRole with a ClusterRoleBinding to that group which gives its members permission to certain verbs on certain resources (for example, the ability to create a namespace).
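A minimal sketch of such a pair, assuming the client certificate carries a hypothetical group named devs:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: namespace-creator
rules:
- apiGroups: [""]
  resources: ["namespaces"]
  verbs: ["create", "get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: namespace-creator-devs
subjects:
- kind: Group
  name: devs   # hypothetical group embedded in the client certificate
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: namespace-creator
  apiGroup: rbac.authorization.k8s.io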
Additionally, you can use an admission webhook to validate whether the user is supposed to be part of that group.

I have an error while running jobs with the GitLab CI/CD Kubernetes executor within the cluster

ERROR: Job failed (system failure): pods is forbidden: User "system:serviceaccount:gitlab:gitlab-admin" cannot create resource "pods" in API group "" in the namespace "gitlab"
From the looks of it (you did not provide much information), I would say that your RBAC is incorrectly configured.
I would advise reading the following Kubernetes documentation regarding Managing Service Accounts and Configure Service Accounts for Pods.
If I'm not mistaken, this command should fix it:
kubectl create clusterrolebinding gitlab-cluster-admin --clusterrole=cluster-admin --group=system:serviceaccounts --namespace=gitlab
If not, then you will need to edit your Role and ClusterRole with something like the following:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: gitlab
  name: gitlab-admin
rules:
- apiGroups: [""] # "" indicates the core API group
  resources: ["pods"]
  verbs: ["create", "get", "watch", "list"]
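A Role only takes effect once it is bound. A minimal sketch of a matching RoleBinding for the service account from the error message (the binding name is hypothetical):
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: gitlab-admin-binding   # hypothetical name
  namespace: gitlab
subjects:
- kind: ServiceAccount
  name: gitlab-admin
  namespace: gitlab
roleRef:
  kind: Role
  name: gitlab-admin
  apiGroup: rbac.authorization.k8s.io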
This is an example and you should make changes to better suit your needs.
If you provide more details I'll try to help you further.

ManageIQ displays no info for Kubernetes container provider and cluster-reader ClusterRole is missing

I have a Kubernetes v1.9.3 (no OpenShift) cluster I'd like to manage with ManageIQ (gaprindashvili-3 running as a Docker container).
I prepared the k8s cluster to interact with ManageIQ following these instructions. Notice that I performed the steps listed in the last section only (Prepare cluster for use with ManageIQ), as the previous ones were for setting up a k8s cluster and I already had a running one.
I successfully added the k8s container provider to ManageIQ, but the dashboard reports nothing: 0 nodes, 0 pods, 0 services, etc..., while I do have nodes, services and running pods on the cluster. I looked at the content of /var/log/evm.log of ManageIQ and found this error:
[----] E, [2018-06-21T10:06:40.397410 #13333:6bc9e80] ERROR -- : [KubeException]: events is forbidden: User "system:serviceaccount:management-infra:management-admin" cannot list events at the cluster scope: clusterrole.rbac.authorization.k8s.io "cluster-reader" not found Method:[block in method_missing]
So the ClusterRole cluster-reader was not defined in the cluster. I double checked with kubectl get clusterrole cluster-reader and it confirmed that cluster-reader was missing.
As a solution, I tried to create cluster-reader manually. I could not find any reference to it in the k8s docs, while it is mentioned in the OpenShift docs. So I looked at how cluster-reader was defined in OpenShift v3.9. Its definition changes across different OpenShift versions; I picked 3.9 as it is based on k8s v1.9, which is the one I'm using. So here's what I found in the OpenShift 3.9 docs:
Name:         cluster-reader
Labels:       <none>
Annotations:  authorization.openshift.io/system-only=true
              rbac.authorization.kubernetes.io/autoupdate=true
PolicyRule:
  Resources                                    Non-Resource URLs  Resource Names  Verbs
  ---------                                    -----------------  --------------  -----
                                               [*]                []              [get]
  apiservices.apiregistration.k8s.io           []                 []              [get list watch]
  apiservices.apiregistration.k8s.io/status    []                 []              [get list watch]
  appliedclusterresourcequotas                 []                 []              [get list watch]
I wrote the following yaml definition to create an equivalent ClusterRole in my cluster:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-reader
rules:
- apiGroups: ["apiregistration"]
  resources: ["apiservices.apiregistration.k8s.io", "apiservices.apiregistration.k8s.io/status"]
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["*"]
  verbs: ["get"]
I didn't include appliedclusterresourcequotas among the monitored resources because it's my understanding that it is an OpenShift-only resource (but I may be mistaken).
I deleted the old k8s container provider on ManageIQ and created a new one after having created cluster-reader, but nothing changed, the dashboard still displays nothing (0 nodes, 0 pods, etc...). I looked at the content of /var/log/evm.log in ManageIQ and this time these errors were reported:
[----] E, [2018-06-22T11:15:39.041903 #2942:7e5e1e0] ERROR -- : MIQ(ManageIQ::Providers::Kubernetes::ContainerManager::EventCatcher::Runner#start_event_monitor) EMS [kubernetes-01] as [] Event Monitor Thread aborted because [events is forbidden: User "system:serviceaccount:management-infra:management-admin" cannot list events at the cluster scope]
[----] E, [2018-06-22T11:15:39.042455 #2942:7e5e1e0] ERROR -- : [KubeException]: events is forbidden: User "system:serviceaccount:management-infra:management-admin" cannot list events at the cluster scope Method:[block in method_missing]
So what am I doing wrong? How can I fix this problem?
If it can be of any use, here you can find the whole .yaml file I'm using to set up the k8s cluster to interact with ManageIQ (all the required namespaces, service accounts, cluster role bindings are present as well).
For the ClusterRole to take effect, it must be bound to the group management-infra or to the user management-admin.
Example of creating a group binding:
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: read-cluster-state
subjects:
- kind: Group
  name: management-infra
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: cluster-reader
  apiGroup: rbac.authorization.k8s.io
After applying this file, the changes will take place immediately. There is no need to restart the cluster.
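To verify that the account ManageIQ uses can now list events cluster-wide (the permission from the error message), you can ask via impersonation:
kubectl auth can-i list events --as=system:serviceaccount:management-infra:management-admin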
See more information here.