I set up a Ceph cluster and mounted it manually with the sudo mount -t ceph command following the official documentation, and I checked the status of my Ceph cluster: no problems there. Now I am trying to mount my CephFS on Kubernetes, but the pod is stuck in ContainerCreating after I run the kubectl create command because the mount is failing. I have looked at many related problems/solutions online, but nothing works.
As reference, I am following this guide: https://medium.com/velotio-perspectives/an-innovators-guide-to-kubernetes-storage-using-ceph-a4b919f4e469
My setup consists of 5 AWS instances, and they are as follows:
Node 1: Ceph Mon
Node 2: OSD1 + MDS
Node 3: OSD2 + K8s Master
Node 4: OSD3 + K8s Worker1
Node 5: CephFS + K8s Worker2
Is it okay to stack K8s on top of the same instances as Ceph? I am pretty sure that is allowed, but please let me know if it is not.
In the kubectl describe pod output, this is the error/warning:
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /root/userone/kubelet/pods/bbf28924-3639-11ea-879d-0a6b51accf30/volumes/kubernetes.io~cephfs/pvc-4777686c-3639-11ea-879d-0a6b51accf30 --scope -- mount -t ceph -o name=kubernetes-dynamic-user-4d05a2df-3639-11ea-b2d3-5a4147fda646,secret=AQC4whxeqQ9ZERADD2nUgxxOktLE1OIGXThBmw== 172.31.15.110:6789:/pvc-volumes/kubernetes/kubernetes-dynamic-pvc-4d05a269-3639-11ea-b2d3-5a4147fda646 /root/userone/kubelet/pods/bbf28924-3639-11ea-879d-0a6b51accf30/volumes/kubernetes.io~cephfs/pvc-4777686c-3639-11ea-879d-0a6b51accf30
Output: Running scope as unit run-2382233.scope.
couldn't finalize options: -34
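(-34 is the negative errno ERANGE, "numerical result out of range", from mount.ceph; with dynamically provisioned CephFS volumes this commonly points at the very long generated user name or option string rather than at the cluster itself. One way to take kubelet out of the picture is to retry the same mount by hand on the worker node; a sketch using the values from the log above, assuming the dynamic user exists as client.<name>:

# Fetch the dynamic user's key into a file, then mount with secretfile=
# instead of passing the secret inline.
sudo ceph auth get-key client.kubernetes-dynamic-user-4d05a2df-3639-11ea-b2d3-5a4147fda646 > /tmp/dynkey
sudo mount -t ceph 172.31.15.110:6789:/pvc-volumes/kubernetes/kubernetes-dynamic-pvc-4d05a269-3639-11ea-b2d3-5a4147fda646 /mnt \
    -o name=kubernetes-dynamic-user-4d05a2df-3639-11ea-b2d3-5a4147fda646,secretfile=/tmp/dynkey

If the manual mount reproduces the -34, the problem is on the Ceph/mount side rather than in the Kubernetes objects below.)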
These are my .yaml files:
Provisioner:
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: test-provisioner-dt
  namespace: test-dt
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update", "create"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "update", "patch"]
  - apiGroups: [""]
    resources: ["services"]
    resourceNames: ["kube-dns", "coredns"]
    verbs: ["list", "get"]
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["create", "get", "delete"]
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: test-provisioner-dt
  namespace: test-dt
subjects:
  - kind: ServiceAccount
    name: test-provisioner-dt
    namespace: test-dt
roleRef:
  kind: ClusterRole
  name: test-provisioner-dt
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: test-provisioner-dt
  namespace: test-dt
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["create", "get", "delete"]
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
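For completeness, this namespaced Role is normally bound to the provisioner's ServiceAccount with a RoleBinding; a minimal sketch, assuming the ServiceAccount is also named test-provisioner-dt (neither object is shown in the question):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: test-provisioner-dt
  namespace: test-dt
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: test-provisioner-dt
  namespace: test-dt
subjects:
  - kind: ServiceAccount
    name: test-provisioner-dt
    namespace: test-dt
roleRef:
  kind: Role
  name: test-provisioner-dt
  apiGroup: rbac.authorization.k8s.io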
---
StorageClass:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: postgres-pv
  namespace: test-dt
provisioner: ceph.com/cephfs
parameters:
  monitors: 172.31.15.110:6789
  adminId: admin
  adminSecretName: ceph-secret-admin-dt
  adminSecretNamespace: test-dt
  claimRoot: /pvc-volumes
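The StorageClass references a ceph-secret-admin-dt Secret that is not shown; a minimal sketch of what such a Secret looks like, assuming the provisioner reads the admin key from a data field named key:

apiVersion: v1
kind: Secret
metadata:
  name: ceph-secret-admin-dt
  namespace: test-dt
type: Opaque
data:
  # base64 of the output of `ceph auth get-key client.admin` (placeholder below)
  key: <base64-encoded-admin-key>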
PVC:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc
  namespace: test-dt
spec:
  storageClassName: postgres-pv
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 2Gi
The output of kubectl get pv and kubectl get pvc shows the volumes are bound and claimed, with no errors.
The provisioner pod logs all show success and no errors.
Please help!
Related
Hi, I am trying to write a simple pipeline to delete some ECR images that clutter the repo, and I want Jenkins to do it. I get this error:
An error occurred (AccessDeniedException) when calling the BatchDeleteImage operation: User: arn:aws:sts::~:assumed-role/~cluster-nodegr-NodeInstanceRole-~/i-~ is not authorized to perform: ecr:BatchDeleteImage on resource: arn:aws:ecr:~:~:repository/~ because no identity-based policy allows the ecr:BatchDeleteImage action
Jenkins is running on k8s. I used YAML similar to the following, in addition to other YAMLs, to get it up and running:
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: jenkins
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: jenkins
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["create", "delete", "get", "list", "patch", "update", "watch"]
  - apiGroups: [""]
    resources: ["pods/exec"]
    verbs: ["create", "delete", "get", "list", "patch", "update", "watch"]
  - apiGroups: [""]
    resources: ["pods/log"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["watch"]
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: jenkins
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: jenkins
subjects:
  - kind: ServiceAccount
    name: jenkins
The pipeline looks like this:
pipeline {
  agent {
    kubernetes {
      inheritFrom 'jenkins-slave'
    }
  }
  stages {
    stage('test') {
      steps {
        sh '''aws ecr batch-delete-image \
          --repository-name <repo-name> \
          --image-ids imageDigest=<img digest>
        '''
      }
    }
  }
}
I tried to add this:
- apiGroups: ["ecr"]
  resources: ["*"]
  verbs: ["batchDeleteImage"]
  resourceNames:
    - "*"
but it didn't work.
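(Note the error above is an AWS IAM denial, not a Kubernetes RBAC denial, so a Kubernetes rule with apiGroups: ["ecr"] cannot have any effect; Kubernetes RBAC only governs the Kubernetes API. What the message asks for is an identity-based policy on the node instance role, along these lines, a sketch with placeholder ARN values:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ecr:BatchDeleteImage",
      "Resource": "arn:aws:ecr:<region>:<account-id>:repository/<repo-name>"
    }
  ]
}
)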
I'm trying to deploy a Java Spring project on my local minikube using a gitlab-ci pipeline, but I keep getting
ERROR: Job failed (system failure): prepare environment: setting up credentials: secrets is forbidden: User "system:serviceaccount:maverick:default" cannot create resource "secrets" in API group "" in the namespace "maverick". Check https://docs.gitlab.com/runner/shells/index.html#shell-profile-loading for more information
I've installed gitlab-runner in the "maverick" namespace:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: gitlab-runner
  namespace: maverick
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: gitlab-runner
  namespace: maverick
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["list", "get", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["pods/exec"]
    verbs: ["create"]
  - apiGroups: [""]
    resources: ["pods/log"]
    verbs: ["get"]
  - apiGroups: [""]
    resources: ["pods/attach"]
    verbs: ["list", "get", "create", "delete", "update"]
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["list", "get", "create", "delete", "update"]
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["list", "get", "watch", "create", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: gitlab-runner
  namespace: maverick
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: maverick
subjects:
  - kind: ServiceAccount
    name: gitlab-runner
    namespace: maverick
and these Helm values:
gitlabUrl: https://gitlab.com/
runnerRegistrationToken: ".... my token .... "
runners:
  privileged: false
  tags: k8s
  serviceAccountName: gitlab-runner
My gitlab-ci.yml is like this:
docker-build-job:
  stage: docker-build
  image: $MAVEN_IMAGE
  script:
    - mvn jib:build -Djib.to.image=${CI_REGISTRY_IMAGE}:latest -Djib.to.auth.username=${CI_REGISTRY_USER} -Djib.to.auth.password=${CI_REGISTRY_PASSWORD}

deploy-job:
  image: alpine/helm:3.2.1
  stage: deploy
  tags:
    - k8s
  script:
    - helm upgrade ${APP_NAME} ./charts --install --values=./charts/values.yaml --namespace ${APP_NAME}
  rules:
    - if: $CI_COMMIT_BRANCH == 'master'
      when: always
And the chart folder has the deployment.yaml like this:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: maverick
  namespace: maverick
spec:
  replicas: 1
  selector:
    matchLabels:
      app: maverick
  template:
    metadata:
      labels:
        app: maverick
    spec:
      containers:
        - name: maverick
          image: registry.gitlab.com/gfalco77/maverick:latest
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 8001
      imagePullSecrets:
        - name: registry-credentials
---
apiVersion: v1
kind: Service
metadata:
  name: maverick
spec:
  ports:
    - name: maverick
      port: 8001
      targetPort: 8001
      protocol: TCP
  selector:
    app: maverick
There's also a registry-credentials Secret, which I created according to https://chris-vermeulen.com/using-gitlab-registry-with-kubernetes/ and installed in the maverick namespace:
apiVersion: v1
kind: Secret
metadata:
  name: registry-credentials
  namespace: maverick
type: kubernetes.io/dockerconfigjson
data:
  .dockerconfigjson: .. base64 creds ..
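(For reference, a Secret of this type can also be generated directly with kubectl; a sketch with placeholder credentials:

kubectl -n maverick create secret docker-registry registry-credentials \
    --docker-server=registry.gitlab.com \
    --docker-username=<deploy-token-user> \
    --docker-password=<deploy-token>
)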
I can see the gitlab-runner Role has create permission on secrets in apiGroup "", but it still seems it can't download the image from the registry; I am not sure what is wrong.
Thanks in advance
Problem solved by adding the following ClusterRole and ClusterRoleBindings, especially the second binding, which is for the "default" ServiceAccount.
After this the job in GitLab continues and then tries to use the user system:serviceaccount:maverick:gitlab-runner, but it fails on something else I still need to figure out.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-admin # caution: this name collides with the built-in cluster-admin ClusterRole; a distinct name is safer
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["list", "get", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["pods/exec"]
    verbs: ["create"]
  - apiGroups: [""]
    resources: ["pods/log"]
    verbs: ["get"]
  - apiGroups: [""]
    resources: ["pods/attach"]
    verbs: ["list", "get", "create", "delete", "update"]
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["list", "watch", "get", "create", "delete", "update"]
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["list", "get", "watch", "create", "delete", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-admin-role
subjects:
  - kind: ServiceAccount
    name: gitlab-runner
    namespace: maverick
roleRef: # referring to your ClusterRole
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-admin-role-default # must differ from the binding above, or the second apply just overwrites the first
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: default
    namespace: maverick
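A quick way to confirm what a binding actually grants is kubectl auth can-i, impersonating the ServiceAccount named in the error message:

kubectl auth can-i create secrets \
    --as=system:serviceaccount:maverick:default -n maverick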
I am trying to run a k8s deployment job from the GitLab Kubernetes executor.
I deployed the Kubernetes runner using Helm as follows.
My values.yaml includes the following RBAC rules:
rbac:
  create: true
  rules:
    - apiGroups: [""]
      resources: ["pods"]
      verbs: ["list", "get", "watch", "create", "delete"]
    - apiGroups: [""]
      resources: ["pods/exec"]
      verbs: ["create"]
    - apiGroups: [""]
      resources: ["pods/log"]
      verbs: ["get"]
    - apiGroups: [""]
      resources: ["pods/attach"]
      verbs: ["list", "get", "create", "delete", "update"]
    - apiGroups: [""]
      resources: ["secrets"]
      verbs: ["list", "get", "create", "delete", "update"]
    - apiGroups: [""]
      resources: ["configmaps"]
      verbs: ["list", "get", "create", "delete", "update"]
    - apiGroups: [""]
      resources: ["services"]
      verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
    - apiGroups: ["apps"]
      resources: ["deployments"]
      verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  clusterWideAccess: true
  podSecurityPolicy:
    enabled: false
    resourceNames:
      - gitlab-runner
then
helm install --namespace gitlab gitlab-runner -f values.yaml gitlab/gitlab-runner
and my .gitlab-ci.yml has the following stage:
script:
  - mkdir -p /etc/deploy
  - echo $kube_config | base64 -d > $KUBECONFIG
  - sed -i "s/IMAGE_TAG/$CI_PIPELINE_ID/g" deployment.yaml
  - cat deployment.yaml
  - kubectl apply -f deployment.yaml
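(The script assumes $KUBECONFIG is already defined elsewhere, e.g. as a pipeline variable pointing inside the directory created above; a hypothetical sketch, since the question does not show it:

variables:
  KUBECONFIG: /etc/deploy/config # placeholder value, not from the question
)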
and I got the following error in the pipeline logs:
$ kubectl apply -f deployment.yaml
Error from server (Forbidden): error when retrieving current configuration of:
Resource: "apps/v1, Resource=deployments", GroupVersionKind: "apps/v1, Kind=Deployment"
Name: "java-demo", Namespace: "gitlab"
Object: &{map["apiVersion":"apps/v1" "kind":"Deployment" "metadata":map["annotations":map["kubectl.kubernetes.io/last-applied-configuration":""] "name":"java-demo" "namespace":"gitlab"] "spec":map["replicas":'\x01' "selector":map["matchLabels":map["app":"java-demo"]] "template":map["metadata":map["labels":map["app":"java-demo"]] "spec":map["containers":[map["image":"square2019/dummy-repo:555060965" "imagePullPolicy":"Always" "name":"java-demo" "ports":[map["containerPort":'\u1f90']]]]]]]]}
from server for: "deployment.yaml": deployments.apps "java-demo" is forbidden: User "system:serviceaccount:gitlab:default" cannot get resource "deployments" in API group "apps" in the namespace "gitlab"
Error from server (Forbidden): error when retrieving current configuration of:
Resource: "/v1, Resource=services", GroupVersionKind: "/v1, Kind=Service"
Name: "java-demo", Namespace: "gitlab"
Object: &{map["apiVersion":"v1" "kind":"Service" "metadata":map["annotations":map["kubectl.kubernetes.io/last-applied-configuration":""] "name":"java-demo" "namespace":"gitlab"] "spec":map["ports":[map["name":"java-demo" "port":'P' "targetPort":'\u1f90']] "selector":map["app":"java-demo"] "type":"ClusterIP"]]}
from server for: "deployment.yaml": services "java-demo" is forbidden: User "system:serviceaccount:gitlab:default" cannot get resource "services" in API group "" in the namespace "gitlab"
Cleaning up project directory and file based variables
00:00
ERROR: Job failed: command terminated with exit code 1
Am I missing some RBAC rules here?
thank you!
=== update 2022.06.04 ===
kubectl get role -n gitlab -o yaml
apiVersion: v1
items: []
kind: List
metadata:
  resourceVersion: ""
=== update 2022.06.05 ===
Looking at the logic in https://gitlab.com/gitlab-org/charts/gitlab-runner/-/blob/main/templates/role.yaml, I modified values.yaml with
clusterWideAccess: false
and now I get the role as:
k get role -n gitlab -o yaml
apiVersion: v1
items:
- apiVersion: rbac.authorization.k8s.io/v1
  kind: Role
  metadata:
    annotations:
      meta.helm.sh/release-name: gitlab-runner
      meta.helm.sh/release-namespace: gitlab
    creationTimestamp: "2022-06-05T03:49:57Z"
    labels:
      app: gitlab-runner
      app.kubernetes.io/managed-by: Helm
      chart: gitlab-runner-0.41.0
      heritage: Helm
      release: gitlab-runner
    name: gitlab-runner
    namespace: gitlab
    resourceVersion: "283754"
    uid: 8040b295-c9fc-47cb-8c5c-74cbf6c4d8a7
  rules:
  - apiGroups:
    - ""
    resources:
    - pods
    verbs:
    - list
    - get
    - watch
    - create
    - delete
  - apiGroups:
    - ""
    resources:
    - pods/exec
    verbs:
    - create
  - apiGroups:
    - ""
    resources:
    - pods/log
    verbs:
    - get
  - apiGroups:
    - ""
    resources:
    - pods/attach
    verbs:
    - list
    - get
    - create
    - delete
    - update
  - apiGroups:
    - ""
    resources:
    - secrets
    verbs:
    - list
    - get
    - create
    - delete
    - update
  - apiGroups:
    - ""
    resources:
    - configmaps
    verbs:
    - list
    - get
    - create
    - delete
    - update
  - apiGroups:
    - ""
    resources:
    - services
    verbs:
    - get
    - list
    - watch
    - create
    - update
    - patch
    - delete
  - apiGroups:
    - apps
    resources:
    - deployments
    verbs:
    - get
    - list
    - watch
    - create
    - update
    - patch
    - delete
kind: List
metadata:
  resourceVersion: ""
ServiceAccount and RoleBinding:
k get sa -n gitlab
NAME SECRETS AGE
default 1 3d2h
gitlab-runner 1 2d2h
k get RoleBinding -n gitlab
NAME ROLE AGE
gitlab-runner Role/gitlab-runner 9h
However, the same error persists.
=== update 2022.06.06 ===
I applied the following to fix the issue for the moment
kubectl create rolebinding --namespace=gitlab gitlab-runner-4 --role=gitlab-runner --serviceaccount=gitlab:default
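A less ad-hoc alternative, assuming the chart's runners.config passthrough supports it, is to point the build pods at the gitlab-runner ServiceAccount instead of default; the next question below uses the same setting:

runners:
  config: |
    [[runners]]
      [runners.kubernetes]
        namespace = "gitlab"
        service_account = "gitlab-runner"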
I am using Helm charts to deploy GitLab Runner into a Kubernetes cluster. I want the pods created when the runner is triggered to have a custom service account instead of the default one. I created a Role and a ClusterRole and did the role bindings.
However, I am getting the following error when running a CI job
From GitLab CI:
Running with gitlab-runner 15.0.0 (cetx4b)
on initial-runner -P-d1RhT
Preparing the "kubernetes" executor
00:00
Using Kubernetes namespace: namespace_test
Using Kubernetes executor with image registry.gitlab.com/docker-images/ubuntu-base:latest ...
Using attach strategy to execute scripts...
Preparing environment
00:05
ERROR: Job failed (system failure): prepare environment: setting up build pod: Timed out while waiting for ServiceAccount/gitlab-runner to be present in the cluster. Check https://docs.gitlab.com/runner/shells/index.html#shell-profile-loading for more information
Listing roles and service accounts:
# get rolebindings & clusterrolebindings
kubectl get rolebindings,clusterrolebindings -n namespace_test | grep gitlab-runner
# output
# rolebinding.rbac.authorization.k8s.io/gitlab-runner Role/gitlab-runner
# clusterrolebinding.rbac.authorization.k8s.io/gitlab-runner ClusterRole/gitlab-runner
---
# get serviceaccounts
kubectl get serviceaccounts -n namespace_test
# output
# NAME SECRETS AGE
# default 1 6h50m
# gitlab-runner 1 24m
# kubernetes-dashboard 1 6h50m
# mysql 2 6h49m
Helm values:
runners:
  concurrent: 8
  name: initial-runner
  config: |
    [[runners]]
      [runners.kubernetes]
        namespace = "namespace_test"
        image = "registry.gitlab.com/docker-images/ubuntu-base:latest"
        service_account = "gitlab-runner"
  tags: base
rbac:
  create: false
  serviceAccountName: gitlab-runner
Any ideas on how to solve this issue?
In my case, I forgot to give the gitlab-runner role the right permissions on the serviceaccounts resource.
Ensure the Role that is attached to your GitLab runner has the following specification:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: gitlab-runner
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["list", "get", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["pods/exec"]
    verbs: ["create"]
  - apiGroups: [""]
    resources: ["pods/log"]
    verbs: ["get"]
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["list", "get", "create", "delete", "update"]
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["list", "get", "create", "delete", "update"]
  - apiGroups: [""]
    resources: ["pods/attach"]
    verbs: ["list", "get", "create", "delete", "update"]
  - apiGroups: [""]
    resources: ["serviceaccounts"]
    verbs: ["list", "get", "create", "delete", "update"]
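For completeness, a sketch of the RoleBinding that attaches this Role to the runner's ServiceAccount, with names taken from the question's namespace_test setup:

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: gitlab-runner
  namespace: namespace_test
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: gitlab-runner
subjects:
  - kind: ServiceAccount
    name: gitlab-runner
    namespace: namespace_test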
I have a scaler service that was working fine until my recent Kubernetes version upgrade. Now I keep getting the following error (some info redacted):
Error from server (Forbidden): deployments.extensions "redacted" is forbidden: User "system:serviceaccount:namesspace:saname" cannot get resource "deployments/scale" in API group "extensions" in the namespace "namespace"
I have the ClusterRole below:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app: redacted
    chart: redacted
    heritage: Tiller
    release: redacted
  name: redacted
rules:
  - apiGroups:
      - '*'
    resources: ["configmaps", "endpoints", "services", "pods", "secrets", "namespaces", "serviceaccounts", "ingresses", "daemonsets", "statefulsets", "persistentvolumeclaims", "replicationcontrollers", "deployments", "replicasets"]
    verbs: ["get", "list", "watch", "edit", "delete", "update", "scale", "patch", "create"]
  - apiGroups:
      - '*'
    resources: ["nodes"]
    verbs: ["list", "get", "watch"]
scale is a subresource, not a verb. Include "deployments/scale" in the resources list.
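Concretely, a corrected rule looks like this sketch, keeping the original wildcard apiGroups; "scale" is dropped from verbs because scaling is expressed as get/update/patch on the deployments/scale subresource:

- apiGroups:
    - '*'
  resources: ["deployments", "deployments/scale"]
  verbs: ["get", "list", "watch", "update", "patch"]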