CephFS Unable to attach or mount volumes: unmounted volumes=[image-store] - kubernetes

I'm having trouble getting my Kube-registry up and running on cephfs. I'm using rook to set this cluster up. As you can see, I'm having trouble attaching the volume. Any idea what would be causing this issue? any help is appreciated.
kube-registry.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: cephfs-pvc
namespace: kube-system
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 1Gi
storageClassName: rook-cephfs
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: kube-registry
namespace: kube-system
labels:
k8s-app: kube-registry
kubernetes.io/cluster-service: "true"
spec:
replicas: 3
selector:
matchLabels:
k8s-app: kube-registry
template:
metadata:
labels:
k8s-app: kube-registry
kubernetes.io/cluster-service: "true"
spec:
containers:
- name: registry
image: registry:2
imagePullPolicy: Always
resources:
limits:
cpu: 100m
memory: 100Mi
env:
# Configuration reference: https://docs.docker.com/registry/configuration/
- name: REGISTRY_HTTP_ADDR
value: :5000
- name: REGISTRY_HTTP_SECRET
value: "Ple4seCh4ngeThisN0tAVerySecretV4lue"
- name: REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY
value: /var/lib/registry
volumeMounts:
- name: image-store
mountPath: /var/lib/registry
ports:
- containerPort: 5000
name: registry
protocol: TCP
livenessProbe:
httpGet:
path: /
port: registry
readinessProbe:
httpGet:
path: /
port: registry
volumes:
- name: image-store
persistentVolumeClaim:
claimName: cephfs-pvc
readOnly: false
Storagelass.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: rook-cephfs
# Change "rook-ceph" provisioner prefix to match the operator namespace if needed
provisioner: rook-ceph.cephfs.csi.ceph.com
parameters:
# clusterID is the namespace where operator is deployed.
clusterID: rook-ceph
# CephFS filesystem name into which the volume shall be created
fsName: myfs
# Ceph pool into which the volume shall be created
# Required for provisionVolume: "true"
pool: myfs-data0
# Root path of an existing CephFS volume
# Required for provisionVolume: "false"
# rootPath: /absolute/path
# The secrets contain Ceph admin credentials. These are generated automatically by the operator
# in the same namespace as the cluster.
csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
reclaimPolicy: Deletea
kubectl describe pods --namespace=kube-system kube-registry-58659ff99b-j2b4d
Name: kube-registry-58659ff99b-j2b4d
Namespace: kube-system
Priority: 0
Node: minikube/192.168.99.212
Start Time: Wed, 25 Nov 2020 13:19:35 -0500
Labels: k8s-app=kube-registry
kubernetes.io/cluster-service=true
pod-template-hash=58659ff99b
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/kube-registry-58659ff99b
Containers:
registry:
Container ID:
Image: registry:2
Image ID:
Port: 5000/TCP
Host Port: 0/TCP
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Limits:
cpu: 100m
memory: 100Mi
Requests:
cpu: 100m
memory: 100Mi
Liveness: http-get http://:registry/ delay=0s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:registry/ delay=0s timeout=1s period=10s #success=1 #failure=3
Environment:
REGISTRY_HTTP_ADDR: :5000
REGISTRY_HTTP_SECRET: Ple4seCh4ngeThisN0tAVerySecretV4lue
REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY: /var/lib/registry
Mounts:
/var/lib/registry from image-store (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-nw4th (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
image-store:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: cephfs-pvc
ReadOnly: false
default-token-nw4th:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-nw4th
Optional: false
QoS Class: Guaranteed
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 13m (x3 over 13m) default-scheduler running "VolumeBinding" filter plugin for pod "kube-registry-58659ff99b-j2b4d": pod has unbound immediate PersistentVolumeClaims
Normal Scheduled 13m default-scheduler Successfully assigned kube-system/kube-registry-58659ff99b-j2b4d to minikube
Warning FailedMount 2m6s (x5 over 11m) kubelet, minikube Unable to attach or mount volumes: unmounted volumes=[image-store], unattached volumes=[image-store default-token-nw4th]: timed out waiting for the condition
Warning FailedAttachVolume 59s (x6 over 11m) attachdetach-controller AttachVolume.Attach failed for volume "pvc-6eeff481-eb0a-4269-84c7-e744c9d639d9" : attachdetachment timeout for volume 0001-0009-rook-c
ceph provisioner logs, I restarted my cluster so the name will be different but output is the same
I1127 18:27:19.370543 1 csi-provisioner.go:121] Version: v2.0.0
I1127 18:27:19.370948 1 csi-provisioner.go:135] Building kube configs for running in cluster...
I1127 18:27:19.429190 1 connection.go:153] Connecting to unix:///csi/csi-provisioner.sock
I1127 18:27:21.561133 1 common.go:111] Probing CSI driver for readiness
W1127 18:27:21.905396 1 metrics.go:142] metrics endpoint will not be started because `metrics-address` was not specified.
I1127 18:27:22.060963 1 leaderelection.go:243] attempting to acquire leader lease rook-ceph/rook-ceph-cephfs-csi-ceph-com...
I1127 18:27:22.122303 1 leaderelection.go:253] successfully acquired lease rook-ceph/rook-ceph-cephfs-csi-ceph-com
I1127 18:27:22.323990 1 controller.go:820] Starting provisioner controller rook-ceph.cephfs.csi.ceph.com_csi-cephfsplugin-provisioner-797b67c54b-42jwc_4e14295b-f73d-4b94-bae9-ff4f2639b487!
I1127 18:27:22.324061 1 clone_controller.go:66] Starting CloningProtection controller
I1127 18:27:22.324205 1 clone_controller.go:84] Started CloningProtection controller
I1127 18:27:22.325240 1 volume_store.go:97] Starting save volume queue
I1127 18:27:22.426790 1 controller.go:869] Started provisioner controller rook-ceph.cephfs.csi.ceph.com_csi-cephfsplugin-provisioner-797b67c54b-42jwc_4e14295b-f73d-4b94-bae9-ff4f2639b487!
I1127 19:08:39.850493 1 controller.go:1317] provision "kube-system/cephfs-pvc" class "rook-cephfs": started
I1127 19:08:39.851034 1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"kube-system", Name:"cephfs-pvc", UID:"7c47bda7-0c7b-4ca0-b6d0-19d717ef2e06", APIVersion:"v1", ResourceVersion:"7744", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "kube-system/cephfs-pvc"
I1127 19:08:43.670226 1 controller.go:1420] provision "kube-system/cephfs-pvc" class "rook-cephfs": volume "pvc-7c47bda7-0c7b-4ca0-b6d0-19d717ef2e06" provisioned
I1127 19:08:43.670262 1 controller.go:1437] provision "kube-system/cephfs-pvc" class "rook-cephfs": succeeded
E1127 19:08:43.692108 1 controller.go:1443] couldn't create key for object pvc-7c47bda7-0c7b-4ca0-b6d0-19d717ef2e06: object has no meta: object does not implement the Object interfaces
I1127 19:08:43.692189 1 controller.go:1317] provision "kube-system/cephfs-pvc" class "rook-cephfs": started
I1127 19:08:43.692205 1 controller.go:1326] provision "kube-system/cephfs-pvc" class "rook-cephfs": persistentvolume "pvc-7c47bda7-0c7b-4ca0-b6d0-19d717ef2e06" already exists, skipping
I1127 19:08:43.692220 1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"kube-system", Name:"cephfs-pvc", UID:"7c47bda7-0c7b-4ca0-b6d0-19d717ef2e06", APIVersion:"v1", ResourceVersion:"7744", FieldPath:""}): type: 'Normal' reason: 'ProvisioningSucceeded' Successfully provisioned

In the pasted YAML for your StorageClass, you have:
reclaimPolicy: Deletea
Was that a paste issue? Regardless, this is likely what is causing your problem.
I just had this exact problem with some of my Ceph RBD volumes, and the reason for it was that I was using a StorageClass that had
reclaimPolicy: Delete
However, the cephcsi driver was not configured to support it (and I don't think it actually supports it either).
Using a StorageClass with
reclaimPolicy: Retain
fixed the issue.
To check this on your cluster, run the following:
$ kubectl get sc rook-cephfs -o yaml
And look for the line that starts with reclaimPolicy:
Then, look at the csidriver your StorageClass is using. In your case it is rook-ceph.cephfs.csi.ceph.com
$ kubectl get csidriver rook-ceph.cephfs.csi.ceph.com -o yaml
And look for the entries under volumeLifecycleModes
apiVersion: storage.k8s.io/v1beta1
kind: CSIDriver
metadata:
creationTimestamp: "2020-11-16T22:18:55Z"
name: rook-ceph.cephfs.csi.ceph.com
resourceVersion: "29863971"
selfLink: /apis/storage.k8s.io/v1beta1/csidrivers/rook-ceph.cephfs.csi.ceph.com
uid: a9651d30-935d-4a7d-a7c9-53d5bc90c28c
spec:
attachRequired: true
podInfoOnMount: false
volumeLifecycleModes:
- Persistent
If the only entry under volumeLifecycleModes is Persistent, then your driver is not configured to support reclaimPolicy: Delete.
If instead you see
volumeLifecycleModes:
- Persistent
- Ephemeral
Then your driver should support reclaimPolicy: Delete

Related

Deploy MongoDB in k8s for development purposes

I'm totally newbie in k8s, but read some basic about resourses in k8s and helm trying to create simple cluster in minikube:
Start minikube:
minikube start --cpus "4" --disk-size "40000mb"
Create namespace:
kubectl create namespace test
Using binami helm charts for mongo with custom values.yaml:
replicaCount: 1
architecture: standalone
persistence:
enabled: true
existingClaim: "test/mongodb-data"
nameOverride: test-mongodb
service:
nameOverride: test-mongodb
type: NodePort
nodePorts:
mongodb: 30000
namespaceOverride: test
auth:
rootUser: admin
rootPassword: root
usernames: ["admin"]
passwords: ["123"]
databases: ["test"]
Create volume.yaml for mongodb:
apiVersion: v1
kind: PersistentVolume
metadata:
name: mongodb-data
namespace: test
labels:
type: local
annotations:
volume.alpha.kubernetes.io/storage-class: standard
spec:
storageClassName: local-storage
capacity:
storage: 10Gi
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Recycle
hostPath:
path: "/opt/mongodb-data"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: mongodb-data
namespace: test
spec:
storageClassName: local-storage
accessModes:
- ReadWriteMany
resources:
requests:
storage: 1Gi
Finally trying apply to cluster:
helm upgrade --install mongodb --namespace test --values ./k8s/backend/charts/mongodb/values.yaml bitnami/mongodb --wait --debug
As result in console see this:
history.go:56: [debug] getting history for release mongodb
Release "mongodb" does not exist. Installing it now.
install.go:192: [debug] Original chart version: ""
install.go:209: [debug] CHART PATH: C:\Temp\helm\repository\mongodb-13.6.4.tgz
client.go:128: [debug] creating 5 resource(s)
wait.go:66: [debug] beginning wait for 5 resources with timeout of 5m0s
ready.go:277: [debug] Deployment is not ready: test/mongodb-test-mongodb. 0 out of 1 expected pods are ready
...
Error: timed out waiting for the condition
helm.go:84: [debug] timed out waiting for the condition
When run see:
kubectl describe pods -n test mongodb-test-mongodb-7585fc9c48-rk45r
Name: mongodb-test-mongodb-7585fc9c48-rk45r
Namespace: test
Priority: 0
Node: <none>
Labels: app.kubernetes.io/component=mongodb
app.kubernetes.io/instance=mongodb
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=qabat-mongodb
helm.sh/chart=mongodb-13.6.4
pod-template-hash=7585fc9c48
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/mongodb-test-mongodb-7585fc9c48
Containers:
mongodb:
Image: docker.io/bitnami/mongodb:6.0.4-debian-11-r0
Port: 27017/TCP
Host Port: 0/TCP
Liveness: exec [/bitnami/scripts/ping-mongodb.sh] delay=30s timeout=10s period=20s #success=1 #failure=6
Readiness: exec [/bitnami/scripts/readiness-probe.sh] delay=5s timeout=5s period=10s #success=1 #failure=6
Environment:
BITNAMI_DEBUG: false
MONGODB_EXTRA_USERNAMES: admin
MONGODB_EXTRA_DATABASES: test
MONGODB_EXTRA_PASSWORDS: <set to the key 'mongodb-passwords' in secret 'mongodb-qabat-mongodb'> Optional: false
MONGODB_ROOT_USER: admin
MONGODB_ROOT_PASSWORD: <set to the key 'mongodb-root-password' in secret 'mongodb-qabat-mongodb'> Optional: false
ALLOW_EMPTY_PASSWORD: no
MONGODB_SYSTEM_LOG_VERBOSITY: 0
MONGODB_DISABLE_SYSTEM_LOG: no
MONGODB_DISABLE_JAVASCRIPT: no
MONGODB_ENABLE_JOURNAL: yes
MONGODB_PORT_NUMBER: 27017
MONGODB_ENABLE_IPV6: no
MONGODB_ENABLE_DIRECTORY_PER_DB: no
Mounts:
/bitnami/mongodb from datadir (rw)
/bitnami/scripts from common-scripts (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-xgpl5 (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
common-scripts:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: mongodb-test-mongodb-common-scripts
Optional: false
datadir:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: test/mongodb-data
ReadOnly: false
kube-api-access-xgpl5:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 2m11s (x3 over 12m) default-scheduler 0/1 nodes are available: 1 persistentvolumeclaim "test/mongodb-data" not found. preemption: 0/1 nodes are availab
le: 1 Preemption is not helpful for scheduling.
What's wrong and how it fix? I try same flow with rabbitmq helm, but same result...

Kubernetes use the same volumeMount in initContainer and Container

I am trying to get a volume mounted as a non-root user in one of my containers. I'm trying an approach from this SO post using an initContainer to set the correct user, but when I try to start the configuration I get an "unbound immediate PersistentVolumneClaims" error. I suspect it's because the volume is mounted in both my initContainer and container, but I'm not sure why that would be the issue: I can see the initContainer taking the claim, but I would have thought when it exited that it would release it, letting the normal container take the claim. Any ideas or alternatives to getting the directory mounted as a non-root user? I did try using securityContext/fsGroup, but that seemed to have no effect. The /var/rdf4j directory below is the one that is being mounted as root.
Configuration:
apiVersion: v1
kind: PersistentVolume
metadata:
name: triplestore-data-storage-dir
labels:
type: local
spec:
capacity:
storage: 10Gi
accessModes:
- ReadWriteMany
storageClassName: local-storage
volumeMode: Filesystem
persistentVolumeReclaimPolicy: Delete
hostPath:
path: /run/desktop/mnt/host/d/workdir/k8s-data/triplestore
type: DirectoryOrCreate
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: triplestore-data-storage
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 1Gi
storageClassName: local-storage
volumeName: "triplestore-data-storage-dir"
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: triplestore
labels:
app: demo
role: triplestore
spec:
selector:
matchLabels:
app: demo
role: triplestore
replicas: 1
template:
metadata:
labels:
app: demo
role: triplestore
spec:
containers:
- name: triplestore
image: eclipse/rdf4j-workbench:amd64-3.5.0
imagePullPolicy: Always
ports:
- name: http
protocol: TCP
containerPort: 8080
resources:
requests:
cpu: 100m
memory: 200Mi
volumeMounts:
- name: storage
mountPath: /var/rdf4j
initContainers:
- name: take-data-dir-ownership
image: eclipse/rdf4j-workbench:amd64-3.5.0
command:
- chown
- -R
- 100:65533
- /var/rdf4j
volumeMounts:
- name: storage
mountPath: /var/rdf4j
volumes:
- name: storage
persistentVolumeClaim:
claimName: "triplestore-data-storage"
kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
triplestore-data-storage Bound triplestore-data-storage-dir 10Gi RWX local-storage 13s
kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
triplestore-data-storage-dir 10Gi RWX Delete Bound default/triplestore-data-storage local-storage 17s
kubectl get events
LAST SEEN TYPE REASON OBJECT MESSAGE
21s Warning FailedScheduling pod/triplestore-6d6876f49-2s84c 0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.
19s Normal Scheduled pod/triplestore-6d6876f49-2s84c Successfully assigned default/triplestore-6d6876f49-2s84c to docker-desktop
3s Normal Pulled pod/triplestore-6d6876f49-2s84c Container image "eclipse/rdf4j-workbench:amd64-3.5.0" already present on machine
3s Normal Created pod/triplestore-6d6876f49-2s84c Created container take-data-dir-ownership
3s Normal Started pod/triplestore-6d6876f49-2s84c Started container take-data-dir-ownership
2s Warning BackOff pod/triplestore-6d6876f49-2s84c Back-off restarting failed container
46m Normal Pulled pod/triplestore-6d6876f49-9n5kt Container image "eclipse/rdf4j-workbench:amd64-3.5.0" already present on machine
79s Warning BackOff pod/triplestore-6d6876f49-9n5kt Back-off restarting failed container
21s Normal SuccessfulCreate replicaset/triplestore-6d6876f49 Created pod: triplestore-6d6876f49-2s84c
21s Normal ScalingReplicaSet deployment/triplestore Scaled up replica set triplestore-6d6876f49 to 1
kubectl describe pods/triplestore-6d6876f49-tw8r8
Name: triplestore-6d6876f49-tw8r8
Namespace: default
Priority: 0
Node: docker-desktop/192.168.65.4
Start Time: Mon, 17 Jan 2022 10:17:20 -0500
Labels: app=demo
pod-template-hash=6d6876f49
role=triplestore
Annotations: <none>
Status: Pending
IP: 10.1.2.133
IPs:
IP: 10.1.2.133
Controlled By: ReplicaSet/triplestore-6d6876f49
Init Containers:
take-data-dir-ownership:
Container ID: docker://89e7b1e3ae76c30180ee5083624e1bf5f30b55fd95bf1c24422fabe41ae74408
Image: eclipse/rdf4j-workbench:amd64-3.5.0
Image ID: docker-pullable://registry.com/publicrepos/docker_cache/eclipse/rdf4j-workbench#sha256:14621ad610b0d0269dedd9939ea535348cc6c147f9bd47ba2039488b456118ed
Port: <none>
Host Port: <none>
Command:
chown
-R
100:65533
/var/rdf4j
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Mon, 17 Jan 2022 10:22:59 -0500
Finished: Mon, 17 Jan 2022 10:22:59 -0500
Ready: False
Restart Count: 6
Environment: <none>
Mounts:
/var/rdf4j from storage (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-s8wdv (ro)
Containers:
triplestore:
Container ID:
Image: eclipse/rdf4j-workbench:amd64-3.5.0
Image ID:
Port: 8080/TCP
Host Port: 0/TCP
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Requests:
cpu: 100m
memory: 200Mi
Environment: <none>
Mounts:
/var/rdf4j from storage (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-s8wdv (ro)
Conditions:
Type Status
Initialized False
Ready False
ContainersReady False
PodScheduled True
Volumes:
storage:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: triplestore-data-storage
ReadOnly: false
kube-api-access-s8wdv:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 6m24s default-scheduler 0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.
Normal Scheduled 6m13s default-scheduler Successfully assigned default/triplestore-6d6876f49-tw8r8 to docker-desktop
Normal Pulled 4m42s (x5 over 6m12s) kubelet Container image "eclipse/rdf4j-workbench:amd64-3.5.0" already present on machine
Normal Created 4m42s (x5 over 6m12s) kubelet Created container take-data-dir-ownership
Normal Started 4m42s (x5 over 6m12s) kubelet Started container take-data-dir-ownership
Warning BackOff 70s (x26 over 6m10s) kubelet Back-off restarting failed container
Solution
As it turns out the problem was that the initContainer wasn't running as root, it was running as the default user of the container, and so didn't have the permissions to run the chown command. In the linked SO comment, this was the first comment to the answer, with the response being that the initContainer ran as root - this has apparently changed in newer versions of kubernetes. There is a solution though, you can set the securityContext on the container to run as root, giving it permission to run the chown command, and that successfully allows the volume to be mounted as a non-root user. Here's the final configuration of the initContainer.
initContainers:
- name: take-data-dir-ownership
image: eclipse/rdf4j-workbench:amd64-3.5.0
securityContext:
runAsUser: 0
command:
- chown
- -R
- 100:65533
- /var/rdf4j
volumeMounts:
- name: storage
mountPath: /var/rdf4j
1 pod has unbound immediate PersistentVolumeClaims. - this error means the pod cannot bound to the PVC on the node where it has been scheduled to run on. This can happen when the PVC bounded to a PV that refers to a location that is not valid on the node that the pod is scheduled to run on. It will be helpful if you can post the complete output of kubectl get nodes -o wide, kubectl describe pvc triplestore-data-storage, kubectl describe pv triplestore-data-storage-dir to the question.
The mean time, PVC/PV is optional when using hostPath, can you try the following spec and see if the pod can come online:
apiVersion: apps/v1
kind: Deployment
metadata:
name: triplestore
labels:
app: demo
role: triplestore
spec:
selector:
matchLabels:
app: demo
role: triplestore
replicas: 1
template:
metadata:
labels:
app: demo
role: triplestore
spec:
containers:
- name: triplestore
image: eclipse/rdf4j-workbench:amd64-3.5.0
imagePullPolicy: IfNotPresent
ports:
- name: http
protocol: TCP
containerPort: 8080
resources:
requests:
cpu: 100m
memory: 200Mi
volumeMounts:
- name: storage
mountPath: /var/rdf4j
initContainers:
- name: take-data-dir-ownership
image: eclipse/rdf4j-workbench:amd64-3.5.0
imagePullPolicy: IfNotPresent
securityContext:
runAsUser: 0
command:
- chown
- -R
- 100:65533
- /var/rdf4j
volumeMounts:
- name: storage
mountPath: /var/rdf4j
volumes:
- name: storage
hostPath:
path: /run/desktop/mnt/host/d/workdir/k8s-data/triplestore
type: DirectoryOrCreate

Failed to bind to volume when installing RabbitMQ on K8S

I'm trying to install rabbit-mq using helm, but installation fails because of volume issues.
This is my storage class:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: Immediate
This is my persistent volume:
apiVersion: v1
kind: PersistentVolume
metadata:
name: main-pv
spec:
capacity:
storage: 100Gi
volumeMode: Filesystem
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Delete
storageClassName: local-storage
local:
path: /media/2TB-DATA/k8s-pv
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- node-dev
This is the output to list my storage and pv:
# kubectl get storageclass
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
local-storage (default) kubernetes.io/no-provisioner Delete Immediate false 14m
# kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
main-pv 100Gi RWX Delete Available local-storage 40m
After I install rabbit-mq:
helm install rabbitmq bitnami/rabbitmq
The pod is in Pending state, and I see this error:
# kubectl describe pvc
Name: data-rabbitmq-0
Namespace: default
StorageClass:
Status: Pending
Volume:
Labels: app.kubernetes.io/instance=rabbitmq
app.kubernetes.io/name=rabbitmq
Annotations: <none>
Finalizers: [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode: Filesystem
Mounted By: rabbitmq-0
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal FailedBinding 3m20s (x4363 over 18h) persistentvolume-controller no persistent volumes available for this claim and no storage class is set
What am I doing wrong?
Maybe platform related. Where did you try to do that? Im asking cause just cant reproduce on GKE - it works fine
Cluster version, labels, nodes
kubectl get nodes --show-labels
NAME STATUS ROLES AGE VERSION LABELS
gke-cluster-1-default-pool-82008fd9-8x81 Ready <none> 96d v1.14.10-gke.36 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/fluentd-ds-ready=true,beta.kubernetes.io/instance-type=n1-standard-1,beta.kubernetes.io/os=linux,cloud.google.com/gke-nodepool=default-pool,cloud.google.com/gke-os-distribution=cos,failure-domain.beta.kubernetes.io/region=europe-west4,failure-domain.beta.kubernetes.io/zone=europe-west4-b,kubernetes.io/arch=amd64,kubernetes.io/hostname=gke-cluster-1-default-pool-82008fd9-8x81,kubernetes.io/os=linux,test=node
gke-cluster-1-default-pool-82008fd9-qkp7 Ready <none> 96d v1.14.10-gke.36 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/fluentd-ds-ready=true,beta.kubernetes.io/instance-type=n1-standard-1,beta.kubernetes.io/os=linux,cloud.google.com/gke-nodepool=default-pool,cloud.google.com/gke-os-distribution=cos,failure-domain.beta.kubernetes.io/region=europe-west4,failure-domain.beta.kubernetes.io/zone=europe-west4-b,kubernetes.io/arch=amd64,kubernetes.io/hostname=gke-cluster-1-default-pool-82008fd9-qkp7,kubernetes.io/os=linux,test=node
gke-cluster-1-default-pool-82008fd9-tlc7 Ready <none> 96d v1.14.10-gke.36 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/fluentd-ds-ready=true,beta.kubernetes.io/instance-type=n1-standard-1,beta.kubernetes.io/os=linux,cloud.google.com/gke-nodepool=default-pool,cloud.google.com/gke-os-distribution=cos,failure-domain.beta.kubernetes.io/region=europe-west4,failure-domain.beta.kubernetes.io/zone=europe-west4-b,kubernetes.io/arch=amd64,kubernetes.io/hostname=gke-cluster-1-default-pool-82008fd9-tlc7,kubernetes.io/os=linux,test=node
PV, Storageclass
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: Immediate
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: main-pv
spec:
capacity:
storage: 10Gi
volumeMode: Filesystem
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Delete
storageClassName: local-storage
local:
path: /mnt/data
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: test
operator: In
values:
- node-test
Installing chart:
helm install rabbitmq bitnami/rabbitmq
...
kubectl get pods
NAME READY STATUS RESTARTS AGE
...
pod/rabbitmq-0 1/1 Running 0 3m40s
...
kubectl describe pod rabbitmq-0
Name: rabbitmq-0
Namespace: default
Priority: 0
Node: gke-cluster-1-default-pool-82008fd9-tlc7/10.164.0.29
Start Time: Thu, 03 Sep 2020 07:34:10 +0000
Labels: app.kubernetes.io/instance=rabbitmq
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=rabbitmq
controller-revision-hash=rabbitmq-8687f4cb9f
helm.sh/chart=rabbitmq-7.6.4
statefulset.kubernetes.io/pod-name=rabbitmq-0
Annotations: checksum/secret: 433e8ea7590e8d9f1bb94ed2f55e6d9b95f8abef722a917b97a9e916921d7ac5
kubernetes.io/limit-ranger: LimitRanger plugin set: cpu request for container rabbitmq
Status: Running
IP: 10.16.2.13
IPs: <none>
Controlled By: StatefulSet/rabbitmq
Containers:
rabbitmq:
Container ID: docker://b1a567522f50ac4c0663db2d9eca5fd8721d9a3d900ac38bb58f0cae038162f2
Image: docker.io/bitnami/rabbitmq:3.8.7-debian-10-r0
Image ID: docker-pullable://bitnami/rabbitmq#sha256:9abd53aeef6d222fec318c97a75dd50ce19c16b11cb83a3e4fb91c4047ea0d4d
Ports: 5672/TCP, 25672/TCP, 15672/TCP, 4369/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP
State: Running
Started: Thu, 03 Sep 2020 07:34:34 +0000
Ready: True
Restart Count: 0
Requests:
cpu: 100m
Liveness: exec [/bin/bash -ec rabbitmq-diagnostics -q check_running] delay=120s timeout=20s period=30s #success=1 #failure=6
Readiness: exec [/bin/bash -ec rabbitmq-diagnostics -q check_running] delay=10s timeout=20s period=30s #success=1 #failure=3
Environment:
BITNAMI_DEBUG: false
MY_POD_IP: (v1:status.podIP)
MY_POD_NAME: rabbitmq-0 (v1:metadata.name)
MY_POD_NAMESPACE: default (v1:metadata.namespace)
K8S_SERVICE_NAME: rabbitmq-headless
K8S_ADDRESS_TYPE: hostname
RABBITMQ_FORCE_BOOT: no
RABBITMQ_NODE_NAME: rabbit#$(MY_POD_NAME).$(K8S_SERVICE_NAME).$(MY_POD_NAMESPACE).svc.cluster.local
K8S_HOSTNAME_SUFFIX: .$(K8S_SERVICE_NAME).$(MY_POD_NAMESPACE).svc.cluster.local
RABBITMQ_MNESIA_DIR: /bitnami/rabbitmq/mnesia/$(RABBITMQ_NODE_NAME)
RABBITMQ_LDAP_ENABLE: no
RABBITMQ_LOGS: -
RABBITMQ_ULIMIT_NOFILES: 65536
RABBITMQ_USE_LONGNAME: true
RABBITMQ_ERL_COOKIE: <set to the key 'rabbitmq-erlang-cookie' in secret 'rabbitmq'> Optional: false
RABBITMQ_USERNAME: user
RABBITMQ_PASSWORD: <set to the key 'rabbitmq-password' in secret 'rabbitmq'> Optional: false
RABBITMQ_PLUGINS: rabbitmq_management, rabbitmq_peer_discovery_k8s, rabbitmq_auth_backend_ldap
Mounts:
/bitnami/rabbitmq/conf from configuration (rw)
/bitnami/rabbitmq/mnesia from data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from rabbitmq-token-mclhw (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: data-rabbitmq-0
ReadOnly: false
configuration:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: rabbitmq-config
Optional: false
rabbitmq-token-mclhw:
Type: Secret (a volume populated by a Secret)
SecretName: rabbitmq-token-mclhw
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 6m42s default-scheduler Successfully assigned default/rabbitmq-0 to gke-cluster-1-default-pool-82008fd9-tlc7
Normal SuccessfulAttachVolume 6m36s attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-8145821b-ed09-11ea-b464-42010aa400e3"
Normal Pulling 6m32s kubelet, gke-cluster-1-default-pool-82008fd9-tlc7 Pulling image "docker.io/bitnami/rabbitmq:3.8.7-debian-10-r0"
Normal Pulled 6m22s kubelet, gke-cluster-1-default-pool-82008fd9-tlc7 Successfully pulled image "docker.io/bitnami/rabbitmq:3.8.7-debian-10-r0"
Normal Created 6m18s kubelet, gke-cluster-1-default-pool-82008fd9-tlc7 Created container rabbitmq
Normal Started 6m18s kubelet, gke-cluster-1-default-pool-82008fd9-tlc7 Started container rabbitmq

k8s - Cinder "0/x nodes are available: x node(s) had volume node affinity conflict"

I have my own cluster k8s. I'm trying to link the cluster to openstack / cinder.
When I'm creating a PVC, I can a see a PV in k8s and the volume in Openstack.
But when I'm linking a pod with the PVC, I have the message k8s - Cinder "0/x nodes are available: x node(s) had volume node affinity conflict".
My yml test:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: classic
provisioner: kubernetes.io/cinder
parameters:
type: classic
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: pvc-infra-consuldata4
namespace: infra
spec:
storageClassName: classic
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: consul
namespace: infra
labels:
app: consul
spec:
replicas: 1
selector:
matchLabels:
app: consul
template:
metadata:
labels:
app: consul
spec:
containers:
- name: consul
image: consul:1.4.3
volumeMounts:
- name: data
mountPath: /consul
resources:
requests:
cpu: 100m
limits:
cpu: 500m
command: ["consul", "agent", "-server", "-bootstrap", "-ui", "-bind", "0.0.0.0", "-client", "0.0.0.0", "-data-dir", "/consul"]
volumes:
- name: data
persistentVolumeClaim:
claimName: pvc-infra-consuldata4
The result:
kpro describe pvc -n infra
Name: pvc-infra-consuldata4
Namespace: infra
StorageClass: classic
Status: Bound
Volume: pvc-76bfdaf1-40bb-11e9-98de-fa163e53311c
Labels:
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"v1","kind":"PersistentVolumeClaim","metadata":{"annotations":{},"name":"pvc-infra-consuldata4","namespace":"infra"},"spec":...
pv.kubernetes.io/bind-completed: yes
pv.kubernetes.io/bound-by-controller: yes
volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/cinder
Finalizers: [kubernetes.io/pvc-protection]
Capacity: 1Gi
Access Modes: RWO
VolumeMode: Filesystem
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ProvisioningSucceeded 61s persistentvolume-controller Successfully provisioned volume pvc-76bfdaf1-40bb-11e9-98de-fa163e53311c using kubernetes.io/cinder
Mounted By: consul-85684dd7fc-j84v7
kpro describe po -n infra consul-85684dd7fc-j84v7
Name: consul-85684dd7fc-j84v7
Namespace: infra
Priority: 0
PriorityClassName: <none>
Node: <none>
Labels: app=consul
pod-template-hash=85684dd7fc
Annotations: <none>
Status: Pending
IP:
Controlled By: ReplicaSet/consul-85684dd7fc
Containers:
consul:
Image: consul:1.4.3
Port: <none>
Host Port: <none>
Command:
consul
agent
-server
-bootstrap
-ui
-bind
0.0.0.0
-client
0.0.0.0
-data-dir
/consul
Limits:
cpu: 2
Requests:
cpu: 500m
Environment: <none>
Mounts:
/consul from data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-nxchv (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: pvc-infra-consuldata4
ReadOnly: false
default-token-nxchv:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-nxchv
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 36s (x6 over 2m40s) default-scheduler 0/6 nodes are available: 6 node(s) had volume node affinity conflict.
Why K8s successful create the Cinder volume, but it can't schedule the pod ?
Try finding out the nodeAffinity of your persistent volume:
$ kubctl describe pv pvc-76bfdaf1-40bb-11e9-98de-fa163e53311c
Node Affinity:
Required Terms:
Term 0: kubernetes.io/hostname in [xxx]
Then try to figure out if xxx matches the node label yyy that your pod is supposed to run on:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
yyy Ready worker 8d v1.15.3
If they don't match, you will have the "x node(s) had volume node affinity conflict" error, and you need to re-create the persistent volume with the correct nodeAffinity configuration.
I also ran into this issue when I forgot to deploy the EBS CSI driver before I tried to get my pod to connect to it.
kubectl apply -k "github.com/kubernetes-sigs/aws-ebs-csi-driver/deploy/kubernetes/overlays/stable/?ref=master"

Kubernetes pod pending when a new volume is attached (EKS)

Let me describe my scenario:
TL;DR
When I create a deployment on Kubernetes with 1 attached volume, everything works perfectly. When I create the same deployment, but with a second volume attached (total: 2 volumes), the pod gets stuck on "Pending" with errors:
pod has unbound PersistentVolumeClaims (repeated 2 times)
0/2 nodes are available: 2 node(s) had no available volume zone.
Already checked that the volumes are created in the correct availability zones.
Detailed description
I have a cluster set up using Amazon EKS, with 2 nodes. I have the following default storage class:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: gp2
annotations:
storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs
parameters:
type: gp2
reclaimPolicy: Retain
mountOptions:
- debug
And I have a mongodb deployment which needs two volumes, one mounted on /data/db folder, and the other mounted in some random directory I need. Here is an minimal yaml used to create the three components (I commented some lines on purpose):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
namespace: my-project
creationTimestamp: null
labels:
io.kompose.service: my-project-db-claim0
name: my-project-db-claim0
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 5Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
namespace: my-project
creationTimestamp: null
labels:
io.kompose.service: my-project-db-claim1
name: my-project-db-claim1
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
namespace: my-project
name: my-project-db
spec:
replicas: 1
strategy:
type: Recreate
template:
metadata:
labels:
name: my-db
spec:
containers:
- name: my-project-db-container
image: mongo
imagePullPolicy: Always
resources: {}
volumeMounts:
- mountPath: /my_dir
name: my-project-db-claim0
# - mountPath: /data/db
# name: my-project-db-claim1
ports:
- containerPort: 27017
restartPolicy: Always
volumes:
- name: my-project-db-claim0
persistentVolumeClaim:
claimName: my-project-db-claim0
# - name: my-project-db-claim1
# persistentVolumeClaim:
# claimName: my-project-db-claim1
That yaml works perfectly. The output for the volumes is:
$ kubectl describe pv
Name: pvc-307b755a-039e-11e9-b78d-0a68bcb24bc6
Labels: failure-domain.beta.kubernetes.io/region=us-east-1
failure-domain.beta.kubernetes.io/zone=us-east-1c
Annotations: kubernetes.io/createdby: aws-ebs-dynamic-provisioner
pv.kubernetes.io/bound-by-controller: yes
pv.kubernetes.io/provisioned-by: kubernetes.io/aws-ebs
Finalizers: [kubernetes.io/pv-protection]
StorageClass: gp2
Status: Bound
Claim: my-project/my-project-db-claim0
Reclaim Policy: Delete
Access Modes: RWO
Capacity: 5Gi
Node Affinity: <none>
Message:
Source:
Type: AWSElasticBlockStore (a Persistent Disk resource in AWS)
VolumeID: aws://us-east-1c/vol-xxxxx
FSType: ext4
Partition: 0
ReadOnly: false
Events: <none>
Name: pvc-308d8979-039e-11e9-b78d-0a68bcb24bc6
Labels: failure-domain.beta.kubernetes.io/region=us-east-1
failure-domain.beta.kubernetes.io/zone=us-east-1b
Annotations: kubernetes.io/createdby: aws-ebs-dynamic-provisioner
pv.kubernetes.io/bound-by-controller: yes
pv.kubernetes.io/provisioned-by: kubernetes.io/aws-ebs
Finalizers: [kubernetes.io/pv-protection]
StorageClass: gp2
Status: Bound
Claim: my-project/my-project-db-claim1
Reclaim Policy: Delete
Access Modes: RWO
Capacity: 10Gi
Node Affinity: <none>
Message:
Source:
Type: AWSElasticBlockStore (a Persistent Disk resource in AWS)
VolumeID: aws://us-east-1b/vol-xxxxx
FSType: ext4
Partition: 0
ReadOnly: false
Events: <none>
And the pod output:
$ kubectl describe pods
Name: my-project-db-7d48567b48-slncd
Namespace: my-project
Priority: 0
PriorityClassName: <none>
Node: ip-192-168-212-194.ec2.internal/192.168.212.194
Start Time: Wed, 19 Dec 2018 15:55:58 +0100
Labels: name=my-db
pod-template-hash=3804123604
Annotations: <none>
Status: Running
IP: 192.168.216.33
Controlled By: ReplicaSet/my-project-db-7d48567b48
Containers:
my-project-db-container:
Container ID: docker://cf8222f15e395b02805c628b6addde2d77de2245aed9406a48c7c6f4dccefd4e
Image: mongo
Image ID: docker-pullable://mongo#sha256:0823cc2000223420f88b20d5e19e6bc252fa328c30d8261070e4645b02183c6a
Port: 27017/TCP
Host Port: 0/TCP
State: Running
Started: Wed, 19 Dec 2018 15:56:42 +0100
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/my_dir from my-project-db-claim0 (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-pf9ks (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
my-project-db-claim0:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: my-project-db-claim0
ReadOnly: false
default-token-pf9ks:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-pf9ks
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 7m22s (x5 over 7m23s) default-scheduler pod has unbound PersistentVolumeClaims (repeated 2 times)
Normal Scheduled 7m21s default-scheduler Successfully assigned my-project/my-project-db-7d48567b48-slncd to ip-192-168-212-194.ec2.internal
Normal SuccessfulMountVolume 7m21s kubelet, ip-192-168-212-194.ec2.internal MountVolume.SetUp succeeded for volume "default-token-pf9ks"
Warning FailedAttachVolume 7m13s (x5 over 7m21s) attachdetach-controller AttachVolume.Attach failed for volume "pvc-307b755a-039e-11e9-b78d-0a68bcb24bc6" : "Error attaching EBS volume \"vol-01a863d0aa7c7e342\"" to instance "i-0a7dafbbdfeabc50b" since volume is in "creating" state
Normal SuccessfulAttachVolume 7m1s attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-307b755a-039e-11e9-b78d-0a68bcb24bc6"
Normal SuccessfulMountVolume 6m48s kubelet, ip-192-168-212-194.ec2.internal MountVolume.SetUp succeeded for volume "pvc-307b755a-039e-11e9-b78d-0a68bcb24bc6"
Normal Pulling 6m48s kubelet, ip-192-168-212-194.ec2.internal pulling image "mongo"
Normal Pulled 6m39s kubelet, ip-192-168-212-194.ec2.internal Successfully pulled image "mongo"
Normal Created 6m38s kubelet, ip-192-168-212-194.ec2.internal Created container
Normal Started 6m37s kubelet, ip-192-168-212-194.ec2.internal Started container
Everything is created without any problems. But if I uncomment the lines in the yaml so two volumes are attached to the db deployment, the pv output is the same as earlier, but the pod gets stuck on pending with the following output:
$ kubectl describe pods
Name: my-project-db-b8b8d8bcb-l64d7
Namespace: my-project
Priority: 0
PriorityClassName: <none>
Node: <none>
Labels: name=my-db
pod-template-hash=646484676
Annotations: <none>
Status: Pending
IP:
Controlled By: ReplicaSet/my-project-db-b8b8d8bcb
Containers:
my-project-db-container:
Image: mongo
Port: 27017/TCP
Host Port: 0/TCP
Environment: <none>
Mounts:
/data/db from my-project-db-claim1 (rw)
/my_dir from my-project-db-claim0 (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-pf9ks (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
my-project-db-claim0:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: my-project-db-claim0
ReadOnly: false
my-project-db-claim1:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: my-project-db-claim1
ReadOnly: false
default-token-pf9ks:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-pf9ks
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 60s (x5 over 60s) default-scheduler pod has unbound PersistentVolumeClaims (repeated 2 times)
Warning FailedScheduling 2s (x16 over 59s) default-scheduler 0/2 nodes are available: 2 node(s) had no available volume zone.
I've already read these two issues:
Dynamic volume provisioning creates EBS volume in the wrong availability zone
PersistentVolume on EBS can be created in availability zones with no nodes (Closed)
But I already checked that the volumes are created in the same zones as the cluster nodes instances. In fact, EKS creates two EBS by default in us-east-1b and us-east-1c zones and those volumes works. The volumes created by the posted yaml are on those regions too.
See this article: https://kubernetes.io/blog/2018/10/11/topology-aware-volume-provisioning-in-kubernetes/
The gist is that you want to update your storageclass to include:
volumeBindingMode: WaitForFirstConsumer
This causes the PV to not be created until the pod is scheduled. It fixed a similar problem for me.
Sounds like it's trying to create a volume in an availability zone where you don't have any volumes on. You can try restricting your StorageClass to the availability zones where you have nodes.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: gp2
annotations:
storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs
parameters:
type: gp2
reclaimPolicy: Retain
mountOptions:
- debug
allowedTopologies:
- matchLabelExpressions:
- key: failure-domain.beta.kubernetes.io/zone
values:
- us-east-1b
- us-east-1c
This is very similar to this question and this answer except that the issue described is on GCP and in this case it's AWS.
In this case, you should check the availability zone of your worker nodes (EC2 instances).
As a Example :
worker node 1 = eu-central-1b
worker node 2 = eu-central-1c
Then create the volume in including one of an availability zone which mentioned above(do not create the volume with eu-central-1a).
after you create the volume, create your PersistentVolume and PersistentVolumeClaim by attaching a newly created volume to your cluster like below.
apiVersion: v1
kind: PersistentVolume
metadata:
labels:
failure-domain.beta.kubernetes.io/region: eu-central-1
failure-domain.beta.kubernetes.io/zone: eu-central-1b
name: mongo-pv
namespace: default
spec:
accessModes:
- ReadWriteOnce
capacity:
storage: 100Gi
awsElasticBlockStore:
fsType: ext4
volumeID: aws://eu-central-1b/vol-063342ab9be5d2929
storageClassName: gp2
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: mongo-pvc
namespace: default
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 100Gi
storageClassName: gp2
volumeName: mongo-pv