EKS: node(s) had volume node affinity conflict - kubernetes

I am trying to deploy a application on EKS cluster version 1.23. When I executed the files, my deployment stuck in pending state. I describe the pod and find below error.
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal NotTriggerScaleUp 2m55s cluster-autoscaler pod didn't trigger scale-up: 2 node(s) had volume node affinity conflict
Warning FailedScheduling 57s (x3 over 3m3s) default-scheduler 0/15 nodes are available: 1 node(s) were unschedulable, 14 node(s) had volume node affinity conflict.
I also followed Kubernetes Pod Warning: 1 node(s) had volume node affinity conflict this but still no luck.
The PVC file is:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: ankit-eks-discovery-pvc
namespace: ankit-eks
spec:
storageClassName: ankit-eks-discovery
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1024M
And the storage class used for that is:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: ankit-eks-discovery
namespace: ankit-eks
#volumeBindingMode: WaitForFirstConsumer
provisioner: kubernetes.io/aws-ebs
reclaimPolicy: Delete
allowVolumeExpansion: true
parameters:
fsType: ext4
type: gp2
allowedTopologies:
- matchLabelExpressions:
- key: failure-domain.beta.kubernetes.io/zone
values:
- us-east-1a
- us-east-1b
- us-east-1c
- us-east-1d
The PV description is:
Name: pvc-c9f6d0d3-0348-4ff4-8d9f-e01af1996e60
Labels: topology.kubernetes.io/region=us-east-1
topology.kubernetes.io/zone=us-east-1a
Annotations: pv.kubernetes.io/migrated-to: ebs.csi.aws.com
pv.kubernetes.io/provisioned-by: kubernetes.io/aws-ebs
volume.kubernetes.io/provisioner-deletion-secret-name:
volume.kubernetes.io/provisioner-deletion-secret-namespace:
Finalizers: [kubernetes.io/pv-protection]
StorageClass: ankit-eks-discovery
Status: Bound
Claim: ankit-eks/ankit-eks-discovery-pvc
Reclaim Policy: Delete
Access Modes: RWO
VolumeMode: Filesystem
Capacity: 1Gi
Node Affinity:
Required Terms:
Term 0: topology.kubernetes.io/zone in [us-east-1a]
topology.kubernetes.io/region in [us-east-1]
Message:
Source:
Type: AWSElasticBlockStore (a Persistent Disk resource in AWS)
VolumeID: vol-0eb1d80b2882356b2
FSType: ext4
Partition: 0
ReadOnly: false
Events: <none>
I tried deleting and recreating deployment, PVC and storage class but no luck.
I also checked the labels of my nodes. They are also correct.
[ankit#ankit]$ kubectl describe no ip-10-211-26-94.ec2.internal
Name: ip-10-211-26-94.ec2.internal
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/instance-type=m5d.large
beta.kubernetes.io/os=linux
eks.amazonaws.com/capacityType=ON_DEMAND
eks.amazonaws.com/nodegroup=ankit-nodes
eks.amazonaws.com/nodegroup-image=ami-0eb3216fe26784e21
failure-domain.beta.kubernetes.io/region=us-east-1
failure-domain.beta.kubernetes.io/zone=us-east-1b
k8s.io/cloud-provider-aws=b69ac44d98ef071c695017c202bde456
kubernetes.io/arch=amd64
kubernetes.io/hostname=ip-10-211-26-94.ec2.internal
kubernetes.io/os=linux
node.kubernetes.io/instance-type=m5d.large
topology.ebs.csi.aws.com/zone=us-east-1b
topology.kubernetes.io/region=us-east-1
topology.kubernetes.io/zone=us-east-1b
Annotations: alpha.kubernetes.io/provided-node-ip: 10.211.26.94
csi.volume.kubernetes.io/nodeid: {"ebs.csi.aws.com":"i-068a872a874b02642"}
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
I don't know what I am doing wrong. Can anyone help here.

Related

Pod does not find available persistent volume to bind

I want to deploy Traefik as a "Statefulset" to persist access and error logs on disk. Therefore I moved from the default "Deployment" to "Statefulset" and created a PV for each node. After defining the "volumeClaimTemplates" in the "Statefulset" the pod returns following error:
Warning FailedScheduling 98s default-scheduler 0/3 nodes are
available: 1 node(s) had untolerated taint
{node-role.kubernetes.io/control-plane: }, 3 node(s) didn't find
available persistent volumes to bind. preemption: 0/3 nodes are
available: 3 Preemption is not helpful for scheduling.
My PV + Storageclass definition:
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
{% for i in range(0,groups['k8s-workers'] | length ) %}
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: traefik-logs-{{ "%02d"|format(i+1) }}
spec:
persistentVolumeReclaimPolicy: Retain
capacity:
storage: 50Gi
accessModes:
- ReadWriteOnce
volumeMode: Filesystem
local:
path: "/mnt/traefik"
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- k8s-worker-{{ "%02d"|format(i+1) }}
{% endfor %}
And finally the Statefulset:
---
kind: StatefulSet
apiVersion: apps/v1
metadata:
name: traefik-deployment
namespace: traefik
labels:
app: traefik
spec:
replicas: 2
serviceName: traefik
selector:
matchLabels:
app: traefik
template:
metadata:
labels:
app: traefik
spec:
serviceAccountName: traefik-account
terminationGracePeriodSeconds: 60
containers:
- name: traefik
image: traefik:v2.8
args:
- ...
ports:
- ...
volumeMounts:
- mountPath: "/var/log"
name: traefik-logs
volumeClaimTemplates:
- metadata:
name: traefik-logs
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 30Gi
storageClassName: local-storage
The PV's are created:
root#k8s-master-01:/opt/traefik# kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
elasticsearch-data-master-pv-01 120Gi RWO Retain Bound elasticsearch/elasticsearch-data-elasticsearch-es-master-0 elastic-local-master-storage 4h19m
elasticsearch-data-worker-pv-01 120Gi RWO Retain Bound elasticsearch/elasticsearch-data-elasticsearch-es-worker-0 elastic-local-worker-storage 4h19m
elasticsearch-data-worker-pv-02 120Gi RWO Retain Bound elasticsearch/elasticsearch-data-elasticsearch-es-worker-1 elastic-local-worker-storage 4h19m
traefik-logs-01 50Gi RWO Retain Available 82m
traefik-logs-02 50Gi RWO Retain Available 82m
The PVC is stuck at pending:
root#k8s-master-01:/opt/traefik# kubectl get pvc -A
NAMESPACE NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
elasticsearch elasticsearch-data-elasticsearch-es-master-0 Bound elasticsearch-data-master-pv-01 120Gi RWO elastic-local-master-storage 4h19m
elasticsearch elasticsearch-data-elasticsearch-es-worker-0 Bound elasticsearch-data-worker-pv-01 120Gi RWO elastic-local-worker-storage 4h19m
elasticsearch elasticsearch-data-elasticsearch-es-worker-1 Bound elasticsearch-data-worker-pv-02 120Gi RWO elastic-local-worker-storage 4h19m
traefik traefik-logs-traefik-deployment-0 Pending local-storage 56m
What have I misunderstood about provisioning persistent storage for a pod?

Cannot get Pod to bind local-storage in minikube. "node(s) didn't find available persistent volumes", "waiting for first consumer to be created"

I'm having some trouble configuring my Kubernetes deployment on minikube use local-storage. I'm trying to set up a rethinkdb instance that will mount a directory from the minikube VM to the rethinkdb Pod. My setup is the following
Storage
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: rethinkdb-pv
spec:
capacity:
storage: 10Gi
volumeMode: Filesystem
accessModes:
- ReadWriteOnce
persistentVolumeReclaimPolicy: Delete
storageClassName: local-storage
local:
path: /mnt/data
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- minikube
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: rethinkdb-pv-claim
spec:
storageClassName: local-storage
accessModes:
- ReadWriteMany
resources:
requests:
storage: 5Gi
So I define a storageClass of local-storage type as described online in the tutorials. I then make a PersistentVolume that asks for 10GB of storage from the /mnt/data path on the underlying host. I have made this directory on the minikube VM
$ minikube ssh
$ ls /mnt
data sda1
This PersistentVolume has the storage class of local-storage and requests it from nodes matching the nodeAffinity section of hostname in 'minikube'.
I then make a PersistentVolumeClaim that asks for the type local-storage and requests 5GB.
Everything is good here, right? Here is the output of kubectl
$ kubectl get pv,pvc,storageClass
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
persistentvolume/rethinkdb-pv 10Gi RWO Delete Available local-storage 9m33s
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/rethinkdb-pv-claim Pending local-storage 7m51s
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
storageclass.storage.k8s.io/local-storage kubernetes.io/no-provisioner Delete WaitForFirstConsumer false 9m33s
storageclass.storage.k8s.io/standard (default) k8s.io/minikube-hostpath Delete Immediate false 24h
RethinkDB Deployment
I now attempt to make a Deployment with a single replica of the standard RethinkDB container.
apiVersion: apps/v1
kind: Deployment
metadata:
creationTimestamp: null
labels:
name: database
name: rethinkdb
spec:
progressDeadlineSeconds: 2147483647
replicas: 1
selector:
matchLabels:
service: rethinkdb
template:
metadata:
creationTimestamp: null
labels:
service: rethinkdb
spec:
containers:
- name: rethinkdb
image: rethinkdb:latest
volumeMounts:
- mountPath: /data
name: rethinkdb-data
dnsPolicy: ClusterFirst
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
terminationGracePeriodSeconds: 30
volumes:
- name: rethinkdb-data
persistentVolumeClaim:
claimName: rethinkdb-pv-claim
This asks for a single replica of rethinkdb and it tries to mount the rethinkdb-pv-claim Persistent Volume Claim as the name rethinkdb-data and then attempts to mount that at /data in the container.
This is what shows, though
Name: rethinkdb-6dbf4ccdb-64gk5
Namespace: development
Priority: 0
Node: <none>
Labels: pod-template-hash=6dbf4ccdb
service=rethinkdb
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/rethinkdb-6dbf4ccdb
Containers:
rethinkdb:
Image: rethinkdb:latest
Port: <none>
Host Port: <none>
Environment: <none>
Mounts:
/data from rethinkdb-data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-d5ncp (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
rethinkdb-data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: rethinkdb-pv-claim
ReadOnly: false
default-token-d5ncp:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-d5ncp
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 73s (x7 over 8m38s) default-scheduler 0/1 nodes are available: 1 node(s) didn't find available persistent volumes to bind.
"1 node(s) didn't find available persistent volumes to bind". I'm not sure how that is because the PVC is available.
$ kubectl describe pvc
Name: rethinkdb-pv-claim
Namespace: development
StorageClass: local-storage
Status: Pending
Volume:
Labels: <none>
Annotations: Finalizers: [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode: Filesystem
Mounted By: rethinkdb-6dbf4ccdb-64gk5
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal WaitForFirstConsumer 11s (x42 over 10m) persistentvolume-controller waiting for first consumer to be created before binding
I think one hint is that the field
Nodes <null> for the Pod - does that mean it isn't assigned to a node?
I think the issue is that one of mine was ReadWriteOnce and the other one was ReadWriteMany, then I had trouble getting permissions right when running minikube mount /tmp/data:/mnt/data so I just got rid of mounting it to the underlying filesystem and now it works

k8s - Cinder "0/x nodes are available: x node(s) had volume node affinity conflict"

I have my own cluster k8s. I'm trying to link the cluster to openstack / cinder.
When I'm creating a PVC, I can a see a PV in k8s and the volume in Openstack.
But when I'm linking a pod with the PVC, I have the message k8s - Cinder "0/x nodes are available: x node(s) had volume node affinity conflict".
My yml test:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: classic
provisioner: kubernetes.io/cinder
parameters:
type: classic
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: pvc-infra-consuldata4
namespace: infra
spec:
storageClassName: classic
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: consul
namespace: infra
labels:
app: consul
spec:
replicas: 1
selector:
matchLabels:
app: consul
template:
metadata:
labels:
app: consul
spec:
containers:
- name: consul
image: consul:1.4.3
volumeMounts:
- name: data
mountPath: /consul
resources:
requests:
cpu: 100m
limits:
cpu: 500m
command: ["consul", "agent", "-server", "-bootstrap", "-ui", "-bind", "0.0.0.0", "-client", "0.0.0.0", "-data-dir", "/consul"]
volumes:
- name: data
persistentVolumeClaim:
claimName: pvc-infra-consuldata4
The result:
kpro describe pvc -n infra
Name: pvc-infra-consuldata4
Namespace: infra
StorageClass: classic
Status: Bound
Volume: pvc-76bfdaf1-40bb-11e9-98de-fa163e53311c
Labels:
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"v1","kind":"PersistentVolumeClaim","metadata":{"annotations":{},"name":"pvc-infra-consuldata4","namespace":"infra"},"spec":...
pv.kubernetes.io/bind-completed: yes
pv.kubernetes.io/bound-by-controller: yes
volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/cinder
Finalizers: [kubernetes.io/pvc-protection]
Capacity: 1Gi
Access Modes: RWO
VolumeMode: Filesystem
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ProvisioningSucceeded 61s persistentvolume-controller Successfully provisioned volume pvc-76bfdaf1-40bb-11e9-98de-fa163e53311c using kubernetes.io/cinder
Mounted By: consul-85684dd7fc-j84v7
kpro describe po -n infra consul-85684dd7fc-j84v7
Name: consul-85684dd7fc-j84v7
Namespace: infra
Priority: 0
PriorityClassName: <none>
Node: <none>
Labels: app=consul
pod-template-hash=85684dd7fc
Annotations: <none>
Status: Pending
IP:
Controlled By: ReplicaSet/consul-85684dd7fc
Containers:
consul:
Image: consul:1.4.3
Port: <none>
Host Port: <none>
Command:
consul
agent
-server
-bootstrap
-ui
-bind
0.0.0.0
-client
0.0.0.0
-data-dir
/consul
Limits:
cpu: 2
Requests:
cpu: 500m
Environment: <none>
Mounts:
/consul from data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-nxchv (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: pvc-infra-consuldata4
ReadOnly: false
default-token-nxchv:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-nxchv
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 36s (x6 over 2m40s) default-scheduler 0/6 nodes are available: 6 node(s) had volume node affinity conflict.
Why K8s successful create the Cinder volume, but it can't schedule the pod ?
Try finding out the nodeAffinity of your persistent volume:
$ kubctl describe pv pvc-76bfdaf1-40bb-11e9-98de-fa163e53311c
Node Affinity:
Required Terms:
Term 0: kubernetes.io/hostname in [xxx]
Then try to figure out if xxx matches the node label yyy that your pod is supposed to run on:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
yyy Ready worker 8d v1.15.3
If they don't match, you will have the "x node(s) had volume node affinity conflict" error, and you need to re-create the persistent volume with the correct nodeAffinity configuration.
I also ran into this issue when I forgot to deploy the EBS CSI driver before I tried to get my pod to connect to it.
kubectl apply -k "github.com/kubernetes-sigs/aws-ebs-csi-driver/deploy/kubernetes/overlays/stable/?ref=master"

JupyterHub Hub Pod stuck in pending - pod has unbound immediate PersistentVolumeClaims

I am having issues deploying juypterhub on kubernetes cluster. The issue I am getting is that the hub pod is stuck in pending.
Stack:
kubeadm
flannel
weave
helm
jupyterhub
Runbook:
$kubeadm init --pod-network-cidr="10.244.0.0/16"
$sudo cp /etc/kubernetes/admin.conf $HOME/ && sudo chown $(id -u):$(id -g) $HOME/admin.conf && export KUBECONFIG=$HOME/admin.conf
$kubectl create -f pvc.yml
$kubectl create -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel-aliyun.yml
$kubectl apply --filename https://git.io/weave-kube-1.6
$kubectl taint nodes --all node-role.kubernetes.io/master-
Helm installations as per https://zero-to-jupyterhub.readthedocs.io/en/latest/setup-helm.html
Jupyter installations as per https://zero-to-jupyterhub.readthedocs.io/en/latest/setup-jupyterhub.html
config.yml
proxy:
secretToken: "asdf"
singleuser:
storage:
dynamic:
storageClass: local-storage
pvc.yml
apiVersion: v1
kind: PersistentVolume
metadata:
name: standard
spec:
capacity:
storage: 100Gi
# volumeMode field requires BlockVolume Alpha feature gate to be enabled.
volumeMode: Filesystem
accessModes:
- ReadWriteOnce
persistentVolumeReclaimPolicy: Delete
storageClassName: local-storage
local:
path: /dev/vdb
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- example-node
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: standard
spec:
storageClassName: local-storage
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 3Gi
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
The warning is:
$kubectl --namespace=jhub get pod
NAME READY STATUS RESTARTS AGE
hub-fb48dfc4f-mqf4c 0/1 Pending 0 3m33s
proxy-86977cf9f7-fqf8d 1/1 Running 0 3m33s
$kubectl --namespace=jhub describe pod hub
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 35s (x3 over 35s) default-scheduler pod has unbound immediate PersistentVolumeClaims
$kubectl --namespace=jhub describe pv
Name: standard
Labels: type=local
Annotations: pv.kubernetes.io/bound-by-controller: yes
Finalizers: [kubernetes.io/pv-protection]
StorageClass: manual
Status: Bound
Claim: default/standard
Reclaim Policy: Retain
Access Modes: RWO
VolumeMode: Filesystem
Capacity: 10Gi
Node Affinity: <none>
Message:
Source:
Type: HostPath (bare host directory volume)
Path: /dev/vdb
HostPathType:
Events: <none>
$kubectl --namespace=kube-system describe pvc
Name: hub-db-dir
Namespace: jhub
StorageClass:
Status: Pending
Volume:
Labels: app=jupyterhub
chart=jupyterhub-0.8.0-beta.1
component=hub
heritage=Tiller
release=jhub
Annotations: <none>
Finalizers: [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode: Filesystem
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal FailedBinding 13s (x7 over 85s) persistentvolume-controller no persistent volumes available for this claim and no storage class is set
Mounted By: hub-fb48dfc4f-mqf4c
I tried my best to follow the localstorage volume configuration on the official kubernetes website, but with no luck
-G
Managed to fix it using the following configuration.
Key points:
- I forgot to add the node in nodeAffinity
- it works without putting in volumeBindingMode
apiVersion: v1
kind: PersistentVolume
metadata:
name: standard
spec:
capacity:
storage: 2Gi
# volumeMode field requires BlockVolume Alpha feature gate to be enabled.
volumeMode: Filesystem
accessModes:
- ReadWriteOnce
persistentVolumeReclaimPolicy: Retain
storageClassName: local-storage
local:
path: /temp
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- INSERT_NODE_NAME_HERE
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
annotations:
storageclass.kubernetes.io/is-default-class: "true"
name: local-storage
provisioner: kubernetes.io/no-provisioner
config.yaml
proxy:
secretToken: "token"
singleuser:
storage:
dynamic:
storageClass: local-storage
make sure your storage/pv looks like this:
root#asdf:~# kubectl --namespace=kube-system describe pv
Name: standard
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"v1","kind":"PersistentVolume","metadata":{"annotations":{},"name":"standard"},"spec":{"accessModes":["ReadWriteOnce"],"capa...
pv.kubernetes.io/bound-by-controller: yes
Finalizers: [kubernetes.io/pv-protection]
StorageClass: local-storage
Status: Bound
Claim: jhub/hub-db-dir
Reclaim Policy: Retain
Access Modes: RWO
VolumeMode: Filesystem
Capacity: 2Gi
Node Affinity:
Required Terms:
Term 0: kubernetes.io/hostname in [asdf]
Message:
Source:
Type: LocalVolume (a persistent volume backed by local storage on a node)
Path: /temp
Events: <none>
root#asdf:~# kubectl --namespace=kube-system describe storageclass
Name: local-storage
IsDefaultClass: Yes
Annotations: storageclass.kubernetes.io/is-default-class=true
Provisioner: kubernetes.io/no-provisioner
Parameters: <none>
AllowVolumeExpansion: <unset>
MountOptions: <none>
ReclaimPolicy: Delete
VolumeBindingMode: Immediate
Events: <none>
Now the hub pod looks something like this:
root#asdf:~# kubectl --namespace=jhub describe pod hub
Name: hub-5d4fcd8fd9-p6crs
Namespace: jhub
Priority: 0
PriorityClassName: <none>
Node: asdf/192.168.0.87
Start Time: Sat, 23 Feb 2019 14:29:51 +0800
Labels: app=jupyterhub
component=hub
hub.jupyter.org/network-access-proxy-api=true
hub.jupyter.org/network-access-proxy-http=true
hub.jupyter.org/network-access-singleuser=true
pod-template-hash=5d4fcd8fd9
release=jhub
Annotations: checksum/config-map: --omitted
checksum/secret: --omitted--
Status: Running
IP: 10.244.0.55
Controlled By: ReplicaSet/hub-5d4fcd8fd9
Containers:
hub:
Container ID: docker://d2d4dec8cc16fe21589e67f1c0c6c6114b59b01c67a9f06391830a1ea711879d
Image: jupyterhub/k8s-hub:0.8.0
Image ID: docker-pullable://jupyterhub/k8s-hub#sha256:e40cfda4f305af1a2fdf759cd0dcda834944bef0095c8b5ecb7734d19f58b512
Port: 8081/TCP
Host Port: 0/TCP
Command:
jupyterhub
--config
/srv/jupyterhub_config.py
--upgrade-db
State: Running
Started: Sat, 23 Feb 2019 14:30:28 +0800
Ready: True
Restart Count: 0
Requests:
cpu: 200m
memory: 512Mi
Environment:
PYTHONUNBUFFERED: 1
HELM_RELEASE_NAME: jhub
POD_NAMESPACE: jhub (v1:metadata.namespace)
CONFIGPROXY_AUTH_TOKEN: <set to the key 'proxy.token' in secret 'hub-secret'> Optional: false
Mounts:
/etc/jupyterhub/config/ from config (rw)
/etc/jupyterhub/secret/ from secret (rw)
/srv/jupyterhub from hub-db-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from hub-token-bxzl7 (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: hub-config
Optional: false
secret:
Type: Secret (a volume populated by a Secret)
SecretName: hub-secret
Optional: false
hub-db-dir:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: hub-db-dir
ReadOnly: false
hub-token-bxzl7:
Type: Secret (a volume populated by a Secret)
SecretName: hub-token-bxzl7
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events: <none>

Kubernetes pod pending when a new volume is attached (EKS)

Let me describe my scenario:
TL;DR
When I create a deployment on Kubernetes with 1 attached volume, everything works perfectly. When I create the same deployment, but with a second volume attached (total: 2 volumes), the pod gets stuck on "Pending" with errors:
pod has unbound PersistentVolumeClaims (repeated 2 times)
0/2 nodes are available: 2 node(s) had no available volume zone.
Already checked that the volumes are created in the correct availability zones.
Detailed description
I have a cluster set up using Amazon EKS, with 2 nodes. I have the following default storage class:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: gp2
annotations:
storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs
parameters:
type: gp2
reclaimPolicy: Retain
mountOptions:
- debug
And I have a mongodb deployment which needs two volumes, one mounted on /data/db folder, and the other mounted in some random directory I need. Here is an minimal yaml used to create the three components (I commented some lines on purpose):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
namespace: my-project
creationTimestamp: null
labels:
io.kompose.service: my-project-db-claim0
name: my-project-db-claim0
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 5Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
namespace: my-project
creationTimestamp: null
labels:
io.kompose.service: my-project-db-claim1
name: my-project-db-claim1
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
namespace: my-project
name: my-project-db
spec:
replicas: 1
strategy:
type: Recreate
template:
metadata:
labels:
name: my-db
spec:
containers:
- name: my-project-db-container
image: mongo
imagePullPolicy: Always
resources: {}
volumeMounts:
- mountPath: /my_dir
name: my-project-db-claim0
# - mountPath: /data/db
# name: my-project-db-claim1
ports:
- containerPort: 27017
restartPolicy: Always
volumes:
- name: my-project-db-claim0
persistentVolumeClaim:
claimName: my-project-db-claim0
# - name: my-project-db-claim1
# persistentVolumeClaim:
# claimName: my-project-db-claim1
That yaml works perfectly. The output for the volumes is:
$ kubectl describe pv
Name: pvc-307b755a-039e-11e9-b78d-0a68bcb24bc6
Labels: failure-domain.beta.kubernetes.io/region=us-east-1
failure-domain.beta.kubernetes.io/zone=us-east-1c
Annotations: kubernetes.io/createdby: aws-ebs-dynamic-provisioner
pv.kubernetes.io/bound-by-controller: yes
pv.kubernetes.io/provisioned-by: kubernetes.io/aws-ebs
Finalizers: [kubernetes.io/pv-protection]
StorageClass: gp2
Status: Bound
Claim: my-project/my-project-db-claim0
Reclaim Policy: Delete
Access Modes: RWO
Capacity: 5Gi
Node Affinity: <none>
Message:
Source:
Type: AWSElasticBlockStore (a Persistent Disk resource in AWS)
VolumeID: aws://us-east-1c/vol-xxxxx
FSType: ext4
Partition: 0
ReadOnly: false
Events: <none>
Name: pvc-308d8979-039e-11e9-b78d-0a68bcb24bc6
Labels: failure-domain.beta.kubernetes.io/region=us-east-1
failure-domain.beta.kubernetes.io/zone=us-east-1b
Annotations: kubernetes.io/createdby: aws-ebs-dynamic-provisioner
pv.kubernetes.io/bound-by-controller: yes
pv.kubernetes.io/provisioned-by: kubernetes.io/aws-ebs
Finalizers: [kubernetes.io/pv-protection]
StorageClass: gp2
Status: Bound
Claim: my-project/my-project-db-claim1
Reclaim Policy: Delete
Access Modes: RWO
Capacity: 10Gi
Node Affinity: <none>
Message:
Source:
Type: AWSElasticBlockStore (a Persistent Disk resource in AWS)
VolumeID: aws://us-east-1b/vol-xxxxx
FSType: ext4
Partition: 0
ReadOnly: false
Events: <none>
And the pod output:
$ kubectl describe pods
Name: my-project-db-7d48567b48-slncd
Namespace: my-project
Priority: 0
PriorityClassName: <none>
Node: ip-192-168-212-194.ec2.internal/192.168.212.194
Start Time: Wed, 19 Dec 2018 15:55:58 +0100
Labels: name=my-db
pod-template-hash=3804123604
Annotations: <none>
Status: Running
IP: 192.168.216.33
Controlled By: ReplicaSet/my-project-db-7d48567b48
Containers:
my-project-db-container:
Container ID: docker://cf8222f15e395b02805c628b6addde2d77de2245aed9406a48c7c6f4dccefd4e
Image: mongo
Image ID: docker-pullable://mongo#sha256:0823cc2000223420f88b20d5e19e6bc252fa328c30d8261070e4645b02183c6a
Port: 27017/TCP
Host Port: 0/TCP
State: Running
Started: Wed, 19 Dec 2018 15:56:42 +0100
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/my_dir from my-project-db-claim0 (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-pf9ks (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
my-project-db-claim0:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: my-project-db-claim0
ReadOnly: false
default-token-pf9ks:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-pf9ks
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 7m22s (x5 over 7m23s) default-scheduler pod has unbound PersistentVolumeClaims (repeated 2 times)
Normal Scheduled 7m21s default-scheduler Successfully assigned my-project/my-project-db-7d48567b48-slncd to ip-192-168-212-194.ec2.internal
Normal SuccessfulMountVolume 7m21s kubelet, ip-192-168-212-194.ec2.internal MountVolume.SetUp succeeded for volume "default-token-pf9ks"
Warning FailedAttachVolume 7m13s (x5 over 7m21s) attachdetach-controller AttachVolume.Attach failed for volume "pvc-307b755a-039e-11e9-b78d-0a68bcb24bc6" : "Error attaching EBS volume \"vol-01a863d0aa7c7e342\"" to instance "i-0a7dafbbdfeabc50b" since volume is in "creating" state
Normal SuccessfulAttachVolume 7m1s attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-307b755a-039e-11e9-b78d-0a68bcb24bc6"
Normal SuccessfulMountVolume 6m48s kubelet, ip-192-168-212-194.ec2.internal MountVolume.SetUp succeeded for volume "pvc-307b755a-039e-11e9-b78d-0a68bcb24bc6"
Normal Pulling 6m48s kubelet, ip-192-168-212-194.ec2.internal pulling image "mongo"
Normal Pulled 6m39s kubelet, ip-192-168-212-194.ec2.internal Successfully pulled image "mongo"
Normal Created 6m38s kubelet, ip-192-168-212-194.ec2.internal Created container
Normal Started 6m37s kubelet, ip-192-168-212-194.ec2.internal Started container
Everything is created without any problems. But if I uncomment the lines in the yaml so two volumes are attached to the db deployment, the pv output is the same as earlier, but the pod gets stuck on pending with the following output:
$ kubectl describe pods
Name: my-project-db-b8b8d8bcb-l64d7
Namespace: my-project
Priority: 0
PriorityClassName: <none>
Node: <none>
Labels: name=my-db
pod-template-hash=646484676
Annotations: <none>
Status: Pending
IP:
Controlled By: ReplicaSet/my-project-db-b8b8d8bcb
Containers:
my-project-db-container:
Image: mongo
Port: 27017/TCP
Host Port: 0/TCP
Environment: <none>
Mounts:
/data/db from my-project-db-claim1 (rw)
/my_dir from my-project-db-claim0 (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-pf9ks (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
my-project-db-claim0:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: my-project-db-claim0
ReadOnly: false
my-project-db-claim1:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: my-project-db-claim1
ReadOnly: false
default-token-pf9ks:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-pf9ks
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 60s (x5 over 60s) default-scheduler pod has unbound PersistentVolumeClaims (repeated 2 times)
Warning FailedScheduling 2s (x16 over 59s) default-scheduler 0/2 nodes are available: 2 node(s) had no available volume zone.
I've already read these two issues:
Dynamic volume provisioning creates EBS volume in the wrong availability zone
PersistentVolume on EBS can be created in availability zones with no nodes (Closed)
But I already checked that the volumes are created in the same zones as the cluster nodes instances. In fact, EKS creates two EBS by default in us-east-1b and us-east-1c zones and those volumes works. The volumes created by the posted yaml are on those regions too.
See this article: https://kubernetes.io/blog/2018/10/11/topology-aware-volume-provisioning-in-kubernetes/
The gist is that you want to update your storageclass to include:
volumeBindingMode: WaitForFirstConsumer
This causes the PV to not be created until the pod is scheduled. It fixed a similar problem for me.
Sounds like it's trying to create a volume in an availability zone where you don't have any volumes on. You can try restricting your StorageClass to the availability zones where you have nodes.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: gp2
annotations:
storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs
parameters:
type: gp2
reclaimPolicy: Retain
mountOptions:
- debug
allowedTopologies:
- matchLabelExpressions:
- key: failure-domain.beta.kubernetes.io/zone
values:
- us-east-1b
- us-east-1c
This is very similar to this question and this answer except that the issue described is on GCP and in this case it's AWS.
In this case, you should check the availability zone of your worker nodes (EC2 instances).
As a Example :
worker node 1 = eu-central-1b
worker node 2 = eu-central-1c
Then create the volume in including one of an availability zone which mentioned above(do not create the volume with eu-central-1a).
after you create the volume, create your PersistentVolume and PersistentVolumeClaim by attaching a newly created volume to your cluster like below.
apiVersion: v1
kind: PersistentVolume
metadata:
labels:
failure-domain.beta.kubernetes.io/region: eu-central-1
failure-domain.beta.kubernetes.io/zone: eu-central-1b
name: mongo-pv
namespace: default
spec:
accessModes:
- ReadWriteOnce
capacity:
storage: 100Gi
awsElasticBlockStore:
fsType: ext4
volumeID: aws://eu-central-1b/vol-063342ab9be5d2929
storageClassName: gp2
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: mongo-pvc
namespace: default
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 100Gi
storageClassName: gp2
volumeName: mongo-pv