I have been trying to run kafka/zookeeper on Kubernetes. Using helm charts I am able to install zookeeper on the cluster. However the ZK pods are stuck in pending state. When I issued describe on one of the pod "didn't find available persistent volumes to bind, 1 node(s) had taints that the pod didn't tolerate." was the reason for scheduling failure. But when I issue describe on PVC , I am getting "waiting for first consumer to be created before binding". I tried to re-spawn the whole cluster but the result is same. Trying to use https://kubernetes.io/blog/2018/04/13/local-persistent-volumes-beta/ as guide.
Can someone please guide me here ?
kubectl get pods -n zoo-keeper
kubectl get pods -n zoo-keeper
NAME READY STATUS RESTARTS AGE
zoo-keeper-zk-0 0/1 Pending 0 20m
zoo-keeper-zk-1 0/1 Pending 0 20m
zoo-keeper-zk-2 0/1 Pending 0 20m
kubectl get sc
kubectl get sc
NAME PROVISIONER AGE
local-storage kubernetes.io/no-provisioner 25m
kubectl describe sc
kubectl describe sc
Name: local-storage
IsDefaultClass: No
Annotations: kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"storage.k8s.io/v1","kind":"StorageClass","metadata":{"annotations":{},"name":"local-storage"},"provisioner":"kubernetes.io/no-provisioner","volumeBindingMode":"WaitForFirstConsumer"}
Provisioner: kubernetes.io/no-provisioner
Parameters: <none>
AllowVolumeExpansion: <unset>
MountOptions: <none>
ReclaimPolicy: Delete
VolumeBindingMode: WaitForFirstConsumer
Events: <none>
kubectl describe pod foob-zookeeper-0 -n zoo-keeper
ubuntu#kmaster:~$ kubectl describe pod foob-zookeeper-0 -n zoo-keeper
Name: foob-zookeeper-0
Namespace: zoo-keeper
Priority: 0
PriorityClassName: <none>
Node: <none>
Labels: app=foob-zookeeper
app.kubernetes.io/instance=data-coord
app.kubernetes.io/managed-by=Tiller
app.kubernetes.io/name=foob-zookeeper
app.kubernetes.io/version=foob-zookeeper-9.1.0-15
controller-revision-hash=foob-zookeeper-5321f8ff5
release=data-coord
statefulset.kubernetes.io/pod-name=foob-zookeeper-0
Annotations: foobar.com/product-name: zoo-keeper ZK
foobar.com/product-revision: ABC
Status: Pending
IP:
Controlled By: StatefulSet/foob-zookeeper
Containers:
foob-zookeeper:
Image: repo.data.foobar.se/latest/zookeeper-3.4.10:1.6.0-15
Ports: 2181/TCP, 2888/TCP, 3888/TCP, 10007/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP
Limits:
cpu: 2
memory: 4Gi
Requests:
cpu: 1
memory: 2Gi
Liveness: exec [zkOk.sh] delay=15s timeout=5s period=10s #success=1 #failure=3
Readiness: tcp-socket :2181 delay=15s timeout=5s period=10s #success=1 #failure=3
Environment:
ZK_REPLICAS: 3
ZK_HEAP_SIZE: 1G
ZK_TICK_TIME: 2000
ZK_INIT_LIMIT: 10
ZK_SYNC_LIMIT: 5
ZK_MAX_CLIENT_CNXNS: 60
ZK_SNAP_RETAIN_COUNT: 3
ZK_PURGE_INTERVAL: 1
ZK_LOG_LEVEL: INFO
ZK_CLIENT_PORT: 2181
ZK_SERVER_PORT: 2888
ZK_ELECTION_PORT: 3888
JMXPORT: 10007
Mounts:
/var/lib/zookeeper from datadir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-nfcfx (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
datadir:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: datadir-foob-zookeeper-0
ReadOnly: false
default-token-nfcfx:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-nfcfx
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 69s (x4 over 3m50s) default-scheduler 0/2 nodes are available: 1 node(s) didn't find available persistent volumes to bind, 1 node(s) had taints that the pod didn't tolerate.
kubectl get pv
ubuntu#kmaster:~$ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
local-pv 50Gi RWO Retain Available local-storage 10m
ubuntu#kmaster:~$
kubectl get pvc local-claim
ubuntu#kmaster:~$ kubectl get pvc local-claim
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
local-claim Pending local-storage 8m9s
ubuntu#kmaster:~$
kubectl describe pvc local-claim
ubuntu#kmaster:~$ kubectl describe pvc local-claim
Name: local-claim
Namespace: default
StorageClass: local-storage
Status: Pending
Volume:
Labels: <none>
Annotations: <none>
Finalizers: [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode: Filesystem
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal WaitForFirstConsumer 2m3s (x26 over 7m51s) persistentvolume-controller waiting for first consumer to be created before binding
Mounted By: <none>
MY PV files:
cat create-pv.yml
apiVersion: v1
kind: PersistentVolume
metadata:
name: local-pv
spec:
capacity:
storage: 50Gi
accessModes:
- ReadWriteOnce
persistentVolumeReclaimPolicy: Retain
storageClassName: local-storage
local:
path: /mnt/kafka-mount
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- kmaster
cat pvc.yml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: local-claim
spec:
accessModes:
- ReadWriteOnce
storageClassName: local-storage
resources:
requests:
storage: 50Gi
It looks like you created your PV on master node. By default master node is marked unschedulable by ordinary pods using so called taint. To be able to run some service on master node you have two options:
1) Add toleration to some service to allow it to run on master node:
tolerations:
- effect: NoSchedule
key: node-role.kubernetes.io/master
You may even specify that some service runs only on master node:
nodeSelector:
node-role.kubernetes.io/master: ""
2) You can remove taint from master node, so any pod can run on it. You should know that this is dangerous because can make your cluster very unstable.
kubectl taint nodes --all node-role.kubernetes.io/master-
Read more here and taints and tolerations: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/
Related
I'm having trouble getting my Kube-registry up and running on cephfs. I'm using rook to set this cluster up. As you can see, I'm having trouble attaching the volume. Any idea what would be causing this issue? any help is appreciated.
kube-registry.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: cephfs-pvc
namespace: kube-system
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 1Gi
storageClassName: rook-cephfs
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: kube-registry
namespace: kube-system
labels:
k8s-app: kube-registry
kubernetes.io/cluster-service: "true"
spec:
replicas: 3
selector:
matchLabels:
k8s-app: kube-registry
template:
metadata:
labels:
k8s-app: kube-registry
kubernetes.io/cluster-service: "true"
spec:
containers:
- name: registry
image: registry:2
imagePullPolicy: Always
resources:
limits:
cpu: 100m
memory: 100Mi
env:
# Configuration reference: https://docs.docker.com/registry/configuration/
- name: REGISTRY_HTTP_ADDR
value: :5000
- name: REGISTRY_HTTP_SECRET
value: "Ple4seCh4ngeThisN0tAVerySecretV4lue"
- name: REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY
value: /var/lib/registry
volumeMounts:
- name: image-store
mountPath: /var/lib/registry
ports:
- containerPort: 5000
name: registry
protocol: TCP
livenessProbe:
httpGet:
path: /
port: registry
readinessProbe:
httpGet:
path: /
port: registry
volumes:
- name: image-store
persistentVolumeClaim:
claimName: cephfs-pvc
readOnly: false
Storagelass.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: rook-cephfs
# Change "rook-ceph" provisioner prefix to match the operator namespace if needed
provisioner: rook-ceph.cephfs.csi.ceph.com
parameters:
# clusterID is the namespace where operator is deployed.
clusterID: rook-ceph
# CephFS filesystem name into which the volume shall be created
fsName: myfs
# Ceph pool into which the volume shall be created
# Required for provisionVolume: "true"
pool: myfs-data0
# Root path of an existing CephFS volume
# Required for provisionVolume: "false"
# rootPath: /absolute/path
# The secrets contain Ceph admin credentials. These are generated automatically by the operator
# in the same namespace as the cluster.
csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
reclaimPolicy: Deletea
kubectl describe pods --namespace=kube-system kube-registry-58659ff99b-j2b4d
Name: kube-registry-58659ff99b-j2b4d
Namespace: kube-system
Priority: 0
Node: minikube/192.168.99.212
Start Time: Wed, 25 Nov 2020 13:19:35 -0500
Labels: k8s-app=kube-registry
kubernetes.io/cluster-service=true
pod-template-hash=58659ff99b
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/kube-registry-58659ff99b
Containers:
registry:
Container ID:
Image: registry:2
Image ID:
Port: 5000/TCP
Host Port: 0/TCP
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Limits:
cpu: 100m
memory: 100Mi
Requests:
cpu: 100m
memory: 100Mi
Liveness: http-get http://:registry/ delay=0s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:registry/ delay=0s timeout=1s period=10s #success=1 #failure=3
Environment:
REGISTRY_HTTP_ADDR: :5000
REGISTRY_HTTP_SECRET: Ple4seCh4ngeThisN0tAVerySecretV4lue
REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY: /var/lib/registry
Mounts:
/var/lib/registry from image-store (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-nw4th (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
image-store:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: cephfs-pvc
ReadOnly: false
default-token-nw4th:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-nw4th
Optional: false
QoS Class: Guaranteed
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 13m (x3 over 13m) default-scheduler running "VolumeBinding" filter plugin for pod "kube-registry-58659ff99b-j2b4d": pod has unbound immediate PersistentVolumeClaims
Normal Scheduled 13m default-scheduler Successfully assigned kube-system/kube-registry-58659ff99b-j2b4d to minikube
Warning FailedMount 2m6s (x5 over 11m) kubelet, minikube Unable to attach or mount volumes: unmounted volumes=[image-store], unattached volumes=[image-store default-token-nw4th]: timed out waiting for the condition
Warning FailedAttachVolume 59s (x6 over 11m) attachdetach-controller AttachVolume.Attach failed for volume "pvc-6eeff481-eb0a-4269-84c7-e744c9d639d9" : attachdetachment timeout for volume 0001-0009-rook-c
ceph provisioner logs, I restarted my cluster so the name will be different but output is the same
I1127 18:27:19.370543 1 csi-provisioner.go:121] Version: v2.0.0
I1127 18:27:19.370948 1 csi-provisioner.go:135] Building kube configs for running in cluster...
I1127 18:27:19.429190 1 connection.go:153] Connecting to unix:///csi/csi-provisioner.sock
I1127 18:27:21.561133 1 common.go:111] Probing CSI driver for readiness
W1127 18:27:21.905396 1 metrics.go:142] metrics endpoint will not be started because `metrics-address` was not specified.
I1127 18:27:22.060963 1 leaderelection.go:243] attempting to acquire leader lease rook-ceph/rook-ceph-cephfs-csi-ceph-com...
I1127 18:27:22.122303 1 leaderelection.go:253] successfully acquired lease rook-ceph/rook-ceph-cephfs-csi-ceph-com
I1127 18:27:22.323990 1 controller.go:820] Starting provisioner controller rook-ceph.cephfs.csi.ceph.com_csi-cephfsplugin-provisioner-797b67c54b-42jwc_4e14295b-f73d-4b94-bae9-ff4f2639b487!
I1127 18:27:22.324061 1 clone_controller.go:66] Starting CloningProtection controller
I1127 18:27:22.324205 1 clone_controller.go:84] Started CloningProtection controller
I1127 18:27:22.325240 1 volume_store.go:97] Starting save volume queue
I1127 18:27:22.426790 1 controller.go:869] Started provisioner controller rook-ceph.cephfs.csi.ceph.com_csi-cephfsplugin-provisioner-797b67c54b-42jwc_4e14295b-f73d-4b94-bae9-ff4f2639b487!
I1127 19:08:39.850493 1 controller.go:1317] provision "kube-system/cephfs-pvc" class "rook-cephfs": started
I1127 19:08:39.851034 1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"kube-system", Name:"cephfs-pvc", UID:"7c47bda7-0c7b-4ca0-b6d0-19d717ef2e06", APIVersion:"v1", ResourceVersion:"7744", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "kube-system/cephfs-pvc"
I1127 19:08:43.670226 1 controller.go:1420] provision "kube-system/cephfs-pvc" class "rook-cephfs": volume "pvc-7c47bda7-0c7b-4ca0-b6d0-19d717ef2e06" provisioned
I1127 19:08:43.670262 1 controller.go:1437] provision "kube-system/cephfs-pvc" class "rook-cephfs": succeeded
E1127 19:08:43.692108 1 controller.go:1443] couldn't create key for object pvc-7c47bda7-0c7b-4ca0-b6d0-19d717ef2e06: object has no meta: object does not implement the Object interfaces
I1127 19:08:43.692189 1 controller.go:1317] provision "kube-system/cephfs-pvc" class "rook-cephfs": started
I1127 19:08:43.692205 1 controller.go:1326] provision "kube-system/cephfs-pvc" class "rook-cephfs": persistentvolume "pvc-7c47bda7-0c7b-4ca0-b6d0-19d717ef2e06" already exists, skipping
I1127 19:08:43.692220 1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"kube-system", Name:"cephfs-pvc", UID:"7c47bda7-0c7b-4ca0-b6d0-19d717ef2e06", APIVersion:"v1", ResourceVersion:"7744", FieldPath:""}): type: 'Normal' reason: 'ProvisioningSucceeded' Successfully provisioned
In the pasted YAML for your StorageClass, you have:
reclaimPolicy: Deletea
Was that a paste issue? Regardless, this is likely what is causing your problem.
I just had this exact problem with some of my Ceph RBD volumes, and the reason for it was that I was using a StorageClass that had
reclaimPolicy: Delete
However, the cephcsi driver was not configured to support it (and I don't think it actually supports it either).
Using a StorageClass with
reclaimPolicy: Retain
fixed the issue.
To check this on your cluster, run the following:
$ kubectl get sc rook-cephfs -o yaml
And look for the line that starts with reclaimPolicy:
Then, look at the csidriver your StorageClass is using. In your case it is rook-ceph.cephfs.csi.ceph.com
$ kubectl get csidriver rook-ceph.cephfs.csi.ceph.com -o yaml
And look for the entries under volumeLifecycleModes
apiVersion: storage.k8s.io/v1beta1
kind: CSIDriver
metadata:
creationTimestamp: "2020-11-16T22:18:55Z"
name: rook-ceph.cephfs.csi.ceph.com
resourceVersion: "29863971"
selfLink: /apis/storage.k8s.io/v1beta1/csidrivers/rook-ceph.cephfs.csi.ceph.com
uid: a9651d30-935d-4a7d-a7c9-53d5bc90c28c
spec:
attachRequired: true
podInfoOnMount: false
volumeLifecycleModes:
- Persistent
If the only entry under volumeLifecycleModes is Persistent, then your driver is not configured to support reclaimPolicy: Delete.
If instead you see
volumeLifecycleModes:
- Persistent
- Ephemeral
Then your driver should support reclaimPolicy: Delete
I'm trying to install rabbit-mq using helm, but installation fails because of volume issues.
This is my storage class:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: Immediate
This is my persistent volume:
apiVersion: v1
kind: PersistentVolume
metadata:
name: main-pv
spec:
capacity:
storage: 100Gi
volumeMode: Filesystem
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Delete
storageClassName: local-storage
local:
path: /media/2TB-DATA/k8s-pv
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- node-dev
This is the output to list my storage and pv:
# kubectl get storageclass
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
local-storage (default) kubernetes.io/no-provisioner Delete Immediate false 14m
# kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
main-pv 100Gi RWX Delete Available local-storage 40m
After I install rabbit-mq:
helm install rabbitmq bitnami/rabbitmq
The pod is in Pending state, and I see this error:
# kubectl describe pvc
Name: data-rabbitmq-0
Namespace: default
StorageClass:
Status: Pending
Volume:
Labels: app.kubernetes.io/instance=rabbitmq
app.kubernetes.io/name=rabbitmq
Annotations: <none>
Finalizers: [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode: Filesystem
Mounted By: rabbitmq-0
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal FailedBinding 3m20s (x4363 over 18h) persistentvolume-controller no persistent volumes available for this claim and no storage class is set
What am I doing wrong?
Maybe platform related. Where did you try to do that? Im asking cause just cant reproduce on GKE - it works fine
Cluster version, labels, nodes
kubectl get nodes --show-labels
NAME STATUS ROLES AGE VERSION LABELS
gke-cluster-1-default-pool-82008fd9-8x81 Ready <none> 96d v1.14.10-gke.36 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/fluentd-ds-ready=true,beta.kubernetes.io/instance-type=n1-standard-1,beta.kubernetes.io/os=linux,cloud.google.com/gke-nodepool=default-pool,cloud.google.com/gke-os-distribution=cos,failure-domain.beta.kubernetes.io/region=europe-west4,failure-domain.beta.kubernetes.io/zone=europe-west4-b,kubernetes.io/arch=amd64,kubernetes.io/hostname=gke-cluster-1-default-pool-82008fd9-8x81,kubernetes.io/os=linux,test=node
gke-cluster-1-default-pool-82008fd9-qkp7 Ready <none> 96d v1.14.10-gke.36 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/fluentd-ds-ready=true,beta.kubernetes.io/instance-type=n1-standard-1,beta.kubernetes.io/os=linux,cloud.google.com/gke-nodepool=default-pool,cloud.google.com/gke-os-distribution=cos,failure-domain.beta.kubernetes.io/region=europe-west4,failure-domain.beta.kubernetes.io/zone=europe-west4-b,kubernetes.io/arch=amd64,kubernetes.io/hostname=gke-cluster-1-default-pool-82008fd9-qkp7,kubernetes.io/os=linux,test=node
gke-cluster-1-default-pool-82008fd9-tlc7 Ready <none> 96d v1.14.10-gke.36 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/fluentd-ds-ready=true,beta.kubernetes.io/instance-type=n1-standard-1,beta.kubernetes.io/os=linux,cloud.google.com/gke-nodepool=default-pool,cloud.google.com/gke-os-distribution=cos,failure-domain.beta.kubernetes.io/region=europe-west4,failure-domain.beta.kubernetes.io/zone=europe-west4-b,kubernetes.io/arch=amd64,kubernetes.io/hostname=gke-cluster-1-default-pool-82008fd9-tlc7,kubernetes.io/os=linux,test=node
PV, Storageclass
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: Immediate
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: main-pv
spec:
capacity:
storage: 10Gi
volumeMode: Filesystem
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Delete
storageClassName: local-storage
local:
path: /mnt/data
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: test
operator: In
values:
- node-test
Installing chart:
helm install rabbitmq bitnami/rabbitmq
...
kubectl get pods
NAME READY STATUS RESTARTS AGE
...
pod/rabbitmq-0 1/1 Running 0 3m40s
...
kubectl describe pod rabbitmq-0
Name: rabbitmq-0
Namespace: default
Priority: 0
Node: gke-cluster-1-default-pool-82008fd9-tlc7/10.164.0.29
Start Time: Thu, 03 Sep 2020 07:34:10 +0000
Labels: app.kubernetes.io/instance=rabbitmq
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=rabbitmq
controller-revision-hash=rabbitmq-8687f4cb9f
helm.sh/chart=rabbitmq-7.6.4
statefulset.kubernetes.io/pod-name=rabbitmq-0
Annotations: checksum/secret: 433e8ea7590e8d9f1bb94ed2f55e6d9b95f8abef722a917b97a9e916921d7ac5
kubernetes.io/limit-ranger: LimitRanger plugin set: cpu request for container rabbitmq
Status: Running
IP: 10.16.2.13
IPs: <none>
Controlled By: StatefulSet/rabbitmq
Containers:
rabbitmq:
Container ID: docker://b1a567522f50ac4c0663db2d9eca5fd8721d9a3d900ac38bb58f0cae038162f2
Image: docker.io/bitnami/rabbitmq:3.8.7-debian-10-r0
Image ID: docker-pullable://bitnami/rabbitmq#sha256:9abd53aeef6d222fec318c97a75dd50ce19c16b11cb83a3e4fb91c4047ea0d4d
Ports: 5672/TCP, 25672/TCP, 15672/TCP, 4369/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP
State: Running
Started: Thu, 03 Sep 2020 07:34:34 +0000
Ready: True
Restart Count: 0
Requests:
cpu: 100m
Liveness: exec [/bin/bash -ec rabbitmq-diagnostics -q check_running] delay=120s timeout=20s period=30s #success=1 #failure=6
Readiness: exec [/bin/bash -ec rabbitmq-diagnostics -q check_running] delay=10s timeout=20s period=30s #success=1 #failure=3
Environment:
BITNAMI_DEBUG: false
MY_POD_IP: (v1:status.podIP)
MY_POD_NAME: rabbitmq-0 (v1:metadata.name)
MY_POD_NAMESPACE: default (v1:metadata.namespace)
K8S_SERVICE_NAME: rabbitmq-headless
K8S_ADDRESS_TYPE: hostname
RABBITMQ_FORCE_BOOT: no
RABBITMQ_NODE_NAME: rabbit#$(MY_POD_NAME).$(K8S_SERVICE_NAME).$(MY_POD_NAMESPACE).svc.cluster.local
K8S_HOSTNAME_SUFFIX: .$(K8S_SERVICE_NAME).$(MY_POD_NAMESPACE).svc.cluster.local
RABBITMQ_MNESIA_DIR: /bitnami/rabbitmq/mnesia/$(RABBITMQ_NODE_NAME)
RABBITMQ_LDAP_ENABLE: no
RABBITMQ_LOGS: -
RABBITMQ_ULIMIT_NOFILES: 65536
RABBITMQ_USE_LONGNAME: true
RABBITMQ_ERL_COOKIE: <set to the key 'rabbitmq-erlang-cookie' in secret 'rabbitmq'> Optional: false
RABBITMQ_USERNAME: user
RABBITMQ_PASSWORD: <set to the key 'rabbitmq-password' in secret 'rabbitmq'> Optional: false
RABBITMQ_PLUGINS: rabbitmq_management, rabbitmq_peer_discovery_k8s, rabbitmq_auth_backend_ldap
Mounts:
/bitnami/rabbitmq/conf from configuration (rw)
/bitnami/rabbitmq/mnesia from data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from rabbitmq-token-mclhw (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: data-rabbitmq-0
ReadOnly: false
configuration:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: rabbitmq-config
Optional: false
rabbitmq-token-mclhw:
Type: Secret (a volume populated by a Secret)
SecretName: rabbitmq-token-mclhw
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 6m42s default-scheduler Successfully assigned default/rabbitmq-0 to gke-cluster-1-default-pool-82008fd9-tlc7
Normal SuccessfulAttachVolume 6m36s attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-8145821b-ed09-11ea-b464-42010aa400e3"
Normal Pulling 6m32s kubelet, gke-cluster-1-default-pool-82008fd9-tlc7 Pulling image "docker.io/bitnami/rabbitmq:3.8.7-debian-10-r0"
Normal Pulled 6m22s kubelet, gke-cluster-1-default-pool-82008fd9-tlc7 Successfully pulled image "docker.io/bitnami/rabbitmq:3.8.7-debian-10-r0"
Normal Created 6m18s kubelet, gke-cluster-1-default-pool-82008fd9-tlc7 Created container rabbitmq
Normal Started 6m18s kubelet, gke-cluster-1-default-pool-82008fd9-tlc7 Started container rabbitmq
I tried to use helm on docker for windows on the local machine. When I used a storage class as local storage, persistent volume, and persistent volume claim without helm, it works fine. But when I used this setting with helm, CrashLoopBackOff happened.
localStrageClass.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
name: local-pv002
labels:
type: local
spec:
capacity:
storage: 1Gi
accessModes:
- ReadWriteMany
#storageClassName: hostpath
mountOptions:
- hard
persistentVolumeReclaimPolicy: Delete
storageClassName: local-storage
local:
path: /c/k/share/mysql
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
#- key: docker.io/hostname
- key: kubernetes.io/hostname
operator: In
values:
- docker-desktop
pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: local-mysql-claim
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 1Gi
storageClassName: local-storage
mysqlConf.yaml
persistence:
enabled: true
storageClass: local-storage
existingClaim: local-mysql-claim
accessMode: ReadWriteOnce
size: 1Gi
annotations: {}
$ helm install --name mysql stable/mysql -f mysqlConf.yaml
$ kubectl describe pod mysql
Containers:
mysql:
Container ID: docker://533e4569603b05fac83a0a701da97898b3190503618796678ac5db6340c4dce6
Image: mysql:5.7.14
Image ID: docker-pullable://mysql#sha256:c8f03238ca1783d25af320877f063a36dbfce0daa56a7b4955e6c6e05ab5c70b
Port: 3306/TCP
Host Port: 0/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Thu, 28 Mar 2019 13:24:25 +0900
Finished: Thu, 28 Mar 2019 13:24:25 +0900
Ready: False
Restart Count: 2
Requests:
cpu: 100m
memory: 256Mi
Liveness: exec [sh -c mysqladmin ping -u root -p${MYSQL_ROOT_PASSWORD}] delay=30s timeout=5s period=10s #success=1 #failure=3
Readiness: exec [sh -c mysqladmin ping -u root -p${MYSQL_ROOT_PASSWORD}] delay=5s timeout=1s period=10s #success=1 #failure=3
Environment:
MYSQL_ROOT_PASSWORD: <set to the key 'mysql-root-password' in secret 'mysql'> Optional: false
MYSQL_PASSWORD: <set to the key 'mysql-password' in secret 'mysql'> Optional: true
MYSQL_USER:
MYSQL_DATABASE:
Mounts:
/var/lib/mysql from data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-dccpv (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: local-mysql-claim
ReadOnly: false
default-token-dccpv:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-dccpv
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 39s default-scheduler Successfully assigned default/mysql-698897ff79-n768k to docker-desktop
Normal Pulled 38s kubelet, docker-desktop Container image "busybox:1.29.3" already present on machine
Normal Created 38s kubelet, docker-desktop Created container
Normal Started 38s kubelet, docker-desktop Started container
Normal Pulled 18s (x3 over 37s) kubelet, docker-desktop Container image "mysql:5.7.14" already present on machine
Normal Created 17s (x3 over 37s) kubelet, docker-desktop Created container
Normal Started 17s (x3 over 37s) kubelet, docker-desktop Started container
Warning BackOff 13s (x5 over 35s) kubelet, docker-desktop Back-off restarting failed container
When storageClassName was hostpath or did not used the configuration file as
$ helm install --name mysql stable/mysql
it worked fine.
Please tell me how to fix this problem.
I think you are having a mismatch of accessModes between what you claim in PVC definition (ReadWriteOnce) and what your Storage Class offers (ReadWriteMany).
Please mind also that PersistentVolume(s) of HostPath type does not support ReadWriteMany mode (see spec here).
I would propose you to create PV similar to this one:
# Create PV of manual StorageClass
kind: PersistentVolume
apiVersion: v1
metadata:
name: task-pv-volume
labels:
type: local
spec:
storageClassName: manual
capacity:
storage: 10Gi
accessModes:
- ReadWriteOnce
hostPath:
path: "/C/Users/K8S/mysql"
and override default PVC storageClassName configuration during helm install like this:
helm install --name my-sql stable/mysql --set persistence.storageClass=manual
I have my own cluster k8s. I'm trying to link the cluster to openstack / cinder.
When I'm creating a PVC, I can a see a PV in k8s and the volume in Openstack.
But when I'm linking a pod with the PVC, I have the message k8s - Cinder "0/x nodes are available: x node(s) had volume node affinity conflict".
My yml test:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: classic
provisioner: kubernetes.io/cinder
parameters:
type: classic
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: pvc-infra-consuldata4
namespace: infra
spec:
storageClassName: classic
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: consul
namespace: infra
labels:
app: consul
spec:
replicas: 1
selector:
matchLabels:
app: consul
template:
metadata:
labels:
app: consul
spec:
containers:
- name: consul
image: consul:1.4.3
volumeMounts:
- name: data
mountPath: /consul
resources:
requests:
cpu: 100m
limits:
cpu: 500m
command: ["consul", "agent", "-server", "-bootstrap", "-ui", "-bind", "0.0.0.0", "-client", "0.0.0.0", "-data-dir", "/consul"]
volumes:
- name: data
persistentVolumeClaim:
claimName: pvc-infra-consuldata4
The result:
kpro describe pvc -n infra
Name: pvc-infra-consuldata4
Namespace: infra
StorageClass: classic
Status: Bound
Volume: pvc-76bfdaf1-40bb-11e9-98de-fa163e53311c
Labels:
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"v1","kind":"PersistentVolumeClaim","metadata":{"annotations":{},"name":"pvc-infra-consuldata4","namespace":"infra"},"spec":...
pv.kubernetes.io/bind-completed: yes
pv.kubernetes.io/bound-by-controller: yes
volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/cinder
Finalizers: [kubernetes.io/pvc-protection]
Capacity: 1Gi
Access Modes: RWO
VolumeMode: Filesystem
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ProvisioningSucceeded 61s persistentvolume-controller Successfully provisioned volume pvc-76bfdaf1-40bb-11e9-98de-fa163e53311c using kubernetes.io/cinder
Mounted By: consul-85684dd7fc-j84v7
kpro describe po -n infra consul-85684dd7fc-j84v7
Name: consul-85684dd7fc-j84v7
Namespace: infra
Priority: 0
PriorityClassName: <none>
Node: <none>
Labels: app=consul
pod-template-hash=85684dd7fc
Annotations: <none>
Status: Pending
IP:
Controlled By: ReplicaSet/consul-85684dd7fc
Containers:
consul:
Image: consul:1.4.3
Port: <none>
Host Port: <none>
Command:
consul
agent
-server
-bootstrap
-ui
-bind
0.0.0.0
-client
0.0.0.0
-data-dir
/consul
Limits:
cpu: 2
Requests:
cpu: 500m
Environment: <none>
Mounts:
/consul from data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-nxchv (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: pvc-infra-consuldata4
ReadOnly: false
default-token-nxchv:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-nxchv
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 36s (x6 over 2m40s) default-scheduler 0/6 nodes are available: 6 node(s) had volume node affinity conflict.
Why K8s successful create the Cinder volume, but it can't schedule the pod ?
Try finding out the nodeAffinity of your persistent volume:
$ kubctl describe pv pvc-76bfdaf1-40bb-11e9-98de-fa163e53311c
Node Affinity:
Required Terms:
Term 0: kubernetes.io/hostname in [xxx]
Then try to figure out if xxx matches the node label yyy that your pod is supposed to run on:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
yyy Ready worker 8d v1.15.3
If they don't match, you will have the "x node(s) had volume node affinity conflict" error, and you need to re-create the persistent volume with the correct nodeAffinity configuration.
I also ran into this issue when I forgot to deploy the EBS CSI driver before I tried to get my pod to connect to it.
kubectl apply -k "github.com/kubernetes-sigs/aws-ebs-csi-driver/deploy/kubernetes/overlays/stable/?ref=master"
I am having issues deploying juypterhub on kubernetes cluster. The issue I am getting is that the hub pod is stuck in pending.
Stack:
kubeadm
flannel
weave
helm
jupyterhub
Runbook:
$kubeadm init --pod-network-cidr="10.244.0.0/16"
$sudo cp /etc/kubernetes/admin.conf $HOME/ && sudo chown $(id -u):$(id -g) $HOME/admin.conf && export KUBECONFIG=$HOME/admin.conf
$kubectl create -f pvc.yml
$kubectl create -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel-aliyun.yml
$kubectl apply --filename https://git.io/weave-kube-1.6
$kubectl taint nodes --all node-role.kubernetes.io/master-
Helm installations as per https://zero-to-jupyterhub.readthedocs.io/en/latest/setup-helm.html
Jupyter installations as per https://zero-to-jupyterhub.readthedocs.io/en/latest/setup-jupyterhub.html
config.yml
proxy:
secretToken: "asdf"
singleuser:
storage:
dynamic:
storageClass: local-storage
pvc.yml
apiVersion: v1
kind: PersistentVolume
metadata:
name: standard
spec:
capacity:
storage: 100Gi
# volumeMode field requires BlockVolume Alpha feature gate to be enabled.
volumeMode: Filesystem
accessModes:
- ReadWriteOnce
persistentVolumeReclaimPolicy: Delete
storageClassName: local-storage
local:
path: /dev/vdb
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- example-node
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: standard
spec:
storageClassName: local-storage
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 3Gi
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
The warning is:
$kubectl --namespace=jhub get pod
NAME READY STATUS RESTARTS AGE
hub-fb48dfc4f-mqf4c 0/1 Pending 0 3m33s
proxy-86977cf9f7-fqf8d 1/1 Running 0 3m33s
$kubectl --namespace=jhub describe pod hub
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 35s (x3 over 35s) default-scheduler pod has unbound immediate PersistentVolumeClaims
$kubectl --namespace=jhub describe pv
Name: standard
Labels: type=local
Annotations: pv.kubernetes.io/bound-by-controller: yes
Finalizers: [kubernetes.io/pv-protection]
StorageClass: manual
Status: Bound
Claim: default/standard
Reclaim Policy: Retain
Access Modes: RWO
VolumeMode: Filesystem
Capacity: 10Gi
Node Affinity: <none>
Message:
Source:
Type: HostPath (bare host directory volume)
Path: /dev/vdb
HostPathType:
Events: <none>
$kubectl --namespace=kube-system describe pvc
Name: hub-db-dir
Namespace: jhub
StorageClass:
Status: Pending
Volume:
Labels: app=jupyterhub
chart=jupyterhub-0.8.0-beta.1
component=hub
heritage=Tiller
release=jhub
Annotations: <none>
Finalizers: [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode: Filesystem
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal FailedBinding 13s (x7 over 85s) persistentvolume-controller no persistent volumes available for this claim and no storage class is set
Mounted By: hub-fb48dfc4f-mqf4c
I tried my best to follow the localstorage volume configuration on the official kubernetes website, but with no luck
-G
Managed to fix it using the following configuration.
Key points:
- I forgot to add the node in nodeAffinity
- it works without putting in volumeBindingMode
apiVersion: v1
kind: PersistentVolume
metadata:
name: standard
spec:
capacity:
storage: 2Gi
# volumeMode field requires BlockVolume Alpha feature gate to be enabled.
volumeMode: Filesystem
accessModes:
- ReadWriteOnce
persistentVolumeReclaimPolicy: Retain
storageClassName: local-storage
local:
path: /temp
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- INSERT_NODE_NAME_HERE
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
annotations:
storageclass.kubernetes.io/is-default-class: "true"
name: local-storage
provisioner: kubernetes.io/no-provisioner
config.yaml
proxy:
secretToken: "token"
singleuser:
storage:
dynamic:
storageClass: local-storage
make sure your storage/pv looks like this:
root#asdf:~# kubectl --namespace=kube-system describe pv
Name: standard
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"v1","kind":"PersistentVolume","metadata":{"annotations":{},"name":"standard"},"spec":{"accessModes":["ReadWriteOnce"],"capa...
pv.kubernetes.io/bound-by-controller: yes
Finalizers: [kubernetes.io/pv-protection]
StorageClass: local-storage
Status: Bound
Claim: jhub/hub-db-dir
Reclaim Policy: Retain
Access Modes: RWO
VolumeMode: Filesystem
Capacity: 2Gi
Node Affinity:
Required Terms:
Term 0: kubernetes.io/hostname in [asdf]
Message:
Source:
Type: LocalVolume (a persistent volume backed by local storage on a node)
Path: /temp
Events: <none>
root#asdf:~# kubectl --namespace=kube-system describe storageclass
Name: local-storage
IsDefaultClass: Yes
Annotations: storageclass.kubernetes.io/is-default-class=true
Provisioner: kubernetes.io/no-provisioner
Parameters: <none>
AllowVolumeExpansion: <unset>
MountOptions: <none>
ReclaimPolicy: Delete
VolumeBindingMode: Immediate
Events: <none>
Now the hub pod looks something like this:
root#asdf:~# kubectl --namespace=jhub describe pod hub
Name: hub-5d4fcd8fd9-p6crs
Namespace: jhub
Priority: 0
PriorityClassName: <none>
Node: asdf/192.168.0.87
Start Time: Sat, 23 Feb 2019 14:29:51 +0800
Labels: app=jupyterhub
component=hub
hub.jupyter.org/network-access-proxy-api=true
hub.jupyter.org/network-access-proxy-http=true
hub.jupyter.org/network-access-singleuser=true
pod-template-hash=5d4fcd8fd9
release=jhub
Annotations: checksum/config-map: --omitted
checksum/secret: --omitted--
Status: Running
IP: 10.244.0.55
Controlled By: ReplicaSet/hub-5d4fcd8fd9
Containers:
hub:
Container ID: docker://d2d4dec8cc16fe21589e67f1c0c6c6114b59b01c67a9f06391830a1ea711879d
Image: jupyterhub/k8s-hub:0.8.0
Image ID: docker-pullable://jupyterhub/k8s-hub#sha256:e40cfda4f305af1a2fdf759cd0dcda834944bef0095c8b5ecb7734d19f58b512
Port: 8081/TCP
Host Port: 0/TCP
Command:
jupyterhub
--config
/srv/jupyterhub_config.py
--upgrade-db
State: Running
Started: Sat, 23 Feb 2019 14:30:28 +0800
Ready: True
Restart Count: 0
Requests:
cpu: 200m
memory: 512Mi
Environment:
PYTHONUNBUFFERED: 1
HELM_RELEASE_NAME: jhub
POD_NAMESPACE: jhub (v1:metadata.namespace)
CONFIGPROXY_AUTH_TOKEN: <set to the key 'proxy.token' in secret 'hub-secret'> Optional: false
Mounts:
/etc/jupyterhub/config/ from config (rw)
/etc/jupyterhub/secret/ from secret (rw)
/srv/jupyterhub from hub-db-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from hub-token-bxzl7 (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: hub-config
Optional: false
secret:
Type: Secret (a volume populated by a Secret)
SecretName: hub-secret
Optional: false
hub-db-dir:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: hub-db-dir
ReadOnly: false
hub-token-bxzl7:
Type: Secret (a volume populated by a Secret)
SecretName: hub-token-bxzl7
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events: <none>