I've been trying to find out why my pod won't start up. When I describe it, I get this:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m default-scheduler Successfully assigned my-namespace/nfs to gke-default-pool
Normal SuccessfulAttachVolume 2m attachdetach-controller AttachVolume.Attach succeeded for volume "nfspvc"
Warning FailedMount 50s kubelet, default-pool Unable to mount volumes for pod "nfs-8496cc5fd5-wjkm2_sxdb-branch161666(28ff323d-8839-11e9-a080-42010a8400bb)": timeout expired waiting for volumes to attach or mount for pod "my-namespace"/"nfs-8496cc5fd5-wjkm2". list of unmounted volumes=[nfspvc]. list of unattached volumes=[nfspvc default-token-ntmfv]
Warning FailedMount 28s (x9 over 2m) kubelet, default-pool MountVolume.MountDevice failed for volume "nfspvc" : executable file not found in $PATH
When I describe the compute disk, all looks good:
creationTimestamp: '2019-06-06T01:56:31.079-07:00'
id: '5701286856735681489'
kind: compute#disk
labelFingerprint: 42WmSpB8rSM=
lastAttachTimestamp: '2019-06-06T01:57:51.852-07:00'
name: nfs-pd
physicalBlockSizeBytes: '4096'
selfLink: <omitted>
sizeGb: '10'
status: READY
type: <omitted>
users:
- <omitted>
zone: <omitted>
And here is my pod's manifest:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    role: nfs
  name: nfs
  namespace: my-namespace
spec:
  replicas: 1
  selector:
    matchLabels:
      role: nfs
  template:
    metadata:
      labels:
        role: nfs
    spec:
      containers:
        - image: gcr.io/google_containers/volume-nfs:0.8
          name: nfs
          ports:
            - containerPort: 2049
              name: nfs
              protocol: TCP
            - containerPort: 20048
              name: mountd
              protocol: TCP
            - containerPort: 111
              name: rpcbind
              protocol: TCP
          volumeMounts:
            - mountPath: /exports
              name: nfspvc
      restartPolicy: Always
      volumes:
        - gcePersistentDisk:
            fsType: ext4i
            pdName: nfs-pd
          name: nfspvc
I'm not really sure what 'MountVolume.MountDevice failed for volume "nfspvc" : executable file not found in $PATH' means, or what else I should look into to investigate the source of the issue.
If it makes any difference, this is created by a script that runs in the following order:
1. Create the compute disk.
2. Run helm, which in turn creates the above deployment, service, persistent volume, and persistent volume claim.
fsType: ext4i
Try removing that 'i'.
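For reference, the corrected volumes block of the deployment would then read (only the fsType value changes):

volumes:
  - gcePersistentDisk:
      fsType: ext4
      pdName: nfs-pd
    name: nfspvc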
I'm using Kubernetes v1.24.7 on Ubuntu 18.04.6 LTS and facing a problem with an NFS persistent volume mount. When I try to deploy my Jenkins deployment file, it always fails with the errors below.
$ kubectl describe pod jenkins-6786789d5d-m26zw -n jenkins
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 25m default-scheduler Successfully assigned jenkins/jenkins-6786789d5d-m26zw to worker-3
Warning FailedMount 5m31s (x2 over 14m) kubelet Unable to attach or mount volumes: unmounted volumes=[data], unattached volumes=[kube-api-access-65npd data]: timed out waiting for the condition
Warning FailedMount 3m17s (x8 over 23m) kubelet Unable to attach or mount volumes: unmounted volumes=[data], unattached volumes=[data kube-api-access-65npd]: timed out waiting for the condition
Warning FailedMount 3m6s (x19 over 25m) kubelet MountVolume.SetUp failed for volume "pv-nfs" : mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t nfs -o nfsvers=4.1 192.168.72.136:/mnt/nfs/stg/jenkins /var/lib/kubelet/pods/853c44ed-bf2b-4e6a-b666-c1adab7f7f4b/volumes/kubernetes.io~nfs/pv-nfs
Output: mount.nfs: mounting 192.168.72.136:/mnt/nfs/stg/jenkins failed, reason given by server: No such file or directory
The external NFS mount path below was provided by our IT storage administrator:
192.168.72.136:/nfs-volume
The following packages have already been installed on the master and worker nodes:
apt install nfs-common
apt install cifs-utils
apt install nfs-kernel-server
On my master and workers (host machines) I added the line below to /etc/fstab, and I could mount the NFS volume:
192.168.72.136:/nfs-volume /mnt/nfs/stg/ nfs defaults 0 0
However, the same problem persists when deploying the Kubernetes application. I also tried the option below in /etc/fstab, with the same result:
192.168.72.136:/nfs-volume /mnt/nfs/stg/ nfs rw,hard,intr 0 0
My PV and PVC status:
$ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pv-nfs 100Gi RWX Retain Bound jenkins/pvc-nfs nfs 11s
$ kubectl get pvc -n jenkins
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
pvc-nfs Bound pv-nfs 100Gi RWX nfs 17s
My PersistentVolume and Deployment YAML are as follows.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-nfs
  labels:
    type: pv-nfs
spec:
  storageClassName: nfs
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  mountOptions:
    - nfsvers=4.0
  nfs:
    server: 192.168.72.136
    path: "/mnt/nfs/stg/jenkins"
    readOnly: false
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: jenkins
  namespace: jenkins
  labels:
    app: jenkins
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jenkins
  template:
    metadata:
      labels:
        app: jenkins
    spec:
      securityContext:
        fsGroup: 0
        runAsUser: 0
      serviceAccountName: admin
      containers:
        - name: jenkins
          image: jenkins/jenkins:latest
          securityContext:
            privileged: true
            runAsUser: 0
          ports:
            - containerPort: 8080
          volumeMounts:
            - name: data
              mountPath: /var/jenkins_home
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: pvc-nfs
The directory /mnt/nfs/stg/jenkins exists on the NFS server. Please let me know what I'm missing here.
Thanks for helping.
When the storage IT administrator has exported the NFS share /nfs-volume from 192.168.72.136, then in the PersistentVolume spec the path should be /nfs-volume (the exported path on the server), not /mnt/nfs/stg/jenkins (the local mount point on the nodes).
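For example, the nfs section of the PersistentVolume would become (a minimal sketch based on the export above):

  nfs:
    server: 192.168.72.136
    path: "/nfs-volume"
    readOnly: false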
I am running k3s cluster on RPis with Ubuntu Server 20.04 and have a drive mounted onto the master node to serve as shared NFS storage.
For some reason I can only access the server using the ClusterIP of the service, but not by its name. CoreDNS is running and its logs suggest that it's working fine, but it doesn't seem to be used when mounting the NFS share onto the test pod.
There is a similar question for the Raspbian case, but I am running Ubuntu, and I'm not sure whether switching to legacy iptables as suggested there would work or break things further.
Any help is appreciated! Details below.
NFS server manifests:
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nfs-server-pod
  namespace: storage
  labels:
    nfs-role: server
spec:
  replicas: 1
  strategy:
    type: Recreate
  selector:
    matchLabels:
      nfs-role: server
  template:
    metadata:
      labels:
        nfs-role: server
    spec:
      nodeSelector:
        storage: enabled
      containers:
        - name: nfs-server-container
          image: ghcr.io/two-tauers/nfs-server:0.1.14
          securityContext:
            privileged: true
          args:
            - /storage
          volumeMounts:
            - name: storage-mount
              mountPath: /storage
      volumes:
        - name: storage-mount
          hostPath:
            path: /storage
---
kind: Service
apiVersion: v1
metadata:
  name: nfs-server
  namespace: storage
spec:
  selector:
    nfs-role: server
  type: ClusterIP
  ports:
    - name: tcp-2049
      port: 2049
      protocol: TCP
    - name: udp-111
      port: 111
❯ kubectl logs pod/nfs-server-pod-ccd9c4877-lgd4x -n storage
* Exporting directories for NFS kernel daemon...
...done.
* Starting NFS kernel daemon
...done.
Setting up watches.
Watches established.
❯ kubectl get service/nfs-server -n storage
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
nfs-server ClusterIP 10.43.223.27 <none> 2049/TCP,111/TCP 102m
Test pod, not in a running state:
---
kind: Pod
apiVersion: v1
metadata:
  name: test-nfs
  namespace: storage
spec:
  volumes:
    - name: nfs-volume
      nfs:
        server: nfs-server # doesn't work
        ### server: nfs-server.storage.svc.cluster.local # doesn't work
        ### server: 10.43.223.27 # works
        path: /
  containers:
    - name: test
      image: alpine
      volumeMounts:
        - name: nfs-volume
          mountPath: /var/nfs
      command: ["/bin/sh"]
      args: ["-c", "while true; do date >> /var/nfs/dates.txt; sleep 5; done"]
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 51m default-scheduler Successfully assigned storage/test-nfs to sauron
Warning FailedMount 9m26s (x14 over 48m) kubelet MountVolume.SetUp failed for volume "nfs-volume" : mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t nfs nfs-server:/ /var/lib/kubelet/pods/d1523d5f-60ec-4990-ae61-369a26b7a6f4/volumes/kubernetes.io~nfs/nfs-volume
Output: mount.nfs: Connection timed out
Warning FailedMount 8m12s (x15 over 49m) kubelet Unable to attach or mount volumes: unmounted volumes=[nfs-volume], unattached volumes=[nfs-volume kube-api-access-pc9bk]: timed out waiting for the condition
Warning FailedMount 3m44s (x5 over 46m) kubelet Unable to attach or mount volumes: unmounted volumes=[nfs-volume], unattached volumes=[kube-api-access-pc9bk nfs-volume]: timed out waiting for the condition
At the same time, nslookup works and coredns does produce a lookup log:
---
apiVersion: v1
kind: Pod
metadata:
  name: dnsutils
  namespace: storage
spec:
  containers:
    - name: dnsutils
      image: k8s.gcr.io/e2e-test-images/jessie-dnsutils:1.3
      command:
        - sleep
        - "3600"
      imagePullPolicy: IfNotPresent
  restartPolicy: Always
❯ kubectl exec -i -t dnsutils -n storage -- nslookup nfs-server && kubectl logs pod/coredns-84c56f7bfb-66s84 -n kube-system | tail -1
Server: 10.43.0.10
Address: 10.43.0.10#53
Name: nfs-server.storage.svc.cluster.local
Address: 10.43.223.27
[INFO] 10.42.1.77:37937 - 21822 "A IN nfs-server.storage.svc.cluster.local. udp 54 false 512" NOERROR qr,aa,rd 106 0.000980753s
❯ kubectl exec -i -t dnsutils -n storage -- nslookup kubernetes.default.svc.cluster.local && kubectl logs pod/coredns-84c56f7bfb-66s84 -n kube-system | tail -1
Server: 10.43.0.10
Address: 10.43.0.10#53
Name: kubernetes.default.svc.cluster.local
Address: 10.43.0.1
[INFO] 10.42.1.77:59077 - 58999 "A IN kubernetes.default.svc.cluster.local. udp 54 false 512" NOERROR qr,aa,rd 106 0.000496515s
I am trying to get a volume mounted as a non-root user in one of my containers. I'm trying an approach from this SO post that uses an initContainer to set the correct user, but when I try to start the configuration I get an "unbound immediate PersistentVolumeClaims" error. I suspect it's because the volume is mounted in both my initContainer and container, but I'm not sure why that would be the issue: I can see the initContainer taking the claim, but I would have thought that when it exited it would release it, letting the normal container take the claim. Any ideas or alternatives for getting the directory mounted as a non-root user? I did try using securityContext/fsGroup, but that seemed to have no effect. The /var/rdf4j directory below is the one that is being mounted as root.
Configuration:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: triplestore-data-storage-dir
  labels:
    type: local
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  storageClassName: local-storage
  volumeMode: Filesystem
  persistentVolumeReclaimPolicy: Delete
  hostPath:
    path: /run/desktop/mnt/host/d/workdir/k8s-data/triplestore
    type: DirectoryOrCreate
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: triplestore-data-storage
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: local-storage
  volumeName: "triplestore-data-storage-dir"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: triplestore
  labels:
    app: demo
    role: triplestore
spec:
  selector:
    matchLabels:
      app: demo
      role: triplestore
  replicas: 1
  template:
    metadata:
      labels:
        app: demo
        role: triplestore
    spec:
      containers:
        - name: triplestore
          image: eclipse/rdf4j-workbench:amd64-3.5.0
          imagePullPolicy: Always
          ports:
            - name: http
              protocol: TCP
              containerPort: 8080
          resources:
            requests:
              cpu: 100m
              memory: 200Mi
          volumeMounts:
            - name: storage
              mountPath: /var/rdf4j
      initContainers:
        - name: take-data-dir-ownership
          image: eclipse/rdf4j-workbench:amd64-3.5.0
          command:
            - chown
            - -R
            - 100:65533
            - /var/rdf4j
          volumeMounts:
            - name: storage
              mountPath: /var/rdf4j
      volumes:
        - name: storage
          persistentVolumeClaim:
            claimName: "triplestore-data-storage"
kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
triplestore-data-storage Bound triplestore-data-storage-dir 10Gi RWX local-storage 13s
kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
triplestore-data-storage-dir 10Gi RWX Delete Bound default/triplestore-data-storage local-storage 17s
kubectl get events
LAST SEEN TYPE REASON OBJECT MESSAGE
21s Warning FailedScheduling pod/triplestore-6d6876f49-2s84c 0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.
19s Normal Scheduled pod/triplestore-6d6876f49-2s84c Successfully assigned default/triplestore-6d6876f49-2s84c to docker-desktop
3s Normal Pulled pod/triplestore-6d6876f49-2s84c Container image "eclipse/rdf4j-workbench:amd64-3.5.0" already present on machine
3s Normal Created pod/triplestore-6d6876f49-2s84c Created container take-data-dir-ownership
3s Normal Started pod/triplestore-6d6876f49-2s84c Started container take-data-dir-ownership
2s Warning BackOff pod/triplestore-6d6876f49-2s84c Back-off restarting failed container
46m Normal Pulled pod/triplestore-6d6876f49-9n5kt Container image "eclipse/rdf4j-workbench:amd64-3.5.0" already present on machine
79s Warning BackOff pod/triplestore-6d6876f49-9n5kt Back-off restarting failed container
21s Normal SuccessfulCreate replicaset/triplestore-6d6876f49 Created pod: triplestore-6d6876f49-2s84c
21s Normal ScalingReplicaSet deployment/triplestore Scaled up replica set triplestore-6d6876f49 to 1
kubectl describe pods/triplestore-6d6876f49-tw8r8
Name: triplestore-6d6876f49-tw8r8
Namespace: default
Priority: 0
Node: docker-desktop/192.168.65.4
Start Time: Mon, 17 Jan 2022 10:17:20 -0500
Labels: app=demo
pod-template-hash=6d6876f49
role=triplestore
Annotations: <none>
Status: Pending
IP: 10.1.2.133
IPs:
IP: 10.1.2.133
Controlled By: ReplicaSet/triplestore-6d6876f49
Init Containers:
take-data-dir-ownership:
Container ID: docker://89e7b1e3ae76c30180ee5083624e1bf5f30b55fd95bf1c24422fabe41ae74408
Image: eclipse/rdf4j-workbench:amd64-3.5.0
Image ID: docker-pullable://registry.com/publicrepos/docker_cache/eclipse/rdf4j-workbench@sha256:14621ad610b0d0269dedd9939ea535348cc6c147f9bd47ba2039488b456118ed
Port: <none>
Host Port: <none>
Command:
chown
-R
100:65533
/var/rdf4j
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Mon, 17 Jan 2022 10:22:59 -0500
Finished: Mon, 17 Jan 2022 10:22:59 -0500
Ready: False
Restart Count: 6
Environment: <none>
Mounts:
/var/rdf4j from storage (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-s8wdv (ro)
Containers:
triplestore:
Container ID:
Image: eclipse/rdf4j-workbench:amd64-3.5.0
Image ID:
Port: 8080/TCP
Host Port: 0/TCP
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Requests:
cpu: 100m
memory: 200Mi
Environment: <none>
Mounts:
/var/rdf4j from storage (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-s8wdv (ro)
Conditions:
Type Status
Initialized False
Ready False
ContainersReady False
PodScheduled True
Volumes:
storage:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: triplestore-data-storage
ReadOnly: false
kube-api-access-s8wdv:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 6m24s default-scheduler 0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.
Normal Scheduled 6m13s default-scheduler Successfully assigned default/triplestore-6d6876f49-tw8r8 to docker-desktop
Normal Pulled 4m42s (x5 over 6m12s) kubelet Container image "eclipse/rdf4j-workbench:amd64-3.5.0" already present on machine
Normal Created 4m42s (x5 over 6m12s) kubelet Created container take-data-dir-ownership
Normal Started 4m42s (x5 over 6m12s) kubelet Started container take-data-dir-ownership
Warning BackOff 70s (x26 over 6m10s) kubelet Back-off restarting failed container
Solution
As it turns out, the problem was that the initContainer wasn't running as root; it was running as the default user of the image, and so didn't have permission to run the chown command. In the linked SO post this was raised in the first comment on the answer, and the response was that the initContainer ran as root - this has apparently changed in newer versions of Kubernetes. There is a solution though: you can set the securityContext on the initContainer to run as root, giving it permission to run chown, and that successfully allows the volume to be mounted as a non-root user. Here's the final configuration of the initContainer.
initContainers:
  - name: take-data-dir-ownership
    image: eclipse/rdf4j-workbench:amd64-3.5.0
    securityContext:
      runAsUser: 0
    command:
      - chown
      - -R
      - 100:65533
      - /var/rdf4j
    volumeMounts:
      - name: storage
        mountPath: /var/rdf4j
1 pod has unbound immediate PersistentVolumeClaims - this error means the pod cannot bind to the PVC on the node where it has been scheduled to run. This can happen when the PVC is bound to a PV that refers to a location that is not valid on the node the pod is scheduled to run on. It would be helpful if you could add the complete output of kubectl get nodes -o wide, kubectl describe pvc triplestore-data-storage, and kubectl describe pv triplestore-data-storage-dir to the question.
In the meantime, a PVC/PV is optional when using hostPath. Can you try the following spec and see if the pod comes online?
apiVersion: apps/v1
kind: Deployment
metadata:
  name: triplestore
  labels:
    app: demo
    role: triplestore
spec:
  selector:
    matchLabels:
      app: demo
      role: triplestore
  replicas: 1
  template:
    metadata:
      labels:
        app: demo
        role: triplestore
    spec:
      containers:
        - name: triplestore
          image: eclipse/rdf4j-workbench:amd64-3.5.0
          imagePullPolicy: IfNotPresent
          ports:
            - name: http
              protocol: TCP
              containerPort: 8080
          resources:
            requests:
              cpu: 100m
              memory: 200Mi
          volumeMounts:
            - name: storage
              mountPath: /var/rdf4j
      initContainers:
        - name: take-data-dir-ownership
          image: eclipse/rdf4j-workbench:amd64-3.5.0
          imagePullPolicy: IfNotPresent
          securityContext:
            runAsUser: 0
          command:
            - chown
            - -R
            - 100:65533
            - /var/rdf4j
          volumeMounts:
            - name: storage
              mountPath: /var/rdf4j
      volumes:
        - name: storage
          hostPath:
            path: /run/desktop/mnt/host/d/workdir/k8s-data/triplestore
            type: DirectoryOrCreate
While learning Kubernetes from the book Kubernetes for Developers, I am stuck at this point.
I am trying to start a RabbitMQ pod, but after a lot of troubleshooting I have only managed to get this far and have no clue what to fix to get rid of the permission denied error.
# kubectl get pods
NAME READY STATUS RESTARTS AGE
rabbitmq-56c67d8d7d-s8vp5 0/1 CrashLoopBackOff 5 5m40s
If I look at the logs of this container, this is what I find:
# kubectl logs rabbitmq-56c67d8d7d-s8vp5
21:22:58.49
21:22:58.50 Welcome to the Bitnami rabbitmq container
21:22:58.51 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-rabbitmq
21:22:58.51 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-rabbitmq/issues
21:22:58.52 Send us your feedback at containers@bitnami.com
21:22:58.52
21:22:58.52 INFO ==> ** Starting RabbitMQ setup **
21:22:58.54 INFO ==> Validating settings in RABBITMQ_* env vars..
21:22:58.56 INFO ==> Initializing RabbitMQ...
21:22:58.57 INFO ==> Generating random cookie
mkdir: cannot create directory ‘/bitnami/rabbitmq’: Permission denied
Here is my rabbitmq-deployment.yml
---
# EXPORT SERVICE INTERFACE
kind: Service
apiVersion: v1
metadata:
  name: message-queue
  labels:
    app: rabbitmq
    role: master
    tier: queue
spec:
  ports:
    - port: 5672
      targetPort: 5672
  selector:
    app: rabbitmq
    role: master
    tier: queue
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rabbitmq-pv-claim
  labels:
    app: rabbitmq
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rabbitmq
spec:
  replicas: 1
  selector:
    matchLabels:
      app: rabbitmq
      role: master
      tier: queue
  template:
    metadata:
      labels:
        app: rabbitmq
        role: master
        tier: queue
    spec:
      nodeSelector:
        boardType: x86vm
      containers:
        - name: rabbitmq
          image: bitnami/rabbitmq:3.7
          envFrom:
            - configMapRef:
                name: bitnami-rabbitmq-config
          ports:
            - name: queue
              containerPort: 5672
            - name: queue-mgmt
              containerPort: 15672
          livenessProbe:
            exec:
              command:
                - rabbitmqctl
                - status
            initialDelaySeconds: 120
            timeoutSeconds: 5
            failureThreshold: 6
          readinessProbe:
            exec:
              command:
                - rabbitmqctl
                - status
            initialDelaySeconds: 10
            timeoutSeconds: 3
            periodSeconds: 5
          volumeMounts:
            - name: rabbitmq-storage
              mountPath: /bitnami
      volumes:
        - name: rabbitmq-storage
          persistentVolumeClaim:
            claimName: rabbitmq-pv-claim
This is the rabbitmq-storage-class.yml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: rabbitmq-storage-class
  labels:
    app: rabbitmq
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
and persistant-volume.yml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: rabbitmq-pv-claim
  labels:
    app: rabbitmq
spec:
  storageClassName: manual
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /bitnami
Logs:
# kubectl describe pods rabbitmq-5f7f787479-fpg6g
Name: rabbitmq-5f7f787479-fpg6g
Namespace: default
Priority: 0
Node: kube-worker-vm2/192.168.1.36
Start Time: Mon, 03 May 2021 12:29:17 +0100
Labels: app=rabbitmq
pod-template-hash=5f7f787479
role=master
tier=queue
Annotations: cni.projectcalico.org/podIP: 192.168.222.4/32
cni.projectcalico.org/podIPs: 192.168.222.4/32
Status: Running
IP: 192.168.222.4
IPs:
IP: 192.168.222.4
Controlled By: ReplicaSet/rabbitmq-5f7f787479
Containers:
rabbitmq:
Container ID: docker://bbdbb9c5d4b6737519d3dcf4bdda242a7fe904f2336334afe686e9b204fd6d5c
Image: bitnami/rabbitmq:3.7
Image ID: docker-pullable://bitnami/rabbitmq@sha256:8b6057997b74ebc81e934dd6c94e9da745635faa2d79b382cfda27b9176e0e6d
Ports: 5672/TCP, 15672/TCP
Host Ports: 0/TCP, 0/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Mon, 03 May 2021 12:30:48 +0100
Finished: Mon, 03 May 2021 12:30:48 +0100
Ready: False
Restart Count: 4
Liveness: exec [rabbitmqctl status] delay=120s timeout=5s period=10s #success=1 #failure=6
Readiness: exec [rabbitmqctl status] delay=10s timeout=3s period=5s #success=1 #failure=3
Environment Variables from:
bitnami-rabbitmq-config ConfigMap Optional: false
Environment: <none>
Mounts:
/bitnami from rabbitmq-storage (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-4qmxr (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
rabbitmq-storage:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: rabbitmq-pv-claim
ReadOnly: false
default-token-4qmxr:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-4qmxr
Optional: false
QoS Class: BestEffort
Node-Selectors: boardType=x86vm
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m20s default-scheduler Successfully assigned default/rabbitmq-5f7f787479-fpg6g to kube-worker-vm2
Normal Created 96s (x4 over 2m18s) kubelet Created container rabbitmq
Normal Started 95s (x4 over 2m17s) kubelet Started container rabbitmq
Warning BackOff 65s (x12 over 2m16s) kubelet Back-off restarting failed container
Normal Pulled 50s (x5 over 2m18s) kubelet Container image "bitnami/rabbitmq:3.7" already present on machine
When creating an image, the image creator often chooses to run the process as a user other than root. This is the case for your image, and that user does not have write permissions on the /bitnami directory. You can verify this by commenting out the volume.
To fix the issue, you need to set a security context for your pod: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod
Not sure about the exact syntax, but something like this should do the trick:
spec:
  securityContext:
    fsGroup: 1001 # the userid that is used in the image
  nodeSelector:
    boardType: x86vm
  containers:
    - name: rabbitmq
      image: bitnami/rabbitmq:3.7
      envFrom:
        - configMapRef:
            name: bitnami-rabbitmq-config
This makes the directory writeable by the user in the image.
Another thing: a Deployment is for stateless services by design. If you have state to keep, always use a StatefulSet. It's very similar to a Deployment from a configuration point of view, but Kubernetes treats it very differently. See https://www.youtube.com/watch?v=Vrxr-7rjkvM for a good explanation. A rough sketch is shown below.
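For illustration only, here is a minimal sketch of the same workload as a StatefulSet with a volumeClaimTemplate. It reuses the names, image, storage class, and fsGroup from the manifests and answer above; adjust probes, ports, and sizes to your setup:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: rabbitmq
spec:
  serviceName: message-queue
  replicas: 1
  selector:
    matchLabels:
      app: rabbitmq
  template:
    metadata:
      labels:
        app: rabbitmq
    spec:
      securityContext:
        fsGroup: 1001 # assumed non-root user of the Bitnami image, as above
      containers:
        - name: rabbitmq
          image: bitnami/rabbitmq:3.7
          envFrom:
            - configMapRef:
                name: bitnami-rabbitmq-config
          ports:
            - name: queue
              containerPort: 5672
          volumeMounts:
            - name: rabbitmq-storage
              mountPath: /bitnami
  volumeClaimTemplates:
    - metadata:
        name: rabbitmq-storage
      spec:
        storageClassName: manual
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi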
As per the Bitnami documentation, it depends on the Kubernetes distribution.
Quote from the documentation:
Adjust permissions of persistent volume mountpoint
As the image run as non-root by default, it is necessary to adjust the ownership of the persistent volume so that the container can write data into it.
By default, the chart is configured to use Kubernetes Security Context to automatically change the ownership of the volume. However, this feature does not work in all Kubernetes distributions. As an alternative, this chart supports using an initContainer to change the ownership of the volume before mounting it in the final destination.
You can enable this initContainer by setting volumePermissions.enabled to true.
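With the Bitnami Helm chart, that roughly corresponds to a values snippet like the following (a sketch; check the chart's values.yaml for the exact key):

volumePermissions:
  enabled: true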
I am trying to use persistent volume claims and am facing this issue.
This is my postgres-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      component: postgres
  template:
    metadata:
      labels:
        component: postgres
    spec:
      volumes:
        - name: postgres-storage
          persistentVolumeClaim:
            claimName: database-persistent-volume-claim
      containers:
        - name: postgres
          image: postgres
          ports:
            - containerPort: 5432
          volumeMounts:
            - mountPath: /var/lib/postgresql/data
              name: postgres-storage
              subPath: postgres
When I debug the pod using describe:
kubectl describe pod postgres-deployment-8576df7bfc-8mp5t
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 3m4s default-scheduler Successfully assigned default/postgres-deployment-8576df7bfc-8mp5t to docker-desktop
Normal Pulled 67s (x8 over 2m58s) kubelet, docker-desktop Successfully pulled image "postgres"
Warning Failed 67s (x8 over 2m58s) kubelet, docker-desktop Error: failed to prepare subPath for volumeMount "postgres-storage" of container "postgres"
Normal Pulling 53s (x9 over 3m3s) kubelet, docker-desktop Pulling image "postgres"
My pod shows this error:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
postgres-deployment-8576df7bfc-8mp5t 0/1 CreateContainerConfigError 0 5m5
I am not sure where the problem in the config is, but I am sure it is related to volumes, because the problem appeared after adding the volumes.
Remove subPath. Can you try the YAML below?
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      component: postgres
  template:
    metadata:
      labels:
        component: postgres
    spec:
      volumes:
        - name: postgres-storage
          persistentVolumeClaim:
            claimName: database-persistent-volume-claim
      containers:
        - name: postgres
          image: postgres
          ports:
            - containerPort: 5432
          volumeMounts:
            - mountPath: /var/lib/postgresql/data
              name: postgres-storage
I just deployed it and it works:
master $ kubectl get deploy
NAME READY UP-TO-DATE AVAILABLE AGE
postgres-deployment 1/1 1 1 4m13s
master $ kubectl get po
NAME READY STATUS RESTARTS AGE
postgres-deployment-6b66bdd748-5q76h 1/1 Running 0 4m13s