RabbitMQ Install - pod has unbound immediate PersistentVolumeClaims - kubernetes

I am trying to do a install of RabbitMQ in Kubernetes and following the entry on the RabbitMQ site https://www.rabbitmq.com/blog/2020/08/10/deploying-rabbitmq-to-kubernetes-whats-involved/.
Please note my CentOS 7 and Kubernetes 1.18. Also, I am not even sure this is the best way to deploy RabbitMQ, its the best documentation I could find though. I did find something that said that volumeClaimTemplates does not support NFS so I am wondering if that is the issue.
I have added the my Persistent Volume using NFS:
kind: PersistentVolume
apiVersion: v1
metadata:
name: rabbitmq-nfs-pv
namespace: ninegold-rabbitmq
spec:
capacity:
storage: 5Gi
accessModes:
- ReadWriteOnce
nfs:
path: /var/nfsshare
server: 192.168.1.241
persistentVolumeReclaimPolicy: Retain
It created the PV correctly.
[admin#centos-controller ~]$ kubectl get pv -n ninegold-rabbitmq
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
ninegold-platform-custom-config-br 1Gi RWX Retain Bound ninegold-platform/ninegold-db-pgbr-repo 22d
ninegold-platform-custom-config-pgadmin 1Gi RWX Retain Bound ninegold-platform/ninegold-db-pgadmin 21d
ninegold-platform-custom-config-pgdata 1Gi RWX Retain Bound ninegold-platform/ninegold-db 22d
rabbitmq-nfs-pv 5Gi RWO Retain Available 14h
I then add my StatefulSet.
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: rabbitmq
namespace: ninegold-rabbitmq
spec:
selector:
matchLabels:
app: "rabbitmq"
# headless service that gives network identity to the RMQ nodes, and enables them to cluster
serviceName: rabbitmq-headless # serviceName is the name of the service that governs this StatefulSet. This service must exist before the StatefulSet, and is responsible for the network identity of the set. Pods get DNS/hostnames that follow the pattern: pod-specific-string.serviceName.default.svc.cluster.local where "pod-specific-string" is managed by the StatefulSet controller.
volumeClaimTemplates:
- metadata:
name: rabbitmq-data
namespace: ninegold-rabbitmq
spec:
storageClassName: local-storage
accessModes:
- ReadWriteOnce
resources:
requests:
storage: "5Gi"
template:
metadata:
name: rabbitmq
namespace: ninegold-rabbitmq
labels:
app: rabbitmq
spec:
initContainers:
# Since k8s 1.9.4, config maps mount read-only volumes. Since the Docker image also writes to the config file,
# the file must be mounted as read-write. We use init containers to copy from the config map read-only
# path, to a read-write path
- name: "rabbitmq-config"
image: busybox:1.32.0
volumeMounts:
- name: rabbitmq-config
mountPath: /tmp/rabbitmq
- name: rabbitmq-config-rw
mountPath: /etc/rabbitmq
command:
- sh
- -c
# the newline is needed since the Docker image entrypoint scripts appends to the config file
- cp /tmp/rabbitmq/rabbitmq.conf /etc/rabbitmq/rabbitmq.conf && echo '' >> /etc/rabbitmq/rabbitmq.conf;
cp /tmp/rabbitmq/enabled_plugins /etc/rabbitmq/enabled_plugins
volumes:
- name: rabbitmq-config
configMap:
name: rabbitmq-config
optional: false
items:
- key: enabled_plugins
path: "enabled_plugins"
- key: rabbitmq.conf
path: "rabbitmq.conf"
# read-write volume into which to copy the rabbitmq.conf and enabled_plugins files
# this is needed since the docker image writes to the rabbitmq.conf file
# and Kubernetes Config Maps are mounted as read-only since Kubernetes 1.9.4
- name: rabbitmq-config-rw
emptyDir: {}
- name: rabbitmq-data
persistentVolumeClaim:
claimName: rabbitmq-data
serviceAccount: rabbitmq
# The Docker image runs as the `rabbitmq` user with uid 999
# and writes to the `rabbitmq.conf` file
# The security context is needed since the image needs
# permission to write to this file. Without the security
# context, `rabbitmq.conf` is owned by root and inaccessible
# by the `rabbitmq` user
securityContext:
fsGroup: 999
runAsUser: 999
runAsGroup: 999
containers:
- name: rabbitmq
# Community Docker Image
image: rabbitmq:latest
volumeMounts:
# mounting rabbitmq.conf and enabled_plugins
# this should have writeable access, this might be a problem
- name: rabbitmq-config-rw
mountPath: "/etc/rabbitmq"
# mountPath: "/etc/rabbitmq/conf.d/"
mountPath: "/var/lib/rabbitmq"
# rabbitmq data directory
- name: rabbitmq-data
mountPath: "/var/lib/rabbitmq/mnesia"
env:
- name: RABBITMQ_DEFAULT_PASS
valueFrom:
secretKeyRef:
name: rabbitmq-admin
key: pass
- name: RABBITMQ_DEFAULT_USER
valueFrom:
secretKeyRef:
name: rabbitmq-admin
key: user
- name: RABBITMQ_ERLANG_COOKIE
valueFrom:
secretKeyRef:
name: erlang-cookie
key: cookie
ports:
- name: amqp
containerPort: 5672
protocol: TCP
- name: management
containerPort: 15672
protocol: TCP
- name: prometheus
containerPort: 15692
protocol: TCP
- name: epmd
containerPort: 4369
protocol: TCP
livenessProbe:
exec:
# This is just an example. There is no "one true health check" but rather
# several rabbitmq-diagnostics commands that can be combined to form increasingly comprehensive
# and intrusive health checks.
# Learn more at https://www.rabbitmq.com/monitoring.html#health-checks.
#
# Stage 2 check:
command: ["rabbitmq-diagnostics", "status"]
initialDelaySeconds: 60
# See https://www.rabbitmq.com/monitoring.html for monitoring frequency recommendations.
periodSeconds: 60
timeoutSeconds: 15
readinessProbe: # probe to know when RMQ is ready to accept traffic
exec:
# This is just an example. There is no "one true health check" but rather
# several rabbitmq-diagnostics commands that can be combined to form increasingly comprehensive
# and intrusive health checks.
# Learn more at https://www.rabbitmq.com/monitoring.html#health-checks.
#
# Stage 1 check:
command: ["rabbitmq-diagnostics", "ping"]
initialDelaySeconds: 20
periodSeconds: 60
timeoutSeconds: 10
However my stateful set is not binding, I am getting the following error:
running "VolumeBinding" filter plugin for pod "rabbitmq-0": pod has unbound immediate PersistentVolumeClaims
The PVC did not correctly bind to the PV but stays in pending state.
[admin#centos-controller ~]$ kubectl get pvc -n ninegold-rabbitmq
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
rabbitmq-data-rabbitmq-0 Pending local-storage 14h
I have double checked the capacity, accessModes, I am not sure why this is not binding.
My example came from here https://github.com/rabbitmq/diy-kubernetes-examples/tree/master/gke, the only changes I have done is to bind my NFS volume.
Any help would be appreciated.

In your YAMLs I found some misconfigurations.
local-storage class.
I assume, you used Documentation example to create local-storage. There is information that:
Local volumes do not currently support dynamic provisioning, however a StorageClass should still be created to delay volume binding until Pod scheduling.
When you want to use volumeClaimTemplates, you will use Dynamic Provisioning. It's well explained in Medium article.
PV in StatefulSet
Specifically to the volume part, StatefulSet provides a key named as volumeClaimTemplates. With that, you can request the PVC from the storage class dynamically. As part of your new statefulset app definition, replace the volumes ... The PVC is named as volumeClaimTemplate name + pod-name + ordinal number.
As local-storage does not support dynamic provisioning, it will not work. You would need to use NFS storageclass with proper provisioner or create PV manually.
Also, when you are using volumeClaimTemplates for each pod it will create Pv and PVC. In addition PVC and PV are bounding in 1:1 relationship. For more details you can check this SO thread.
Error unbound immediate PersistentVolumeClaims
It means that dynamic provisioning didn't work as expected. If you would check kubectl get pv,pvc you would not see any new PV,PVC with name: volumeClaimTemplate name + pod-name + ordinal number.
claimName: rabbitmq-data
I assume, in this claim you wanted to mount it to PV created by volumeClaimTemplates but it was not created. Also PV would have name rabbitmq-data-rabbitmq-0 for first pod and rabbitmq-data-rabbitmq-1 for the second one.
As last part, this article - Kubernetes : NFS and Dynamic NFS provisioning might be helpful.

Related

0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims

As the documentation states:
For each VolumeClaimTemplate entry defined in a StatefulSet, each Pod
receives one PersistentVolumeClaim. In the nginx example above, each
Pod receives a single PersistentVolume with a StorageClass of
my-storage-class and 1 Gib of provisioned storage. If no StorageClass
is specified, then the default StorageClass will be used. When a Pod
is (re)scheduled onto a node, its volumeMounts mount the
PersistentVolumes associated with its PersistentVolume Claims. Note
that, the PersistentVolumes associated with the Pods' PersistentVolume
Claims are not deleted when the Pods, or StatefulSet are deleted. This
must be done manually.
The part I'm interested in is this: If no StorageClassis specified, then the default StorageClass will be used
I create a StatefulSet like this:
apiVersion: apps/v1
kind: StatefulSet
metadata:
namespace: ches
name: ches
spec:
serviceName: ches
replicas: 1
selector:
matchLabels:
app: ches
template:
metadata:
labels:
app: ches
spec:
serviceAccountName: ches-serviceaccount
nodeSelector:
ches-worker: "true"
volumes:
- name: data
hostPath:
path: /data/test
containers:
- name: ches
image: [here I have the repo]
imagePullPolicy: Always
securityContext:
privileged: true
args:
- server
- --console-address
- :9011
- /data
env:
- name: MINIO_ACCESS_KEY
valueFrom:
secretKeyRef:
name: ches-keys
key: access-key
- name: MINIO_SECRET_KEY
valueFrom:
secretKeyRef:
name: ches-keys
key: secret-key
ports:
- containerPort: 9000
hostPort: 9011
resources:
limits:
cpu: 100m
memory: 200Mi
volumeMounts:
- name: data
mountPath: /data
imagePullSecrets:
- name: edge-storage-token
volumeClaimTemplates:
- metadata:
name: data
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
Of course I have already created the secrets, imagePullSecrets etc and I have labeled the node as ches-worker.
When I apply the yaml file, the pod is in Pending status and kubectl describe pod ches-0 -n ches gives the following error:
Warning FailedScheduling 6s default-scheduler 0/1 nodes are
available: 1 pod has unbound immediate PersistentVolumeClaims.
preemption: 0/1 nodes are available: 1 Preemption is not helpful for
scheduling
Am I missing something here?
You need to create a PV in order to get a PVC bound. If you want the PVs automatically created from PVC claims you need a Provisioner installed in your Cluster.
First create a PV with at least the amout of space need by your PVC.
Then you can apply your deployment yaml which contains the PVC claim.
K3s when installed, also downloads a storage class which makes it as default.
Check with kubectl get storageclass:
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE
ALLOWVOLUMEEXPANSION AGE local-path rancher.io/local-path Delete
WaitForFirstConsumer false 8s
K8s cluster on the other hand, does not download also a default storage class.
In order to solve the problem:
Download rancher.io/local-path storage class:
kubectl apply -f
https://raw.githubusercontent.com/rancher/local-path-provisioner/master/deploy/local-path-storage.yaml
Check with kubectl get storageclass
Make this storage class (local-path) the default:
kubectl patch
storageclass local-path -p '{"metadata":
{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

Kubernetes pod fails with "Unable to attach or mount volumes"

After Kubernetes upgrade from 1.18.13 to 1.19.5 I get error bellow for some pods randomly. After some time pod fails to start(it's a simple pod, doesn't belong to deployment)
Warning FailedMount 99s kubelet Unable to attach or mount volumes: unmounted volumes=[red-tmp data logs docker src red-conf], unattached volumes=[red-tmp data logs docker src red-conf]: timed out waiting for the condition
On 1.18 we don't have such issue, also during upgrade K8S doesn't show any errors or incompatibility messages.
No additional logs from any other K8S components(tried to increase verbosity level for kubelet)
Everything is fine with disk space and other host's metrics like LA, RAM
No network storages, only local data
PV and PVC are created before pods and we don't change them
Tried to use higher K8S versions but no luck
We have pretty standard setup without any special customizations:
CNI: Flannel
CRI: Docker
Only one node as master and worker
16 cores and 32G RAM
Example of pod definition:
apiVersion: v1
kind: Pod
metadata:
labels:
app: provision
ver: latest
name: provision
namespace: red
spec:
containers:
- args:
- wait
command:
- provision.sh
image: app-tests
imagePullPolicy: IfNotPresent
name: provision
volumeMounts:
- mountPath: /opt/app/be
name: src
- mountPath: /opt/app/be/conf
name: red-conf
- mountPath: /opt/app/be/tmp
name: red-tmp
- mountPath: /var/lib/app
name: data
- mountPath: /var/log/app
name: logs
- mountPath: /var/run/docker.sock
name: docker
dnsConfig:
options:
- name: ndots
value: "2"
dnsPolicy: ClusterFirst
enableServiceLinks: false
restartPolicy: Never
volumes:
- hostPath:
path: /opt/agent/projects/app-backend
type: Directory
name: src
- name: red-conf
persistentVolumeClaim:
claimName: conf
- name: red-tmp
persistentVolumeClaim:
claimName: tmp
- name: data
persistentVolumeClaim:
claimName: data
- name: logs
persistentVolumeClaim:
claimName: logs
- hostPath:
path: /var/run/docker.sock
type: Socket
name: docker
PV
apiVersion: v1
kind: PersistentVolume
metadata:
name: red-conf
labels:
namespace: red
spec:
accessModes:
- ReadWriteMany
capacity:
storage: 2Gi
hostPath:
path: /var/lib/docker/k8s/red-conf
persistentVolumeReclaimPolicy: Retain
storageClassName: red-conf
volumeMode: Filesystem
PVC
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: conf
namespace: red
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 2Gi
storageClassName: red-conf
volumeMode: Filesystem
volumeName: red-conf
tmp data logs pv have the same setup as conf beside path. They have separate folders:
/var/lib/docker/k8s/red-tmp
/var/lib/docker/k8s/red-data
/var/lib/docker/k8s/red-logs
Currently I don't have any clues how to diagnose the issue :(
Would be glad to get advice. Thanks in advance.
you must be using local volumes. Follow the below link to understand how to create storage class, pv's and pvc when you use local volumes.
https://kubernetes.io/blog/2019/04/04/kubernetes-1.14-local-persistent-volumes-ga/
I recommend you to start troubleshooting by reviewing the VolumeAttachment events against what node has tied the PV, perhaps your volume is still linked to a node that was in evicted condition and was replaced by a new one.
You can use this command to check your PV name and status:
kubectl get pv
And then, to review what node has the correct volumeattachment, you can use the following command:
kubectl get volumeattachment
Once you get the name of your PV and at what node it is attached, then you will be able to see if the PV is tied to the correct node or if maybe it is tied to a previous node that is not working or was removed. The node gets evicted and scheduled into a new available node from the pool; to know what nodes are ready and running, you can use this command:
kubectl get nodes
If you detect that your PV is tied to the node that no longer exists, you will need to delete the VolumeAttachment with the following command:
kubectl delete volumeattachment [csi-volumeattachment_name]
If you need to review a detailed guide for this troubleshooting, you can follow this link.

Kubernetes fsGroup not changing file ownership on PersistentVolume

On the host, everything in the mounted directory (/opt/testpod) is owned by uid=0 gid=0. I need those files to be owned by whatever the container decides, i.e. a different gid, to be able to write there. Resources I'm testing with:
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: pv
labels:
name: pv
spec:
storageClassName: manual
capacity:
storage: 10Mi
accessModes:
- ReadWriteOnce
hostPath:
path: "/opt/testpod"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: pvc
spec:
storageClassName: manual
selector:
matchLabels:
name: pv
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Mi
---
apiVersion: v1
kind: Pod
metadata:
name: testpod
spec:
nodeSelector:
foo: bar
securityContext:
runAsUser: 500
runAsGroup: 500
fsGroup: 500
volumes:
- name: vol
persistentVolumeClaim:
claimName: pvc
containers:
- name: testpod
image: busybox
command: [ "sh", "-c", "sleep 1h" ]
volumeMounts:
- name: vol
mountPath: /data
After the pod is running, I kubectl exec into it and ls -la /data shows everything still owned by gid=0. According to some Kuber docs, fsGroup is supposed to chown everything on the pod start but it doesn't happen. What am I doing wrong please?
The hostpath type PV doesn't support security context. You have to be root for the volume to be written in. It is described well in this github issue and this docs about hostPath:
The directories created on the underlying hosts are only writable by root. You either need to run your process as root in a privileged
container or modify the file permissions on the host to be able to write to a
hostPath volume
You may also want to check this github request describing why changing permission of host directory is dangerous.
The workaround people describe that it appears to be working is to grant your user sudo privileges but that actually makes the idea of running container as non root user useless.
Security context appears to be working well with emptyDir volume (described well in the k8s docs here)

Statefulset with replicas : 1 pod has unbound immediate PersistentVolumeClaims

I'm trying to setup , in my single node cluster (Docker Desktop Windows), an elastic cluster.
For this, I have created the PV as followed (working)
apiVersion: v1
kind: PersistentVolume
metadata:
name: elastic-pv-data
labels:
type: local
spec:
storageClassName: elasticdata
accessModes:
- ReadWriteOnce
capacity:
storage: 20Gi
hostPath:
path: "/mnt/data/elastic"
Then here is the configuration :
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: esnode
spec:
selector:
matchLabels:
app: es-cluster # has to match .spec.template.metadata.labels
serviceName: elasticsearch
replicas: 2
updateStrategy:
type: RollingUpdate
template:
metadata:
labels:
app: es-cluster
spec:
securityContext:
fsGroup: 1000
initContainers:
- name: init-sysctl
image: busybox
imagePullPolicy: IfNotPresent
securityContext:
privileged: true
command: ["sysctl", "-w", "vm.max_map_count=262144"]
containers:
- name: elasticsearch
resources:
requests:
memory: 1Gi
securityContext:
privileged: true
runAsUser: 1000
capabilities:
add:
- IPC_LOCK
- SYS_RESOURCE
image: docker.elastic.co/elasticsearch/elasticsearch-oss:7.7.1
env:
- name: ES_JAVA_OPTS
valueFrom:
configMapKeyRef:
name: es-config
key: ES_JAVA_OPTS
readinessProbe:
httpGet:
scheme: HTTP
path: /_cluster/health?local=true
port: 9200
initialDelaySeconds: 5
ports:
- containerPort: 9200
name: es-http
- containerPort: 9300
name: es-transport
volumeMounts:
- name: es-data
mountPath: /usr/share/elasticsearch/data
volumeClaimTemplates:
- metadata:
name: es-data
spec:
storageClassName: elasticdata
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 3Gi
And the result is only one "pod" has its pvc binded to the pv, the other one gets an error loop "0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims".
Here is the kubectl get pv,pvc result :
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
persistentvolume/elastic-pv-data 20Gi RWO Retain Bound default/es-data-esnode-0 elasticdata 14m
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/es-data-esnode-0 Bound elastic-pv-data 20Gi RWO elasticdata 13m
If I undestood correctly, I should have a second persistantolumeclaim with the following identifier : es-data-esnode-1
Is there something I miss or do not understand correctly ?
Thanks for your help
I skip here the non relevant parts (configmap,loadbalancer,..)
Let me add a few details to what was already said both in comments and in Jonas's answer.
Inferring from the comments, you've not defined a StorageClass named elasticdata. If it doesn't exist, you cannot reference it in your PV and PVC.
Take a quick look at how hostPath is used to define a PersistentVolume and how it is referenced in a PersistentVolumeClaim. Here you can see that in the example storageClassName: manual is used. Kubernetes docs doesn't say it explicitely but if you take a look at Openshift docs, it says very clearly that:
A Pod that uses a hostPath volume must be referenced by manual
(static) provisioning.
It's not just some value used to bind PVC request to this specific PV. So if the elasticdata StorageClass hasn't been defined, you should't use it here.
Second thing. As Jonas already stated in his comment, there is one-to-one binding between PVC and PV so no matter that your PV still has enough capacity, it has been already claimed by a different PVC and is not available any more. As you can read in the official docs:
A PVC to PV binding is a one-to-one mapping, using a ClaimRef which is
a bi-directional binding between the PersistentVolume and the
PersistentVolumeClaim.
Claims will remain unbound indefinitely if a matching volume does not
exist. Claims will be bound as matching volumes become available. For
example, a cluster provisioned with many 50Gi PVs would not match a
PVC requesting 100Gi. The PVC can be bound when a 100Gi PV is added to
the cluster.
And vice versa. If there is just one 100Gi PV it won't be able to safisfy a request from two PVCs claiming for 50Gi each. Note that in the result of kubectl get pv,pvc you posted, both PV and PVC have capacity of 20Gi although you request in each PVC created from PVC template only 3Gi.
You don't work here with any dynamic storage provisioner so you need to provide manually as many PersistentVolumes as needed for your use case.
By the way, instead of using hostPath I would rather recommend you using local volume with properly defined StorageClass. It has a few advantages over HostPath. Additionally an external static provisioner can be run separately for improved management of the local volume lifecycle
When using a StatefulSet with volumeClaimTemplates, it will create a PersistentVolumeClaim for each replica. So if you use replicas: 2, two different PersistentVolumeClaims will be created, es-data-esnode-0 and es-data-esnode-1.
Each PersistentVolumeClaim will bound to an unique PersistentVolume, so in the case of two PVCs, you would need two different PersistentVolumes. But this is not easy to do using volumeClaimTemplate and hostPath-volumes in a desktop setup.
For what reasons do you need replicas: 2 in this case? It is usually used to provide better availability, e.g. using more than one node. But for a local setup in a desktop environment, usually a single replica on the single node should be fine? I think the easies solution for you is to use replicas: 1.

Kubernetes - For Scale, pod is pending when attached the persistent volumes while scaling the pod (GKE)

I have created a deployment in the xyz-namespace namespace, it has PVC. I can create the deployment and able to access it. It is working properly but while scale the deployment from the Kubernetes console then the pod is pending state only.
persistent_claim:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: jenkins
spec:
accessModes:
- ReadWriteOnce
storageClassName: standard
resources:
requests:
storage: 5Gi
namespace: xyz-namespace
and deployment object is like below.
apiVersion: apps/v1
kind: Deployment
metadata:
name: db-service
labels:
k8s-app: db-service
Name:db-service
ServiceName: db-service
spec:
selector:
matchLabels:
tier: data
Name: db-service
ServiceName: db-service
strategy:
type: Recreate
template:
metadata:
labels:
app: jenkins
tier: data
Name: db-service
ServiceName: db-service
spec:
hostname: jenkins
initContainers:
- command:
- "/bin/sh"
- "-c"
- chown -R 1000:1000 /var/jenkins_home
image: busybox
imagePullPolicy: Always
name: jenkins-init
volumeMounts:
- name: jenkinsvol
mountPath: "/var/jenkins_home"
containers:
- image: jenkins/jenkins:lts
name: jenkins
ports:
- containerPort: 8080
name: jenkins1
- containerPort: 8080
name: jenkins2
volumeMounts:
- name: jenkinsvol
mountPath: "/var/jenkins_home"
volumes:
- name: jenkinsvol
persistentVolumeClaim:
claimName: jenkins
nodeSelector:
nodegroup: xyz-testing
namespace: xyz-namespace
replicas: 1
Deployment is created fine and working as well but
When I am trying to Scale the deployment from console then the pod is getting stuck and it's pending state only.
If I removed the persistent volume and then scaled it then it is working fine, but with persistent volume, it is not working.
When using standard storage class I assume you are using the default GCEPersisentDisk Volume PlugIn. In this case you cannot set them at all as they are already set by the storage provider (GCP in your case, as you are using GCE perisistent disks), these disks only support ReadWriteOnce(RWO) and ReadOnlyMany (ROX) access modes. If you try to create a ReadWriteMany(RWX) PV that will never come in a success state (your case when set the PVC with accessModes: ReadWriteMany).
Also if any pod tries to attach a ReadWriteOnce volume on some other node, you’ll get following error:
FailedMount Failed to attach volume "pv0001" on node "xyz" with: googleapi: Error 400: The disk resource 'abc' is already being used by 'xyz'
References from above on this article
As mentioned here and here, NFS is the easiest way to get ReadWriteMany as all nodes need to be able to ReadWriteMany to the storage device you are using for your pods.
Then I would suggest you to use an NFS storage option. In case you want to test it, here is a good guide by Google using its Filestore solution which are fully managed NFS file servers.
Your PersistentVolumeClaim is set to:
accessModes:
- ReadWriteOnce
But it should be set to:
accessModes:
- ReadWriteMany
The ReadWriteOnce access mode means, that
the volume can be mounted as read-write by a single node [1].
When you scale your deployment it's most likely scaled to different nodes, therefore you need ReadWriteMany.
[1] https://kubernetes.io/docs/concepts/storage/persistent-volumes/