Statefulset with replicas : 1 pod has unbound immediate PersistentVolumeClaims - kubernetes

I'm trying to set up an Elasticsearch cluster in my single-node cluster (Docker Desktop on Windows).
For this, I have created the PV as follows (this part works):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: elastic-pv-data
  labels:
    type: local
spec:
  storageClassName: elasticdata
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 20Gi
  hostPath:
    path: "/mnt/data/elastic"
Then here is the StatefulSet configuration:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: esnode
spec:
  selector:
    matchLabels:
      app: es-cluster # has to match .spec.template.metadata.labels
  serviceName: elasticsearch
  replicas: 2
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: es-cluster
    spec:
      securityContext:
        fsGroup: 1000
      initContainers:
        - name: init-sysctl
          image: busybox
          imagePullPolicy: IfNotPresent
          securityContext:
            privileged: true
          command: ["sysctl", "-w", "vm.max_map_count=262144"]
      containers:
        - name: elasticsearch
          resources:
            requests:
              memory: 1Gi
          securityContext:
            privileged: true
            runAsUser: 1000
            capabilities:
              add:
                - IPC_LOCK
                - SYS_RESOURCE
          image: docker.elastic.co/elasticsearch/elasticsearch-oss:7.7.1
          env:
            - name: ES_JAVA_OPTS
              valueFrom:
                configMapKeyRef:
                  name: es-config
                  key: ES_JAVA_OPTS
          readinessProbe:
            httpGet:
              scheme: HTTP
              path: /_cluster/health?local=true
              port: 9200
            initialDelaySeconds: 5
          ports:
            - containerPort: 9200
              name: es-http
            - containerPort: 9300
              name: es-transport
          volumeMounts:
            - name: es-data
              mountPath: /usr/share/elasticsearch/data
  volumeClaimTemplates:
    - metadata:
        name: es-data
      spec:
        storageClassName: elasticdata
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 3Gi
As a result, only one pod has its PVC bound to the PV; the other one loops with the error "0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims".
Here is the kubectl get pv,pvc result:
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
persistentvolume/elastic-pv-data 20Gi RWO Retain Bound default/es-data-esnode-0 elasticdata 14m
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
persistentvolumeclaim/es-data-esnode-0 Bound elastic-pv-data 20Gi RWO elasticdata 13m
If I understood correctly, I should have a second PersistentVolumeClaim with the identifier es-data-esnode-1.
Is there something I'm missing or misunderstanding?
Thanks for your help.
I've omitted the irrelevant parts here (ConfigMap, LoadBalancer, ...).

Let me add a few details to what was already said both in comments and in Jonas's answer.
Inferring from the comments, you've not defined a StorageClass named elasticdata. If it doesn't exist, you cannot reference it in your PV and PVC.
Take a quick look at how hostPath is used to define a PersistentVolume and how it is referenced in a PersistentVolumeClaim. There you can see that the example uses storageClassName: manual. The Kubernetes docs don't say it explicitly, but if you take a look at the OpenShift docs, they state very clearly that:
A Pod that uses a hostPath volume must be referenced by manual
(static) provisioning.
It's not just some arbitrary value used to bind the PVC request to this specific PV. So if the elasticdata StorageClass hasn't been defined, you shouldn't use it here.
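If you do want to keep the elasticdata name, a minimal sketch of defining it as a static (no-provisioner) StorageClass could look like this (my assumption, mirroring the local-storage example further down; with no provisioner you still have to create the PVs by hand):
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: elasticdata
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer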
Second thing. As Jonas already stated in his comment, there is one-to-one binding between a PVC and a PV, so even though your PV still has enough capacity, it has already been claimed by a different PVC and is not available any more. As you can read in the official docs:
A PVC to PV binding is a one-to-one mapping, using a ClaimRef which is
a bi-directional binding between the PersistentVolume and the
PersistentVolumeClaim.
Claims will remain unbound indefinitely if a matching volume does not
exist. Claims will be bound as matching volumes become available. For
example, a cluster provisioned with many 50Gi PVs would not match a
PVC requesting 100Gi. The PVC can be bound when a 100Gi PV is added to
the cluster.
And vice versa: if there is just one 100Gi PV, it won't be able to satisfy a request from two PVCs claiming 50Gi each. Note that in the kubectl get pv,pvc result you posted, both the PV and the PVC have a capacity of 20Gi, although each PVC created from the PVC template requests only 3Gi.
You're not working with any dynamic storage provisioner here, so you need to manually provide as many PersistentVolumes as your use case needs.
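For example, a second PV that the es-data-esnode-1 claim could bind to might look like this (a sketch based on your original PV; the name and the separate data directory are my choices):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: elastic-pv-data-1      # arbitrary name for the second volume
  labels:
    type: local
spec:
  storageClassName: elasticdata
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 20Gi
  hostPath:
    path: "/mnt/data/elastic-1"   # a separate directory so the two replicas don't share data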
By the way, instead of using hostPath I would rather recommend using a local volume with a properly defined StorageClass. It has a few advantages over hostPath. Additionally, an external static provisioner can be run separately for improved management of the local volume lifecycle.

When using a StatefulSet with volumeClaimTemplates, it will create a PersistentVolumeClaim for each replica. So if you use replicas: 2, two different PersistentVolumeClaims will be created, es-data-esnode-0 and es-data-esnode-1.
Each PersistentVolumeClaim will bind to a unique PersistentVolume, so in the case of two PVCs, you would need two different PersistentVolumes. But this is not easy to do using volumeClaimTemplates and hostPath volumes in a desktop setup.
For what reason do you need replicas: 2 in this case? It is usually used to provide better availability, e.g. using more than one node. But for a local setup in a desktop environment, a single replica on the single node should be fine. I think the easiest solution for you is to use replicas: 1.


0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims

As the documentation states:
For each VolumeClaimTemplate entry defined in a StatefulSet, each Pod
receives one PersistentVolumeClaim. In the nginx example above, each
Pod receives a single PersistentVolume with a StorageClass of
my-storage-class and 1 Gib of provisioned storage. If no StorageClass
is specified, then the default StorageClass will be used. When a Pod
is (re)scheduled onto a node, its volumeMounts mount the
PersistentVolumes associated with its PersistentVolume Claims. Note
that, the PersistentVolumes associated with the Pods' PersistentVolume
Claims are not deleted when the Pods, or StatefulSet are deleted. This
must be done manually.
The part I'm interested in is this: If no StorageClass is specified, then the default StorageClass will be used.
I create a StatefulSet like this:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  namespace: ches
  name: ches
spec:
  serviceName: ches
  replicas: 1
  selector:
    matchLabels:
      app: ches
  template:
    metadata:
      labels:
        app: ches
    spec:
      serviceAccountName: ches-serviceaccount
      nodeSelector:
        ches-worker: "true"
      volumes:
        - name: data
          hostPath:
            path: /data/test
      containers:
        - name: ches
          image: [here I have the repo]
          imagePullPolicy: Always
          securityContext:
            privileged: true
          args:
            - server
            - --console-address
            - :9011
            - /data
          env:
            - name: MINIO_ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  name: ches-keys
                  key: access-key
            - name: MINIO_SECRET_KEY
              valueFrom:
                secretKeyRef:
                  name: ches-keys
                  key: secret-key
          ports:
            - containerPort: 9000
              hostPort: 9011
          resources:
            limits:
              cpu: 100m
              memory: 200Mi
          volumeMounts:
            - name: data
              mountPath: /data
      imagePullSecrets:
        - name: edge-storage-token
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
Of course I have already created the secrets, imagePullSecrets, etc., and I have labeled the node with ches-worker=true.
When I apply the YAML file, the pod stays in Pending status and kubectl describe pod ches-0 -n ches gives the following error:
Warning  FailedScheduling  6s  default-scheduler  0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling
Am I missing something here?
You need to create a PV in order to get a PVC bound. If you want PVs to be created automatically from PVCs, you need a provisioner installed in your cluster.
First create a PV with at least the amount of space needed by your PVC.
Then you can apply your deployment YAML, which contains the PVC.
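For this question's setup, a minimal sketch of such a PV might look like this (the name and host path are placeholders I chose; since the claim template specifies no storageClassName, this assumes no default StorageClass is set, so the classless claim can match a classless PV):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ches-data-pv            # placeholder name
spec:
  capacity:
    storage: 1Gi                # at least the 1Gi requested by the claim template
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /data/ches            # placeholder host directory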
K3s, when installed, also deploys a StorageClass and makes it the default.
Check with kubectl get storageclass:
NAME         PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-path   rancher.io/local-path   Delete          WaitForFirstConsumer   false                  8s
A vanilla K8s cluster, on the other hand, does not ship with a default storage class.
In order to solve the problem:
Download rancher.io/local-path storage class:
kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/master/deploy/local-path-storage.yaml
Check with kubectl get storageclass
Make this storage class (local-path) the default:
kubectl patch storageclass local-path -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

Error: `pod has unbound immediate PersistentVolumeClaims` while deploying microservices on local machine

When I try to deploy my microservices locally, I get an error regarding volumes. I've trimmed down all other configs and provided only the troubling portion here.
Persistent Volume:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: service-1-db-pv
spec:
  capacity:
    storage: 250Mi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  storageClassName: ''
  hostPath:
    path: /mnt/wsl/service-1-pv
    type: DirectoryOrCreate
Persistent Volume Claim:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: service-1-db-pvc
spec:
  volumeName: service-1-db-pv
  resources:
    requests:
      storage: 250Mi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  storageClassName: ''
Service:
apiVersion: v1
kind: Service
metadata:
  name: service-service-1-db
spec:
  selector:
    app: service-1-db
  ports:
    - protocol: TCP
      port: 27017
      targetPort: 27017
Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment-service-1-db
spec:
  selector:
    matchLabels:
      app: service-1-db
  template:
    metadata:
      labels:
        app: service-1-db
    spec:
      containers:
        - name: service-1-db
          image: mongo:latest
          volumeMounts:
            - name: service-1-db-volume
              mountPath: /data/db
          resources:
            requests:
              cpu: 250m
              memory: 128Mi
            limits:
              cpu: 1000m
              memory: 256Mi
      volumes:
        - name: service-1-db-volume
          persistentVolumeClaim:
            claimName: service-1-db-pvc
When I try to run skaffold run --tail, I get the following output:
Starting deploy...
- persistentvolume/service-1-db-pv created
- persistentvolumeclaim/service-1-db-pvc created
- service/service-service-1-db created
- deployment.apps/deployment-service-1-db created
Waiting for deployments to stabilize...
- deployment/deployment-service-1-db: 0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.
- pod/deployment-service-1-db-6f9b896485-mv8qx: 0/1 nodes are available: 1 pod has unbound immediate PersistentVolumeClaims.
- deployment/deployment-service-1-db is ready.
Deployments stabilized in 22.23 seconds
I can't figure out what went wrong. Followed this and this.
The "pod has unbound PVC" suggests that the PersistentVolumeClaim associated with your Pods are ... not bound. Meaning your volume provisioner is probably still waiting for a confirmation the corresponding volume was created, before marking your PVC as bound.
Considering your last log mentions deployment being ready, then there isn't much to worry about.
One thing you could look for is your StorageClass VolumeBindingMode:
If "Immediate", then your provisioner would try to create a PV as soon as your register a new PVC.
If "OnDemand", then Kubernetes would wait for a Pod to try (and fail) attaching your volume once, and only then the PV creation process would start.
Although if you're creating both your PVC and Deployment relatively simultaneously, this won't change much.
There's nothing critical here. Although if such errors persist, maybe something is wrong either with your volume provisioner, or even more likely with your storage provider. E.g. with Ceph, when you're missing monitors, you won't be able to create new volumes, though you may still read/write existing ones.
edit, answering your comment:
There isn't much that can be done.
First: make sure your StorageClass VolumeBindingMode is set to Immediate -- otherwise, there won't be any provisioning before you create a Pod attaching that volume.
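A quick way to check the binding mode per class is kubectl's custom-columns output (class names will of course vary per cluster):
kubectl get storageclass -o custom-columns='NAME:.metadata.name,BINDINGMODE:.volumeBindingMode'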
Next, you can look into the Operator SDK, or anything that can query the API (Ansible, Python, ... a shell script), so that you may implement something that waits for your PVC status to confirm provisioning succeeded.
Then again, there's no guarantee your deployment will always be applied to clusters that offer Immediate volume binding. And there's nothing wrong with WaitForFirstConsumer -- on larger clusters, with lots of users that don't necessarily clean up objects, ... it's not unusual.
Those events you mention arguably are not errors, even with Immediate binding. It's perfectly normal for the Pod controller to wait for volumes to be properly registered and ready to use.

RabbitMQ Install - pod has unbound immediate PersistentVolumeClaims

I am trying to do an install of RabbitMQ in Kubernetes, following the entry on the RabbitMQ site https://www.rabbitmq.com/blog/2020/08/10/deploying-rabbitmq-to-kubernetes-whats-involved/.
Please note I'm on CentOS 7 and Kubernetes 1.18. Also, I am not even sure this is the best way to deploy RabbitMQ; it's the best documentation I could find, though. I did find something that said volumeClaimTemplates does not support NFS, so I am wondering if that is the issue.
I have added my PersistentVolume using NFS:
kind: PersistentVolume
apiVersion: v1
metadata:
  name: rabbitmq-nfs-pv
  namespace: ninegold-rabbitmq
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  nfs:
    path: /var/nfsshare
    server: 192.168.1.241
  persistentVolumeReclaimPolicy: Retain
It created the PV correctly.
[admin#centos-controller ~]$ kubectl get pv -n ninegold-rabbitmq
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
ninegold-platform-custom-config-br 1Gi RWX Retain Bound ninegold-platform/ninegold-db-pgbr-repo 22d
ninegold-platform-custom-config-pgadmin 1Gi RWX Retain Bound ninegold-platform/ninegold-db-pgadmin 21d
ninegold-platform-custom-config-pgdata 1Gi RWX Retain Bound ninegold-platform/ninegold-db 22d
rabbitmq-nfs-pv 5Gi RWO Retain Available 14h
I then add my StatefulSet.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: rabbitmq
  namespace: ninegold-rabbitmq
spec:
  selector:
    matchLabels:
      app: "rabbitmq"
  # headless service that gives network identity to the RMQ nodes, and enables them to cluster
  serviceName: rabbitmq-headless # serviceName is the name of the service that governs this StatefulSet. This service must exist before the StatefulSet, and is responsible for the network identity of the set. Pods get DNS/hostnames that follow the pattern: pod-specific-string.serviceName.default.svc.cluster.local where "pod-specific-string" is managed by the StatefulSet controller.
  volumeClaimTemplates:
    - metadata:
        name: rabbitmq-data
        namespace: ninegold-rabbitmq
      spec:
        storageClassName: local-storage
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: "5Gi"
  template:
    metadata:
      name: rabbitmq
      namespace: ninegold-rabbitmq
      labels:
        app: rabbitmq
    spec:
      initContainers:
        # Since k8s 1.9.4, config maps mount read-only volumes. Since the Docker image also writes to the config file,
        # the file must be mounted as read-write. We use init containers to copy from the config map read-only
        # path, to a read-write path
        - name: "rabbitmq-config"
          image: busybox:1.32.0
          volumeMounts:
            - name: rabbitmq-config
              mountPath: /tmp/rabbitmq
            - name: rabbitmq-config-rw
              mountPath: /etc/rabbitmq
          command:
            - sh
            - -c
            # the newline is needed since the Docker image entrypoint scripts appends to the config file
            - cp /tmp/rabbitmq/rabbitmq.conf /etc/rabbitmq/rabbitmq.conf && echo '' >> /etc/rabbitmq/rabbitmq.conf;
              cp /tmp/rabbitmq/enabled_plugins /etc/rabbitmq/enabled_plugins
      volumes:
        - name: rabbitmq-config
          configMap:
            name: rabbitmq-config
            optional: false
            items:
              - key: enabled_plugins
                path: "enabled_plugins"
              - key: rabbitmq.conf
                path: "rabbitmq.conf"
        # read-write volume into which to copy the rabbitmq.conf and enabled_plugins files
        # this is needed since the docker image writes to the rabbitmq.conf file
        # and Kubernetes Config Maps are mounted as read-only since Kubernetes 1.9.4
        - name: rabbitmq-config-rw
          emptyDir: {}
        - name: rabbitmq-data
          persistentVolumeClaim:
            claimName: rabbitmq-data
      serviceAccount: rabbitmq
      # The Docker image runs as the `rabbitmq` user with uid 999
      # and writes to the `rabbitmq.conf` file
      # The security context is needed since the image needs
      # permission to write to this file. Without the security
      # context, `rabbitmq.conf` is owned by root and inaccessible
      # by the `rabbitmq` user
      securityContext:
        fsGroup: 999
        runAsUser: 999
        runAsGroup: 999
      containers:
        - name: rabbitmq
          # Community Docker Image
          image: rabbitmq:latest
          volumeMounts:
            # mounting rabbitmq.conf and enabled_plugins
            # this should have writeable access, this might be a problem
            - name: rabbitmq-config-rw
              mountPath: "/etc/rabbitmq"
              # mountPath: "/etc/rabbitmq/conf.d/"
              mountPath: "/var/lib/rabbitmq"
            # rabbitmq data directory
            - name: rabbitmq-data
              mountPath: "/var/lib/rabbitmq/mnesia"
          env:
            - name: RABBITMQ_DEFAULT_PASS
              valueFrom:
                secretKeyRef:
                  name: rabbitmq-admin
                  key: pass
            - name: RABBITMQ_DEFAULT_USER
              valueFrom:
                secretKeyRef:
                  name: rabbitmq-admin
                  key: user
            - name: RABBITMQ_ERLANG_COOKIE
              valueFrom:
                secretKeyRef:
                  name: erlang-cookie
                  key: cookie
          ports:
            - name: amqp
              containerPort: 5672
              protocol: TCP
            - name: management
              containerPort: 15672
              protocol: TCP
            - name: prometheus
              containerPort: 15692
              protocol: TCP
            - name: epmd
              containerPort: 4369
              protocol: TCP
          livenessProbe:
            exec:
              # This is just an example. There is no "one true health check" but rather
              # several rabbitmq-diagnostics commands that can be combined to form increasingly comprehensive
              # and intrusive health checks.
              # Learn more at https://www.rabbitmq.com/monitoring.html#health-checks.
              #
              # Stage 2 check:
              command: ["rabbitmq-diagnostics", "status"]
            initialDelaySeconds: 60
            # See https://www.rabbitmq.com/monitoring.html for monitoring frequency recommendations.
            periodSeconds: 60
            timeoutSeconds: 15
          readinessProbe: # probe to know when RMQ is ready to accept traffic
            exec:
              # This is just an example. There is no "one true health check" but rather
              # several rabbitmq-diagnostics commands that can be combined to form increasingly comprehensive
              # and intrusive health checks.
              # Learn more at https://www.rabbitmq.com/monitoring.html#health-checks.
              #
              # Stage 1 check:
              command: ["rabbitmq-diagnostics", "ping"]
            initialDelaySeconds: 20
            periodSeconds: 60
            timeoutSeconds: 10
However, my StatefulSet is not binding; I am getting the following error:
running "VolumeBinding" filter plugin for pod "rabbitmq-0": pod has unbound immediate PersistentVolumeClaims
The PVC did not bind to the PV but stays in Pending state.
[admin#centos-controller ~]$ kubectl get pvc -n ninegold-rabbitmq
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
rabbitmq-data-rabbitmq-0 Pending local-storage 14h
I have double-checked the capacity and accessModes; I am not sure why this is not binding.
My example came from here https://github.com/rabbitmq/diy-kubernetes-examples/tree/master/gke; the only change I made was to bind my NFS volume.
Any help would be appreciated.
In your YAMLs I found some misconfigurations.
local-storage class.
I assume you used the documentation example to create local-storage. There it says:
Local volumes do not currently support dynamic provisioning, however a StorageClass should still be created to delay volume binding until Pod scheduling.
When you want to use volumeClaimTemplates, you will be using dynamic provisioning. It's well explained in this Medium article:
PV in StatefulSet
Specifically to the volume part, StatefulSet provides a key named as volumeClaimTemplates. With that, you can request the PVC from the storage class dynamically. As part of your new statefulset app definition, replace the volumes ... The PVC is named as volumeClaimTemplate name + pod-name + ordinal number.
As local-storage does not support dynamic provisioning, it will not work. You would need to use an NFS StorageClass with a proper provisioner, or create the PV manually.
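If you go the manual route, one possible sketch (my assumption, reusing your existing NFS volume) is to give the PV the same storageClassName that the claim template requests, so the statically created volume can satisfy the rabbitmq-data-rabbitmq-0 claim:
kind: PersistentVolume
apiVersion: v1
metadata:
  name: rabbitmq-nfs-pv
spec:
  storageClassName: local-storage   # must match the volumeClaimTemplates' storageClassName
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  nfs:
    path: /var/nfsshare
    server: 192.168.1.241
  persistentVolumeReclaimPolicy: Retain
Note that PersistentVolumes are cluster-scoped, so the namespace field in your original PV is ignored anyway.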
Also, when you use volumeClaimTemplates, it will create a PV and a PVC for each pod. In addition, PVC and PV bind in a 1:1 relationship. For more details you can check this SO thread.
Error unbound immediate PersistentVolumeClaims
It means that dynamic provisioning didn't work as expected. If you check kubectl get pv,pvc, you will not see any new PV or PVC named volumeClaimTemplate name + pod-name + ordinal number.
claimName: rabbitmq-data
I assume in this claim you wanted to mount the PV created by volumeClaimTemplates, but it was never created. Also, the PVC would be named rabbitmq-data-rabbitmq-0 for the first pod and rabbitmq-data-rabbitmq-1 for the second one.
Lastly, this article - Kubernetes: NFS and Dynamic NFS provisioning - might be helpful.

How can I mount the same persistent volume on multiple pods?

I have a three node GCE cluster and a single-pod GKE deployment with three replicas. I created the PV and PVC like so:
# Create a persistent volume for web content
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nginx-content
  labels:
    type: local
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadOnlyMany
  hostPath:
    path: "/usr/share/nginx/html"
---
# Request a persistent volume for web content
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: nginx-content-claim
  annotations:
    volume.alpha.kubernetes.io/storage-class: default
spec:
  accessModes: [ReadOnlyMany]
  resources:
    requests:
      storage: 5Gi
They are referenced in the container spec like so:
spec:
  containers:
    - image: launcher.gcr.io/google/nginx1
      name: nginx-container
      volumeMounts:
        - name: nginx-content
          mountPath: /usr/share/nginx/html
      ports:
        - containerPort: 80
  volumes:
    - name: nginx-content
      persistentVolumeClaim:
        claimName: nginx-content-claim
Even though I created the volumes as ReadOnlyMany, only one pod can mount the volume at any given time. The rest give "Error 400: RESOURCE_IN_USE_BY_ANOTHER_RESOURCE". How can I make it so all three replicas read the same web content from the same volume?
First I'd like to point out one fundamental discrepancy in your configuration. Note that when you use your PersistentVolumeClaim defined as in your example, you don't use your nginx-content PersistentVolume at all. You can easily verify it by running:
kubectl get pv
on your GKE cluster. You'll notice that apart from your manually created nginx-content PV, there is another one, which was automatically provisioned based on the PVC that you applied.
Note that in your PersistentVolumeClaim definition you're explicitly referring to the default storage class, which has nothing to do with your manually created PV. Actually, even if you completely omit the annotation:
annotations:
  volume.alpha.kubernetes.io/storage-class: default
it will work exactly the same way, namely the default storage class will be used anyway. Using the default storage class on GKE means that GCE Persistent Disk will be used as your volume provisioner. You can read more about it here:
Volume implementations such as gcePersistentDisk are configured
through StorageClass resources. GKE creates a default StorageClass for
you which uses the standard persistent disk type (ext4). The default
StorageClass is used when a PersistentVolumeClaim doesn't specify a
StorageClassName. You can replace the provided default StorageClass
with your own.
But let's move on to the solution of the problem you're facing.
Solution:
First, I'd like to emphasize that you don't have to use any NFS-like filesystem to achieve your goal.
If you need your PersistentVolume to be available in ReadOnlyMany mode, GCE Persistent Disk is a perfect solution that entirely meets your requirements.
It can be mounted in ro mode by many Pods at the same time and, what is even more important, by Pods scheduled on different GKE nodes. Furthermore, it's really simple to configure and it works on GKE out of the box.
In case you want to use your storage in ReadWriteMany mode, I agree that something like NFS may be the only solution as GCE Persistent Disk doesn't provide such capability.
Let's take a closer look at how we can configure it.
We need to start by defining our PVC. You actually already did this step, but you got a bit lost in the further steps. Let me explain how it works.
The following configuration is correct (as I mentioned annotations section can be omitted):
# Request a persistent volume for web content
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: nginx-content-claim
spec:
  accessModes: [ReadOnlyMany]
  resources:
    requests:
      storage: 5Gi
However I'd like to add one important comment to this. You said:
Even though I created the volumes as ReadOnlyMany, only one pod can
mount the volume at any given time.
Well, actually you didn't. I know it may seem a bit tricky and somewhat surprising, but this is not how defining accessModes really works. In fact, it's a widely misunderstood concept. First of all, you cannot define access modes in a PVC in the sense of putting constraints there. Supported access modes are an inherent feature of a particular storage type; they are already defined by the storage provider.
What you actually do in PVC definition is requesting a PV that supports the particular access mode or access modes. Note that it's in a form of a list which means you may provide many different access modes that you want your PV to support.
Basically it's like saying: "Hey! Storage provider! Give me a volume that supports ReadOnlyMany mode." You're asking this way for a storage that will satisfy your requirements. Keep in mind however that you can be given more than you ask. And this is also our scenario when asking for a PV that supports ReadOnlyMany mode in GCP. It creates for us a PersistentVolume which meets our requirements we listed in accessModes section but it also supports ReadWriteOnce mode. Although we didn't ask for something that also supports ReadWriteOnce you will probably agree with me that storage which has a built-in support for those two modes fully satisfies our request for something that supports ReadOnlyMany. So basically this is the way it works.
Your PV that was automatically provisioned by GCP in response to your PVC supports those two accessModes, and if you don't specify explicitly in your Pod or Deployment definition that you want to mount it in read-only mode, by default it is mounted read-write.
You can easily verify it by attaching to the Pod that was able to successfully mount the PersistentVolume:
kubectl exec -ti pod-name -- /bin/bash
and trying to write something on the mounted filesystem.
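For example (a quick check; pod-name is whichever Pod successfully mounted the volume, and the test file name is arbitrary):
kubectl exec -ti pod-name -- touch /usr/share/nginx/html/test-file
If the volume is mounted read-write, this succeeds; on a read-only mount it fails with "Read-only file system".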
The error message you get:
"Error 400: RESOURCE_IN_USE_BY_ANOTHER_RESOURCE"
refers specifically to a GCE Persistent Disk that is already mounted by one GKE node in ReadWriteOnce mode; it cannot be mounted by another node on which the rest of your Pods were scheduled.
If you want it to be mounted in ReadOnlyMany mode, you need to specify that explicitly in your Deployment definition by adding a readOnly: true statement in the volumes section under the Pod's template specification, like below:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.14.2
          ports:
            - containerPort: 80
          volumeMounts:
            - mountPath: "/usr/share/nginx/html"
              name: nginx-content
      volumes:
        - name: nginx-content
          persistentVolumeClaim:
            claimName: nginx-content-claim
            readOnly: true
Keep in mind however that to be able to mount it in readOnly mode, we first need to pre-populate such a volume with data. Otherwise you'll see another error message, saying that an unformatted volume cannot be mounted in read-only mode.
The easiest way to do it is by creating a single Pod which will serve only for copying data which was already uploaded to one of our GKE nodes to our destination PV.
Note that pre-populating a PersistentVolume with data can be done in many different ways. You can mount in such a Pod only the PersistentVolume that you will be using in your Deployment and fetch your data with curl or wget from some external location, saving it directly on your destination PV. It's up to you.
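As an illustration of that simpler variant, a throwaway Pod along these lines could do it (a sketch; the Pod name, image and URL are placeholders):
apiVersion: v1
kind: Pod
metadata:
  name: populate-pv             # placeholder name
spec:
  restartPolicy: Never
  containers:
    - name: fetch
      image: busybox            # any image with wget available
      # download the content straight onto the destination PV
      command: ["sh", "-c", "wget -O /mnt/destination/index.html http://example.com/index.html"]
      volumeMounts:
        - mountPath: "/mnt/destination"
          name: nginx-content
  volumes:
    - name: nginx-content
      persistentVolumeClaim:
        claimName: nginx-content-claim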
In my example I'm showing how to do it using an additional local volume, which allows us to mount into our Pod a directory, partition or disk (in my example I use a directory /var/tmp/test located on one of my GKE nodes) available on one of our Kubernetes nodes. It's a much more flexible solution than hostPath, as we don't have to care about scheduling such a Pod to the particular node that contains the data. The specific node affinity rule is already defined in the PersistentVolume, and the Pod is automatically scheduled on that specific node.
To create it we need 3 things:
StorageClass:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
PersistentVolume definition:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: local-storage
  local:
    path: /var/tmp/test
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - <gke-node-name>
and finally PersistentVolumeClaim:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myclaim
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 10Gi
  storageClassName: local-storage
Then we can create our temporary Pod which will serve only for copying data from our GKE node to our GCE Persistent Disk.
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
    - name: myfrontend
      image: nginx
      volumeMounts:
        - mountPath: "/mnt/source"
          name: mypd
        - mountPath: "/mnt/destination"
          name: nginx-content
  volumes:
    - name: mypd
      persistentVolumeClaim:
        claimName: myclaim
    - name: nginx-content
      persistentVolumeClaim:
        claimName: nginx-content-claim
The paths you see above are not really important. The task of this Pod is only to allow us to copy our data to the destination PV. Eventually our PV will be mounted in a completely different path.
Once the Pod is created and both volumes are successfully mounted, we can attach to it by running:
kubectl exec -ti mypod -- /bin/bash
Within the Pod, simply run:
cp /mnt/source/* /mnt/destination/
That's all. Now we can exit and delete our temporary Pod:
kubectl delete pod mypod
Once it is gone, we can apply our Deployment and our PersistentVolume finally can be mounted in readOnly mode by all the Pods located on various GKE nodes:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.14.2
          ports:
            - containerPort: 80
          volumeMounts:
            - mountPath: "/usr/share/nginx/html"
              name: nginx-content
      volumes:
        - name: nginx-content
          persistentVolumeClaim:
            claimName: nginx-content-claim
            readOnly: true
Btw, if you are OK with the fact that your Pods will be scheduled only on one particular node, you can give up on using GCE Persistent Disk at all and switch to the above-mentioned local volume. This way all your Pods will be able not only to read from it but also to write to it at the same time. The only caveat is that all those Pods will be running on a single node.
You can achieve this with an NFS-like file system. On Google Cloud, Filestore is the right product for this (managed NFS). There is a tutorial here for achieving your configuration.
You will need to use a shared volume claim of the ReadWriteMany (RWX) type if you want to share the volume across different nodes and provide a highly scalable solution, e.g. using an NFS server.
You can find out how to deploy an NFS server here:
https://www.shebanglabs.io/run-nfs-server-on-ubuntu-20-04/
And then you can mount volumes (directories from NFS server) as follows:
https://www.shebanglabs.io/how-to-set-up-read-write-many-rwx-persistent-volumes-with-nfs-on-kubernetes/
I've used this approach to deliver shared static content across 8+ k8s deployments (200+ pods) serving a billion requests a month over Nginx, and it worked perfectly with that NFS setup :)
Google provides an NFS-like filesystem called Google Cloud Filestore. You can mount it on multiple pods.
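For reference, a minimal sketch of a statically provisioned NFS-backed PV/PVC pair in RWX mode (the names, server address and export path are placeholders; it assumes an NFS export like the ones from the guides above):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: shared-content-pv        # placeholder name
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: 10.0.0.10            # placeholder NFS server address
    path: /exports/content       # placeholder export path
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-content-claim     # placeholder name
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""           # empty class so the claim binds statically to the PV above
  resources:
    requests:
      storage: 5Gi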

how to find my persistent volume location

I tried creating a persistent volume using hostPath. I could bind it to a specific node using node affinity, but I didn't provide that. My PersistentVolume YAML looks like this:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: jenkins
  labels:
    type: fast
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  hostPath:
    path: /mnt/data
After this I created the PVC:
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myclaim
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 1Gi
And finally attached it to the pod:
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
    - name: myfrontend
      image: thinkingmonster/nettools
      volumeMounts:
        - mountPath: "/var/www/html"
          name: mypd
  volumes:
    - name: mypd
      persistentVolumeClaim:
        claimName: myclaim
Now the describe command for the PV or PVC does not tell on which node the volume /mnt/data is actually kept, and I had to SSH into all the nodes to locate it.
And the pod is smart enough to be created only on the node where Kubernetes mapped the host directory to the PV.
How can I know on which node Kubernetes has created the persistent volume, without having to SSH into the nodes or check where the pod is running?
It's only when a volume is bound to a claim that it's associated with a particular node. HostPath volumes are a bit different from the regular sort, making it a little less clear. When you get the volume claim, the annotations on it should give you a bunch of information, including what you're looking for. In particular, look for the:
volume.kubernetes.io/selected-node: ${NODE_NAME}
annotation on the PVC. You can see the annotations, along with the other computed configuration, by asking the Kubernetes API server for that info:
kubectl get pvc -o yaml -n ${NAMESPACE} ${PVC_NAME}
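To pull out just that annotation (assuming it has been set on the claim; dots in the key must be escaped in jsonpath):
kubectl get pvc -n ${NAMESPACE} ${PVC_NAME} -o jsonpath='{.metadata.annotations.volume\.kubernetes\.io/selected-node}'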