Kubernetes: Using an ordinal number in a claimName?

I have a statefulset that is running great and the stateful set has ReadWriteMany PVC. I need to share this PVC with another statefulset.
Does anybody know how I can add the ordinal number into the claimName?
Basically I have a backend service that is a statefulset with 2 replicas, so it has a volumeClaimTemplate defined - hence it has 2 volumes, for example service-data-service-0 and service-data-service-1.
The other statefulset has its own data volume, but it also needs to share the data volume from the first statefulset.
There is a one-to-one mapping - meaning that the volume with ordinal 0 in the lower service needs to be added to pod 0, and likewise the volume with ordinal 1 to pod 1.
I am a little confused how I am able to do this. It's easy with a deployment, because technically you have 2 x deployments, so each deployment can be strictly pointed to the correct service-data-service-XX (where XX is the ordinal number of the lower service, i.e. 0, 1, etc.).
In my head, as pseudo-code, I have this. Can anyone help?
volumes:
- name: lnd2-data-volume
  persistentVolumeClaim:
    # This volumes section is in the higher service but shares a data volume
    # with the lower service
    claimName: service-data-service-{{ "SOME TEMPLATE HERE to give me either 0 or 1 for the current POD ordinal number" }}
Any ideas?

For the TL;DR version, please go to the solution below.
What you are trying to achieve is not doable with StatefulSet (STS) today.
Due to the design of the StatefulSet controller, claims need to have unique identifiers in order to be mapped to their corresponding Pods, and cannot be reused between different StatefulSet applications.
So no matter what claim name you specify within volumes as part of the Pod template inside the StatefulSet definition (e.g. claimName=service-data-service-0), it will always be overwritten by the StatefulSet controller for each Pod it controls, using the following naming scheme:
PVC name = <claim.Name>-<set.Name>-<ordinal>
where:
claim.Name - the claim in the STS's volumeClaimTemplates list matching a volumeMount in the PodTemplateSpec
set.Name - the StatefulSet name
ordinal - the Pod's ordinal index (from 0 up to replicas - 1)
This is why the backend StatefulSet in the question ends up with service-data-service-0 and service-data-service-1.
My observations:
An existing PVC (with ReadWriteMany mode) can be used by an STS only when you introduce the StatefulSet for the first time in your cluster (i.e. the PVC is not yet owned by another workload).
For example, the STS like this one:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: peb
spec:
  ...
  volumeClaimTemplates:
  - metadata:
      name: fileserver-claim
    spec:
      accessModes: [ "ReadWriteMany" ]
      storageClassName: ""
      resources:
        requests:
          storage: 1Gi
would consume the existing PVC:
fileserver-claim-peb-0
with accompanying event seen in API server logs:
The PVC 'fileserver-claim-peb-0' already exists
and because there cannot be a different STS with the same name (the Pod 'peb-0' is unique in the cluster, and likewise its claimName), your options end here.
Solution:
Manually pre-provision a couple of PVs that use the same associated storage asset (e.g. NFS-based volumes supporting the RWX access mode), and inside your STS's volumeClaimTemplates reference the existing unbound PV by name (volumeName), e.g.:
...
volumeClaimTemplates:
- metadata:
    name: datadir
  spec:
    accessModes:
    - "ReadWriteOnce"
    volumeName: fileserver-claim-peb
    resources:
      requests:
        storage: 1Gi
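For context, a pre-provisioned NFS-backed PV that such a claim could bind to might look roughly like this (a sketch only; the server address and export path are hypothetical, and the PV name matches the volumeName above):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: fileserver-claim-peb
spec:
  capacity:
    storage: 1Gi
  accessModes:
  - ReadWriteMany
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""           # matches the empty storageClassName used earlier
  nfs:
    server: nfs.example.com      # hypothetical NFS server
    path: /exports/shared-data   # hypothetical export path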
I think this is a recipe to share the same data storage between different StatefulSet(s).

Related

What is the PersistentVolumeClaim policy for local PersistentVolume in Kubernetes?

Scenario 1:
I have 3 local persistent volumes provisioned, each PV mounted on a different node:
10.30.18.10
10.30.18.11
10.30.18.12
When I start my app with 3 replicas using:
kind: StatefulSet
metadata:
  name: my-db
spec:
  replicas: 3
  ...
  ...
  volumeClaimTemplates:
  - metadata:
      name: my-local-vol
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "my-local-sc"
      resources:
        requests:
          storage: 10Gi
Then I notice pods and pvs are on the same host:
pod1 with ip 10.30.18.10 has claimed the pv that is mounted on 10.30.18.10
pod2 with ip 10.30.18.11 has claimed the pv that is mounted on 10.30.18.11
pod3 with ip 10.30.18.12 has claimed the pv that is mounted on 10.30.18.12
(what's not happening is: pod1 with IP 10.30.18.10 claiming the PV that is mounted on a different node, e.g. 10.30.18.12)
The only common config between pv and pvc is storageClassName, so I didn't configure this behavior.
Question:
So, who is responsible for this magic? Kubernetes scheduler? Kubernetes provisioner?
Scenario 2:
I have 3 local-persistent-volumes provisioned:
pv1 has capacity.storage of 10Gi
pv2 has capacity.storage of 100Gi
pv3 has capacity.storage of 100Gi
Now, I start my app with 1 replica
kind: StatefulSet
metadata:
  name: my-db
spec:
  replicas: 1
  ...
  ...
  volumeClaimTemplates:
  - metadata:
      name: my-local-vol
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "my-local-sc"
      resources:
        requests:
          storage: 10Gi
I want to ensure that this StatefulSet always claims pv1 (10Gi), even if it is on a different node, and doesn't claim pv2 (100Gi) or pv3 (100Gi).
Question:
Does this happen automatically?
How do I ensure the desired behavior? Should I use a separate storageClassName to ensure this?
What is the PersistentVolumeClaim policy? Where can I find more info?
EDIT:
YAML used for the StorageClass:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: my-local-pv
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
With local Persistent Volumes, this is the expected behaviour. Let me try to explain what happens when using local storage.
The usual setup for local storage on a cluster is the following:
A local storage class, configured to be WaitForFirstConsumer
A series of local persistent volumes, linked to the local storage class
And this is all well documented with examples in the official documentation: https://kubernetes.io/docs/concepts/storage/volumes/#local
With this done, Persistent Volume Claims can request storage from the local storage class and StatefulSets can have a volumeClaimTemplate which requests storage of the local storage class.
Let me take as an example your StatefulSet with 3 replicas, where each one requires local storage via the volumeClaimTemplate.
When the Pods are first created, they request storage of the required storageClass, for example your my-local-sc.
Since this storage class is manually created and does not support dynamic provisioning of new PVs (unlike, for example, Ceph or similar), Kubernetes checks whether a PV belonging to the storage class is available to be bound.
If a PV is selected, it is bound to the newly created PVC (and from then on, that PVC can be used only with that particular PV, since it is now Bound).
Since the PV is of type local, it has a required nodeAffinity which selects a node.
This forces the Pod, now bound to that PV, to be scheduled only on that particular node.
This is why each Pod was scheduled on the same node as its bound persistent volume. And this means that the Pod is restricted to run on that node only.
You can test this easily by draining / cordoning one of the nodes and then trying to restart the Pod bound to the PV available on that particular node. What you should see is that the Pod will not start, as the PV is restricted by its nodeAffinity and the node is not available.
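Something along these lines (node and pod names are placeholders):
$ kubectl cordon <node-name>                  # mark the node unschedulable
$ kubectl delete pod <pod-bound-to-local-pv>  # the replacement Pod stays Pending,
                                              # because its PV's nodeAffinity points at the cordoned node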
Once each Pod of the StatefulSet is bound to a PV, that Pod will be scheduled only on that specific node. Pods will not change the PV they are using unless the PVC is removed (which will force the Pod to request a new PV to bind to).
Since local storage is handled manually, PVs that were bound and have had their related PVC removed from the cluster enter the Released state and cannot be claimed anymore; they must be handled by someone, for example by deleting them and recreating new ones at the same location (and possibly cleaning the filesystem as well, depending on the situation).
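As a side note (not part of the original answer), a Released local PV can typically be made claimable again, after cleaning it up, by clearing its claimRef, for example:
$ kubectl patch pv <pv-name> -p '{"spec":{"claimRef": null}}'   # PV goes back to Available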
This means that local storage is OK to be used only:
If HA is not a concern, for example if you don't care whether your app is blocked by a single node not working.
If HA is handled directly by the app itself. For example, a StatefulSet with 3 Pods running a multi-primary database (Galera, ClickHouse, Percona, for example) or Elasticsearch, Kafka, ZooKeeper or something like that: all of these handle HA on their own, as they can tolerate one of their nodes being down as long as there is quorum.
UPDATE
Regarding Scenario 2 of your question: let's say you have multiple Available PVs and a single Pod which starts and wants to bind to one of them. This is normal behaviour, and the control plane will select one of those PVs on its own (if they match the requests in the claim).
There's a specific way to pre-bind a PV and a PVC, so that they will always bind together. This is described in the docs as "reserving a PV": https://kubernetes.io/docs/concepts/storage/persistent-volumes/#reserving-a-persistentvolume
But the problem is that this cannot be applied to volume claim templates, as it requires the claim to be created manually with special properties.
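For reference, "reserving" a PV in that sense means manually creating a PVC that names the PV via volumeName, roughly like this (a sketch; both names are hypothetical):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: reserved-claim             # hypothetical claim name
spec:
  storageClassName: local-storage
  volumeName: my-reserved-pv       # hypothetical PV name; pins this claim to that PV
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi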
The volume claim template, though, has a selector field which can be used to restrict the selection of a PV based on labels. It can be seen in the API specs ( https://v1-18.docs.kubernetes.io/docs/reference/generated/kubernetes-api/v1.18/#persistentvolumeclaimspec-v1-core )
When you create a PV, you label it however you want. For example, you could label it like the following:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-small-pv
  labels:
    size-category: small
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/ssd1
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - example-node-1
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-big-pv
  labels:
    size-category: big
spec:
  capacity:
    storage: 100Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/ssd1
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - example-node-2
And then the claim template can select a category of volumes based on the label. Or, if it doesn't care, it can omit the selector and use any of them (provided that the size is enough for its claim request).
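A sketch of what that selector could look like inside volumeClaimTemplates (the label matches the PVs above; the template name is a placeholder):
volumeClaimTemplates:
- metadata:
    name: my-local-vol
  spec:
    accessModes: [ "ReadWriteOnce" ]
    storageClassName: "local-storage"
    selector:
      matchLabels:
        size-category: small     # only PVs labelled size-category=small are considered
    resources:
      requests:
        storage: 10Gi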
This could be useful, but it's not the only way to select or restrict which PVs can be chosen, because when the PV is first bound, if the storage class is WaitForFirstConsumer, the following also applies:
Delaying volume binding ensures that the PersistentVolumeClaim binding
decision will also be evaluated with any other node constraints the
Pod may have, such as node resource requirements, node selectors, Pod
affinity, and Pod anti-affinity.
Which means that if the Pod has a node affinity to one of the nodes, it will definitely select a PV on that node (if the local storage class used is WaitForFirstConsumer).
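For example (a sketch; the hostname value is hypothetical), a nodeSelector in the Pod template takes part in that delayed binding decision:
# excerpt of the Pod template inside a StatefulSet spec
template:
  spec:
    nodeSelector:
      kubernetes.io/hostname: example-node-1   # PVs on other nodes won't be chosen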
Lastly, let me quote the official documentation for things that I think could answer your questions:
From https://kubernetes.io/docs/concepts/storage/persistent-volumes/
A user creates, or in the case of dynamic provisioning, has already
created, a PersistentVolumeClaim with a specific amount of storage
requested and with certain access modes. A control loop in the master
watches for new PVCs, finds a matching PV (if possible), and binds
them together. If a PV was dynamically provisioned for a new PVC, the
loop will always bind that PV to the PVC. Otherwise, the user will
always get at least what they asked for, but the volume may be in
excess of what was requested. Once bound, PersistentVolumeClaim binds
are exclusive, regardless of how they were bound. A PVC to PV binding
is a one-to-one mapping, using a ClaimRef which is a bi-directional
binding between the PersistentVolume and the PersistentVolumeClaim.
Claims will remain unbound indefinitely if a matching volume does not
exist. Claims will be bound as matching volumes become available. For
example, a cluster provisioned with many 50Gi PVs would not match a
PVC requesting 100Gi. The PVC can be bound when a 100Gi PV is added to
the cluster.
From https://kubernetes.io/docs/concepts/storage/volumes/#local
Compared to hostPath volumes, local volumes are used in a durable and
portable manner without manually scheduling pods to nodes. The system
is aware of the volume's node constraints by looking at the node
affinity on the PersistentVolume.
However, local volumes are subject to the availability of the
underlying node and are not suitable for all applications. If a node
becomes unhealthy, then the local volume becomes inaccessible by the
pod. The pod using this volume is unable to run. Applications using
local volumes must be able to tolerate this reduced availability, as
well as potential data loss, depending on the durability
characteristics of the underlying disk.

How to store my pod logs in a persistent storage?

I have generated logs for my pods using kubectl logs <pod name>. But I want to persist these logs in a volume (some kind of persistent storage), because container logs will get wiped out if the pods go down. Is there a way to do this? Do I have to write some sort of script?
I have read many answers but I still do not understand how to go about it, any help is appreciated. Thanks!
Under Logging Architecture the Kubernetes documentation goes through a couple of ways to set up logging in your cluster.
The most interesting for you might be Cluster-level logging architecture:
While Kubernetes does not provide a native solution for cluster-level
logging, there are several common approaches you can consider. Here
are some options:
Use a node-level logging agent that runs on every node.
Include a dedicated sidecar container for logging in an application pod.
Push logs directly to a backend from within an application
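To illustrate the sidecar option above, a second container in the same Pod can tail the application's log file from a shared volume so that it appears on the sidecar's stdout (just a sketch; the image, paths and names are hypothetical):
apiVersion: v1
kind: Pod
metadata:
  name: myapp-with-log-sidecar
spec:
  containers:
  - name: myapp
    image: myrepo/myapp:latest      # hypothetical application image
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/myapp     # the app is assumed to write its log file here
  - name: log-sidecar
    image: busybox
    args: ['sh', '-c', 'tail -n+1 -F /var/log/myapp/app.log']
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/myapp
      readOnly: true
  volumes:
  - name: app-logs
    emptyDir: {}                    # could also be a PVC if the logs must survive the Pod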
There are many solutions for collecting pod logs and shipping them to a centralized location such as:
fluentd
splunk
elastic
Keeping logs outside of the cluster has benefits. If your cluster begins to have issues, it's likely that your in-cluster logging architecture will face them as well.
You will need to mount the logs directory inside the container to the host machine as well, using the PersistentVolume and PersistentVolumeClaim.
This way you can persist these logs even if the container is killed.
Create the PersistentVolume and PersistentVolumeClaim for the log path and use them as volume mounts to the kubernetes deployments or pods.
I know this is an old question, but I've just had the same problem and I've spent some time to figure out the solution, so I'd like to share a more detailed solution.
Like Aayush Mall said, you'll need the PersistentVolume and PersistentVolumeClaim objects to create the volume and then link it to the pod (preferably via a Deployment object).
Basically, the PersistentVolume defines how and where the volume is stored on the host, and the PersistentVolumeClaim defines the constraints to bind the volume to some container.
From the docs:
A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes. It is a resource in the cluster just like a node is a cluster resource. PVs are volume plugins like Volumes, but have a lifecycle independent of any individual Pod that uses the PV. This API object captures the details of the implementation of the storage, be that NFS, iSCSI, or a cloud-provider-specific storage system.
A PersistentVolumeClaim (PVC) is a request for storage by a user. It is similar to a Pod. Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and Memory). Claims can request specific size and access modes (e.g., they can be mounted ReadWriteOnce, ReadOnlyMany or ReadWriteMany, see AccessModes).
So, let's say your pods are running in two nodes: mynode-1 and mynode-2.
Your PersistentVolume spec will look like this.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: myapp-log-pv
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /var/log/myapp
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - mynode-1
          - mynode-2
Your PersistentVolumeClaim will look like this.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myapp-log-pvc
spec:
  volumeMode: Filesystem
  accessModes:
  - ReadWriteMany
  storageClassName: local-storage
  resources:
    requests:
      storage: 2Gi
  volumeName: myapp-log-pv
And then, you just have to tell the deployment object how to mount the volume inside the container. So, your Deployment spec will look like this.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-deploy
spec:
  selector:
    matchLabels:
      app: myapp
  replicas: 1
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: myrepo/myapp:latest
        volumeMounts:
        - name: log
          mountPath: /var/log
      volumes:
      - name: log
        persistentVolumeClaim:
          claimName: myapp-log-pvc
And that's it. When your deployment starts, it'll create the pod with the container, mount a volume named log at the path /var/log (inside the container) and bind this volume to some PV matching the requirements of the PVC named myapp-log-pvc. As we've created the myapp-log-pv with the same volumeMode, accessModes and storageClassName fields, and with more storage capacity than required by myapp-log-pvc, they will be bound. So, your app logs will be stored in the path /var/log/myapp (field spec.local.path in the myapp-log-pv spec) on the node running the pod.
I hope it helps :)
Also, I'm kind of new to the Kubernetes world, so please let me know if you notice I misunderstood something or if there is a better way to do this.

Why can you set multiple accessModes on a persistent volume?

For example in the following example:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc
spec:
  accessModes:
  - ReadOnlyMany
  - ReadWriteMany
  storageClassName: standard
  volumeMode: Filesystem
  resources:
    requests:
      storage: 1Gi
Why is this allowed? What is the actual behavior of the volume in this case? Read only? Read and write?
To fully understand why a certain structure is used in a specific field of a YAML definition, we first need to understand the purpose of that particular field. We need to ask what it is for, and what its function is in this particular Kubernetes API resource.
I struggled a bit to find a proper explanation of accessModes in PersistentVolumeClaim, and I must admit that what I found in the official Kubernetes docs did not satisfy me:
A PersistentVolume can be mounted on a host in any way supported by
the resource provider. As shown in the table below, providers will
have different capabilities and each PV’s access modes are set to the
specific modes supported by that particular volume. For example, NFS
can support multiple read/write clients, but a specific NFS PV might
be exported on the server as read-only. Each PV gets its own set of
access modes describing that specific PV’s capabilities.
Fortunately, this time I managed to find a really great explanation of this topic in the OpenShift documentation. We can read there:
Claims are matched to volumes with similar access modes. The only two
matching criteria are access modes and size. A claim’s access modes
represent a request. Therefore, you might be granted more, but never
less. For example, if a claim requests RWO, but the only volume
available is an NFS PV (RWO+ROX+RWX), the claim would then match NFS
because it supports RWO.
Direct matches are always attempted first. The volume’s modes must
match or contain more modes than you requested. The size must be
greater than or equal to what is expected. If two types of volumes,
such as NFS and iSCSI, have the same set of access modes, either of
them can match a claim with those modes. There is no ordering between
types of volumes and no way to choose one type over another.
All volumes with the same modes are grouped, and then sorted by size,
smallest to largest. The binder gets the group with matching modes and
iterates over each, in size order, until one size matches.
And now probably the most important part:
A volume’s AccessModes are descriptors of the volume’s
capabilities. They are not enforced constraints. The storage provider
is responsible for runtime errors resulting from invalid use of the
resource.
I emphasized this part as AccessModes can be very easily misunderstood. Let's look at the example:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc-2
spec:
  accessModes:
  - ReadOnlyMany
  storageClassName: standard
  volumeMode: Filesystem
  resources:
    requests:
      storage: 1Gi
The fact that we specified in our PersistentVolumeClaim definition only ReadOnlyMany access mode doesn't mean it cannot be used in other accessModes supported by our storage provider. It's important to understand that we cannot put here any constraint on how the requested storage can be used by our Pods. If our storage provider, hidden behind our standard storage class, supports also ReadWriteOnce, it will be also available for use.
Answering your particular question...
Why is this allowed? What is the actual behavior of the volume in this
case? Read only? Read and write?
It doesn't define the behavior of the volume at all. The volume will behave according to its capabilities (we don't define them; they are imposed in advance, being part of the storage specification). In other words, we will be able to use it in our Pods in all the ways in which it is allowed to be used.
Let's take our standard storage provisioner, which in the case of GKE happens to be Google Compute Engine Persistent Disk:
$ kubectl get storageclass
NAME                 PROVISIONER            AGE
standard (default)   kubernetes.io/gce-pd   10d
currently supports two AccessModes:
ReadWriteOnce
ReadOnlyMany
So we can use both of them, no matter what we specified in our claim, e.g. this way:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  labels:
    app: my-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: debian
  template:
    metadata:
      labels:
        app: debian
    spec:
      containers:
      - name: debian
        image: debian
        command: ['sh', '-c', 'sleep 3600']
        volumeMounts:
        - mountPath: "/mnt"
          name: my-volume
          readOnly: true
      volumes:
      - name: my-volume
        persistentVolumeClaim:
          claimName: example-pvc-2
      initContainers:
      - name: init-myservice
        image: busybox
        command: ['sh', '-c', 'echo "Content of my file" > /mnt/my_file']
        volumeMounts:
        - mountPath: "/mnt"
          name: my-volume
In the above example both capabilities are used. First our volume is mounted in rw mode by the init container, which saves some file to it, and after that it is mounted in the main container as a read-only file system. We are still able to do this even though we specified only one access mode in our PersistentVolumeClaim:
spec:
  accessModes:
  - ReadOnlyMany
Going back to the question you asked in the title:
Why can you set multiple accessModes on a persistent volume?
the answer is: you cannot set them at all, as they are already set by the storage provider. This way you can only request what storage you want and what requirements it must meet, and one of these requirements is the access modes it supports.
Basically by typing:
spec:
  accessModes:
  - ReadOnlyMany
  - ReadWriteOnce
in our PersistentVolumeClaim definition we say:
"Hey! Storage provider! Give me a volume that supports this set of accessModes. I don't care if it supports any others, like ReadWriteMany, as I don't need them. Give me something that meets my requirements!"
I believe that further explanation why an array is used here is not needed.
A persistent volume can be mounted by multiple pods on different nodes at the same time. One pod can mount a persistent volume with only one access mode at a time, and other pods can mount the same persistent volume with a different access mode. But a single pod can mount the persistent volume with only one access mode.
Documentation reference for those who didn't understand the question: persistent volume access modes

Kubernetes trouble with StatefulSet and 3 PersistentVolumes

I'm in the process of creating a StatefulSet based on this yaml, that will have 3 replicas. I want each of the 3 pods to connect to a different PersistentVolume.
For the persistent volume I'm using 3 objects that look like this, with only the name changed (pvvolume, pvvolume2, pvvolume3):
kind: PersistentVolume
apiVersion: v1
metadata:
  name: pvvolume
  labels:
    type: local
spec:
  storageClassName: standard
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: "/nfs"
  claimRef:
    kind: PersistentVolumeClaim
    namespace: default
    name: mongo-persistent-storage-mongo-0
The first of the 3 pods in the StatefulSet seems to be created without issue.
The second fails with the error pod has unbound PersistentVolumeClaims
Back-off restarting failed container.
Yet if I go to the tab showing PersistentVolumeClaims the second one that was created seems to have been successful.
If it was successful why does the pod think it failed?
I want each of the 3 pods to connect to a different PersistentVolume.
For that to work properly you will either need:
a provisioner (in the link you posted there are examples of how to set up a provisioner on AWS, Azure, Google Cloud and minikube), or
a volume capable of being mounted multiple times (such as an NFS volume). Note however that in such a case all your pods read/write to the same folder, and this can lead to issues when they are not meant to lock/write to the same data concurrently. The usual use case for this is an upload folder that pods save to, which is later used for reading only, and similar use cases. SQL databases (such as MySQL), on the other hand, are not meant to write to such a shared folder.
Instead of either of the mentioned requirements, your volume manifest is using hostPath (pointing to /nfs) and sets it to ReadWriteOnce (only one can use it). You are also using 'standard' as the storage class, and in the URL you gave there are fast and slow ones, so you probably created your storage class as well.
The second fails with the error pod has unbound PersistentVolumeClaims
Back-off restarting failed container
That is because the first pod already took its claim (ReadWriteOnce, hostPath) and the second pod can't reuse the same one if a proper provisioner or access mode is not set up.
If it was successful why does the pod think it failed?
All PVCs were successfully bound to their accompanying PVs. But you are never binding the second and third PVCs to the second or third pods. You are retrying with the first claim on the second pod, and the first claim is already bound (to the first pod) in ReadWriteOnce mode, so it can't be bound to the second pod as well, and you are getting the error...
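Purely as an illustration of that point (a sketch reusing the names from the question), each additional PV would have to reserve a different ordinal claim, e.g. pvvolume2 targeting the claim of mongo-1:
kind: PersistentVolume
apiVersion: v1
metadata:
  name: pvvolume2
  labels:
    type: local
spec:
  storageClassName: standard
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: "/nfs"
  claimRef:
    kind: PersistentVolumeClaim
    namespace: default
    name: mongo-persistent-storage-mongo-1   # ordinal 1 instead of 0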
Suggested approach
Since you reference /nfs as your host path, it may be safe to assume that you are using some kind of NFS-backed file system, so here is one alternative setup that lets you mount dynamically provisioned persistent volumes over NFS to as many pods in a stateful set as you want.
Notes:
This only answers the original question of mounting persistent volumes across stateful set replicated pods, with the assumption of NFS sharing.
NFS is not really advisable for dynamic data such as a database. The usual use case is an upload folder or a moderate logging/backup folder. A database (SQL or NoSQL) is usually a no-no for NFS.
For mission/time-critical applications you might want to time/stress-test carefully prior to taking this approach in production, since both k8s and the external PV add some layers/latency in between. Although for some applications this might suffice, be warned about it.
You have limited control over the names of PVs that are dynamically created (k8s adds a suffix to newly created ones, and reuses available old ones if told to do so), but k8s will keep them after a pod gets terminated and assign the first available one to a new pod, so you won't lose state/data. This is something you can control with policies, though.
Steps:
For this to work you will first need to install the NFS provisioner from here:
https://github.com/kubernetes-incubator/external-storage/tree/master/nfs. Mind you that the installation is not complicated, but it has some steps where you have to take a careful approach (permissions, setting up NFS shares, etc.), so it is not just a fire-and-forget deployment. Take your time installing the NFS provisioner correctly. Once this is properly set up you can continue with the suggested manifests below:
Storage class manifest:
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
  name: sc-nfs-persistent-volume
# if you changed this during provisioner installation, update also here
provisioner: example.com/nfs
Stateful Set (important excerpt only):
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: ss-my-app
spec:
  replicas: 3
  ...
  selector:
    matchLabels:
      app: my-app
      tier: my-mongo-db
  ...
  template:
    metadata:
      labels:
        app: my-app
        tier: my-mongo-db
    spec:
      ...
      containers:
      - image: ...
        ...
        volumeMounts:
        - name: persistent-storage-mount
          mountPath: /wherever/on/container/you/want/it/mounted
        ...
      ...
  volumeClaimTemplates:
  - metadata:
      name: persistent-storage-mount
    spec:
      storageClassName: sc-nfs-persistent-volume
      accessModes: [ ReadWriteOnce ]
      resources:
        requests:
          storage: 10Gi
  ...

Can I rely on volumeClaimTemplates naming convention?

I want to set up a pre-defined PostgreSQL cluster on bare-metal Kubernetes 1.7 with local PVs enabled. I have three worker nodes. I create a local PV on each node and deploy the stateful set successfully (with some complex scripting to set up Postgres replication).
However, I noticed that there's a kind of naming convention between the volumeClaimTemplates and the PersistentVolumeClaim.
For example
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: postgres
spec:
  volumeClaimTemplates:
  - metadata:
      name: pgvolume
The created PVCs are pgvolume-postgres-0, pgvolume-postgres-1 and pgvolume-postgres-2.
With some trickery, I manually create PVCs and bind them to the target PVs by selector. I test the stateful set again. It seems the stateful set is very happy to use these PVCs.
I finish my test successfully but I still have this question. Can I rely on volumeClaimTemplates naming convention? Is this an undocumented feature?
Based on the statefulset API reference
volumeClaimTemplates is a list of claims that pods are allowed to reference. The StatefulSet controller is responsible for mapping network identities to claims in a way that maintains the identity of a pod. Every claim in this list must have at least one matching (by name) volumeMount in one container in the template. A claim in this list takes precedence over any volumes in the template, with the same name.
So I guess you can rely on it.
Moreover, you can define a storage class to leverage dynamic provisioning of persistent volumes, so you won't have to create them manually.
volumeClaimTemplates:
- metadata:
    name: www
  spec:
    accessModes: [ "ReadWriteOnce" ]
    storageClassName: my-storage-class
    resources:
      requests:
        storage: 1Gi
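For completeness, a StorageClass that such a template could reference might look roughly like this (a sketch assuming the in-tree GCE PD provisioner; substitute whatever dynamic provisioner your cluster actually offers):
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: my-storage-class
provisioner: kubernetes.io/gce-pd   # assumption: GCE Persistent Disk
parameters:
  type: pd-standard
reclaimPolicy: Delete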
Please refer to Dynamic Provisioning and Storage Classes in Kubernetes for more details.