No matter how many times I read the documentation I just don't get it, so apologies for the really basic question.
I read that once a PersistentVolume is claimed, no other Pod can claim it - claims are exclusive.
However, PV access modes include options ending in *Many. These two seem to contradict each other.
What is the Once or the Many in the access mode types? Does it refer to multiple replicas of the same pod across different nodes? Or does it mean that after one claim has been released, another pod can then claim it? Or does it refer to the underlying storage, which could be referenced by a different PV? Or something else?
"I read that once a PersistentVolume is claimed, no other Pod can claim it - claims are exclusive."
This is a misunderstanding. It should be: once a PersistentVolume is claimed, no other PersistentVolumeClaim can claim it - claims are exclusive.
But multiple Pods can use the same PersistentVolumeClaim. It is not that common, but it is typically what happens when you "upgrade" your application: both the new and the old version of your app might use the PVC for a short time.
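As a rough sketch of what "multiple Pods using the same PersistentVolumeClaim" looks like, the two Pods below both reference one claim by name. All names (shared-data, app-old, app-new) and the nginx image are placeholders, not anything from the question. Note that with ReadWriteOnce both Pods would have to be scheduled onto the same node.

```yaml
# One claim; Kubernetes binds it to exactly one PersistentVolume.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data          # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
# Old version of the app, still running during an upgrade.
apiVersion: v1
kind: Pod
metadata:
  name: app-old              # hypothetical name
spec:
  containers:
    - name: app
      image: nginx           # stand-in image
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: shared-data   # same claim as app-new
---
# New version of the app, started before the old one is gone.
apiVersion: v1
kind: Pod
metadata:
  name: app-new              # hypothetical name
spec:
  containers:
    - name: app
      image: nginx           # stand-in image
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: shared-data   # same claim as app-old
```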
Access Modes
Access modes on Persistent Volumes describe how the volumes can be mounted on nodes. This depends on how your storage system works, so you must check which access modes are available for your storage system.
The modes ending with -Once can only be mounted on a single node at a time - this is unrelated to Pods. The modes ending with -Many can be mounted on multiple nodes at the same time, which is typical for NFS-style storage systems.
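For illustration, an NFS-backed PersistentVolume that advertises a -Many mode might be declared roughly like this. The server address, export path and object name are made up for the example.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-shared                 # hypothetical name
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany                # may be mounted read-write by many nodes at once
  nfs:
    server: nfs.example.internal   # assumed NFS server address
    path: /exports/shared          # assumed export path
```

A claim asking for ReadWriteMany can bind to this PV; a PV that only lists ReadWriteOnce would not satisfy such a claim.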
Related
I have a GKE cluster with 6-7 micro-services deployed. I need a Postgres DB installed inside GKE (not Cloud SQL, because of cost). Looking at the different types of persistent volumes: if multiple micro-services access the same DB, should I use NFS, or would a PVC backed by a normal disk be enough? Local storage is not an option in any case.
I'd appreciate your thoughts on this.
Everything depends on your scenario. In general, you should go by the access mode when deciding which volume plugin to use.
A PersistentVolume can be mounted on a host in any way supported by the resource provider. As shown in the table below, providers will have different capabilities and each PV's access modes are set to the specific modes supported by that particular volume.
In the documentation you will find a table of the different volume plugins and the access modes each one supports.
According to the update in your comment, you have only one node. With that setup, you can use almost any volume that supports the RWO access mode.
ReadWriteOnce -- the volume can be mounted as read-write by a single node.
There are 2 other access modes which should be considered if you would like to use the volume from more than 1 node.
ReadOnlyMany -- the volume can be mounted read-only by many nodes
ReadWriteMany -- the volume can be mounted as read-write by many nodes
So in your case you can use gcePersistentDisk, as it supports ReadWriteOnce and ReadOnlyMany.
Using NFS would be beneficial if you would like to access this PV from many nodes.
NFS can support multiple read/write clients, but a specific NFS PV might be exported on the server as read-only. Each PV gets its own set of access modes describing that specific PV's capabilities.
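A minimal sketch of the single-disk option for Postgres follows. It assumes the cluster has a default StorageClass that dynamically provisions a ReadWriteOnce disk; the object names and the Secret are hypothetical. The micro-services talk to Postgres over the network, so only the Postgres Pod itself needs to mount the disk.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-data              # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce                # one node mounts the disk
  resources:
    requests:
      storage: 10Gi
  # storageClassName omitted: assumes the cluster's default StorageClass
---
apiVersion: v1
kind: Pod
metadata:
  name: postgres                   # hypothetical name
spec:
  containers:
    - name: postgres
      image: postgres:16
      env:
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-secret   # hypothetical Secret, assumed to exist
              key: password
        - name: PGDATA
          # Use a subdirectory so a freshly formatted disk's lost+found does not get in the way.
          value: /var/lib/postgresql/data/pgdata
      volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: postgres-data
```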
Just as an addition: if this is for learning purposes, you can also check out Local Persistent Volumes. An example can be found in this tutorial, although it would require a few updates such as the image or apiVersion.
We have a StorageClass with an NFS provisioner and reclaimPolicy: Delete. I created a separate StorageClass with reclaimPolicy: Retain because Kafka needs its data persisted. A coworker told me that if I create a second StorageClass with the same provisioner, Kubernetes can confuse the volumes and could overwrite data in the wrong volume. He recommends creating the PersistentVolumes manually with "reclaimPolicy: Retain", using the initial StorageClass that is already declared.
I cannot find any mention of this supposed bad effect of using more than one StorageClass for the same provisioner. In fact, after reading the official k8s documentation, I have the feeling that it recommends the opposite:
https://kubernetes.io/docs/concepts/storage/storage-classes/#introduction
A StorageClass provides a way for administrators to describe the “classes” of storage they offer. Different classes might map to quality-of-service levels, or to backup policies, or to arbitrary policies determined by the cluster administrators. Kubernetes itself is unopinionated about what classes represent. This concept is sometimes called “profiles” in other storage systems
Yes, you can create multiple storage classes for the same storage provider.
You can create one with 'reclaimPolicy: Delete' and a second one with 'reclaimPolicy: Retain'.
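Something along these lines, where the provisioner string and the class names are placeholders for whatever your NFS provisioner actually registers:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-delete               # hypothetical name
provisioner: example.com/nfs     # placeholder; use your provisioner's real name
reclaimPolicy: Delete            # PV and its data are removed when the PVC is deleted
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-retain               # hypothetical name
provisioner: example.com/nfs     # same provisioner as above
reclaimPolicy: Retain            # PV and its data are kept when the PVC is deleted
```

A Kafka PVC would then set storageClassName: nfs-retain, while everything else keeps using the Delete class.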
I know that a PVC can be used as a volume in k8s. I know how to create and use them, but I couldn't understand why there are two of them, PV and PVC.
Can someone give me an architectural reason behind the PV/PVC distinction? What kind of problem does it try to solve (or what is the history behind it)?
Despite their names, they serve two different purposes: an abstraction for storage (PV) and a request for such storage (PVC). Together, they enable a clean separation of concerns. [Figure from our Kubernetes Cookbook illustrating this]
The storage admin focuses on provisioning PVs (ideally dynamically through defining storage classes) and the developer uses a PVC to acquire a PV and use it in a pod.
It is easy to be thrown by the names, but the Kubernetes documentation does have an explanation of the difference:
A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned by an administrator. It is a resource in the cluster just like a node is a cluster resource. PVs are volume plugins like Volumes, but have a lifecycle independent of any individual pod that uses the PV.
And
A PersistentVolumeClaim (PVC) is a request for storage by a user. It is similar to a pod. Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and Memory). Claims can request specific size and access modes (e.g., can be mounted once read/write or many times read-only).
So the PVC decouples the application from the specific storage. It allows the application to say that it needs some storage satisfying certain requirements without saying specifically which piece of storage that is. This also makes it possible for cluster-level rules to be defined on how the storage requirements of apps are to be met.
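To make the decoupling concrete, here is a sketch of a statically provisioned PV (the admin's side) and a PVC (the developer's side). The claim never names the PV; it only states requirements, and Kubernetes binds it to a PV that satisfies them. The names, the "manual" class label and the hostPath location are assumptions for the example.

```yaml
# Admin side: a concrete piece of storage.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-example               # hypothetical name
spec:
  storageClassName: manual       # arbitrary label shared by PV and PVC so they match
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /mnt/data              # assumed directory on the node; fine for a test cluster only
---
# Developer side: a request that says what is needed, not where it comes from.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-storage              # hypothetical name
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi               # any matching PV with at least 5Gi qualifies
```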
I've previously used both types, I've also read through the docs at:
https://kubernetes.io/docs/concepts/storage/persistent-volumes/
https://kubernetes.io/docs/concepts/storage/volumes/
However, it's still not clear what the difference is. Both seem to support the same storage types; the only thing that comes to mind is that there seems to be a 'provisioning' aspect to persistent volumes.
What is the practical difference?
Are there advantages / disadvantages between the two - or for what use case would one be better suited to than the other?
Is it perhaps just 'syntactic sugar'?
For example, NFS could be mounted as a volume or as a persistent volume. Both require an NFS server, and both will have their data 'persisted' between mounts. What difference would there be in this situation?
Volume decouples the storage from the Container. Its lifecycle is coupled to a pod. It enables safe container restarts and sharing data between containers in a pod.
Persistent Volume decouples the storage from the Pod. Its lifecycle is independent. It enables safe pod restarts and sharing data between pods.
A volume exists in the context of a pod, that is, you can't create a volume on its own. A persistent volume on the other hand is a first class object with its own lifecycle, which you can either manage manually or automatically.
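As an illustration of a plain volume, the Pod below declares an emptyDir volume inside its own spec and shares it between two containers. The volume is created with the Pod and deleted with it. The Pod and container names are made up; busybox is just a small stand-in image.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: scratch-demo             # hypothetical name
spec:
  containers:
    - name: writer
      image: busybox
      # Appends a timestamp to a file on the shared volume every 5 seconds.
      command: ["sh", "-c", "while true; do date >> /scratch/log.txt; sleep 5; done"]
      volumeMounts:
        - name: scratch
          mountPath: /scratch
    - name: reader
      image: busybox
      # Makes sure the file exists, then follows it as the writer appends.
      command: ["sh", "-c", "touch /scratch/log.txt; tail -f /scratch/log.txt"]
      volumeMounts:
        - name: scratch
          mountPath: /scratch
  volumes:
    - name: scratch
      emptyDir: {}               # lives exactly as long as the Pod
```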
The way I understand it, the concept of a Persistent Volume builds on that of a Volume, and the difference is that a Persistent Volume is more decoupled from the Pods using it. Or, as expressed in the introduction of the documentation page about Persistent Volumes:
PVs are volume plugins like Volumes, but have a lifecycle independent of any individual pod that uses the PV.
A Volume's lifecycle on the other hand depends on the lifecycle of the Pod using it:
A Kubernetes volume [...] has an explicit lifetime - the same as the Pod that encloses it.
NFS is not really relevant here. Both Volumes and Persistent Volumes are Kubernetes concepts. They provide an abstraction of a data storage facility. So for using the cluster, it doesn't matter which concrete operating system resource is behind that abstraction. That's in a way the whole point of Kubernetes.
It might also be relevant here to keep in mind that Kubernetes and its API are still evolving. The Kubernetes developers might sometimes choose to introduce new concepts/resources that differ only subtly from existing ones. I presume one reason for this is to maintain backwards compatibility while still being able to fine-tune basic API concepts. Another example of this is Replication Controllers and Replica Sets, which conceptually largely overlap and are therefore redundant to some extent. What is different from the Volume/Persistent Volume matter, though, is that Replication Controllers are explicitly deprecated now.
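For the NFS case from the question, the visible difference is where the storage details live. In the sketch below, the first Pod carries the NFS server and path in its own spec, while the second only names a claim and leaves the details to whoever set up the PV. The server, path and claim name are invented for the example, and the claim is assumed to exist already.

```yaml
# NFS as a plain volume: the Pod spec knows the server and export path.
apiVersion: v1
kind: Pod
metadata:
  name: nfs-inline               # hypothetical name
spec:
  containers:
    - name: app
      image: nginx               # stand-in image
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      nfs:
        server: nfs.example.internal   # assumed NFS server address
        path: /exports/shared          # assumed export path
---
# NFS via a claim: the Pod spec only knows the claim's name.
apiVersion: v1
kind: Pod
metadata:
  name: nfs-via-claim            # hypothetical name
spec:
  containers:
    - name: app
      image: nginx               # stand-in image
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: nfs-claim     # hypothetical PVC, assumed to be bound to an NFS-backed PV
```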
Volumes ≠ Persistent Volumes
Volumes and Persistent Volumes are related, but very different!
Volumes:
- appear in Pod specifications
- do not exist as API resources (cannot do kubectl get volumes)
Persistent Volumes:
- are API resources (can do kubectl get persistentvolumes)
- correspond to concrete volumes (e.g. on a SAN, EBS, etc.)
- cannot be associated with a Pod directly (they need a Persistent Volume Claim)
They are two different implementations which can provide some similar common functionality (hence a lot of confusion).
Persistent volumes:
- Support storage provisioned via StorageClass
- Do not support the emptyDir volume type (https://github.com/kubernetes/kubernetes/issues/75378)
Volumes:
- Are bound to a pod
- Are simpler to define (fewer Kubernetes resources required)
What is the difference between persistent volume (PV) and persistent volume claim (PVC) in Kubernetes/ Openshift by referring to documentation?
What is the difference between both in simple terms?
From the docs
PVs are resources in the cluster. PVCs are requests for those resources and also act as claim checks to the resource.
So a persistent volume (PV) is the "physical" volume, on the host machine or on some external storage system, that stores your persistent data. A persistent volume claim (PVC) is a request for the platform to provide a PV for you (either an existing one or a newly created one), and you attach PVs to your pods via a PVC.
Something akin to
Pod -> PVC -> PV -> Host machine
PVC is a declaration of need for storage that can at some point become available / satisfied - as in bound to some actual PV.
It is a bit like the asynchronous programming concept of a promise. A PVC promises that it will at some point "translate" into a storage volume that your application will be able to use, with defined characteristics such as class, size, and access mode (ROX, RWO, or RWX).
This is a way to abstract thinking about a particular storage implementation away from your pods/deployments. Your application in most cases does not need to declare "give me NFS storage from server X of size Y"; it is more like "I need persistent storage of default class and size Y".
With this, deployments on different clusters can choose to differently satisfy that need. One can link an EBS device, another can provision a GlusterFS, and your core manifests are still the same in both cases.
Furthermore, you can have Volume Claim Templates defined in a StatefulSet (not in a Deployment), so that each pod gets a corresponding PVC created automatically (i.e., supporting infrastructure-agnostic storage definition for a group of scalable pods where each needs its own dedicated storage).
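A sketch of such a template follows. volumeClaimTemplates is a field of the StatefulSet resource; the names, the nginx image and the headless Service called "web" are assumptions for the example.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web                      # hypothetical name
spec:
  serviceName: web               # headless Service, assumed to exist
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx           # stand-in image
          volumeMounts:
            - name: data         # must match the claim template's name
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```

With three replicas this yields three separate claims named data-web-0, data-web-1 and data-web-2, one per pod.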
Short:
- Here you have the storage! PersistentVolume (PV)
- You get the storage if you really need it! PersistentVolumeClaim (PVC)
A PersistentVolume (PV) is a piece of storage in the cluster or in central storage, let's say 100GB.
A PersistentVolumeClaim (PVC) is a request for storage by a user, for the application to use, let's say 10GB.
In real-life terms, the PV is the whole cake and the PVC is a piece of cake (but you can have the whole cake if nobody else is there to eat it, just as an application can use the whole PV if no other application needs it). The analogy breaks down in one place: a PV binds to exactly one PVC, so a second claim cannot take the "rest" of a PV that is already bound.
Short and Simple
Persistent Volume - the available storage; let's say you have 100Gi.
Persistent Volume Claim - what you request from a Persistent Volume; if you request 10Gi you'll get it, but if you request 110Gi you won't.
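The numbers above can be tried out with a sketch like the following, assuming no dynamic provisioner serves the (made-up) "manual" class and no other PVs exist. The 10Gi claim binds to the 100Gi volume, and since binding is one-to-one it occupies that PV entirely. The 110Gi claim stays Pending because no PV is large enough.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-100                   # hypothetical name
spec:
  storageClassName: manual       # arbitrary label; assumes no StorageClass object of this name
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /mnt/pv-100            # assumed directory on the node; test clusters only
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: claim-small              # hypothetical name
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi              # 10Gi <= 100Gi, so this claim can bind to pv-100
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: claim-large              # hypothetical name
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 110Gi             # larger than any available PV, so this claim stays Pending
```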
A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned by a server/storage/cluster administrator or dynamically provisioned using Storage Classes. It is a resource in the cluster just like a node.
A PersistentVolumeClaim (PVC) is a request for storage by a user, which can be satisfied by a PV. It is similar to a Pod. Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and memory). Claims can request a specific size and access modes (e.g., they can be mounted ReadWriteOnce, ReadOnlyMany or ReadWriteMany).
A Persistent Volume Claim tells you what options you have access to in a particular cluster. Imagine you got a circular from a store called Smart Tech with some ads about your configuration options; those ads are the Persistent Volume Claims.
Inside your config file you write out the different Persistent Volume Claims that you are going to have inside your cluster, kind of like your wish list to Santa, but of course you are going to take that to the sales guy at Smart Tech when you are done.
So you write a config file that says there should be a 600GB hard drive option available to all your clusters, and a 1TB hard drive option as well.
When you choose one of these Persistent Volume Claim options, you ask Kubernetes (the sales guy) to go and get that option for you. Kubernetes has to look through the instances of storage in the stock room that are readily available. These instances of hard drives can be used right away, and they are considered statically provisioned because they were created ahead of time.
On the other hand, there are dynamically provisioned options that are created on the fly when you ask Kubernetes, the sales guy. It is kind of like just-in-time production: the storage gets created at the moment you ask for it.
So the Persistent Volume Claim is the store's advertisement of options, and whichever one you choose, Kubernetes will go and get it, either one already in storage or one created on the fly.
The Persistent Volume is the actual product that you get back from Kubernetes after asking for it. If Kubernetes does not have what you asked for, it will try to create it on the fly for you.
So the PVC is what Smart Tech advertises it has to offer to your cluster, which Kubernetes the sales guy will get for you, and the PV is the actual finished product delivered to you.
PersistentVolume (PV) and PersistentVolumeClaim (PVC) are API resources provided by Kubernetes.
A PV is a piece of storage which is supposed to be preallocated by an admin. A PVC is a request for a piece of storage by a user.
Persistent Volume — low level representation of a storage volume.
Persistent Volume Claim — binding between a Pod and Persistent Volume.
Storage Class — allows for dynamic provisioning of Persistent Volumes.
You can find some commonalities when comparing PVs and PVCs with nodes and pods.
A PV is like a node: it provides the resource, which in this case is storage.
A PVC is like a pod that requests resources (memory, CPU) and gets them if the node has the resources to allocate; in this case the resource is storage.