Read-only folder to be shared to another pod - kubernetes

I have a pod that needs to create a lot of jobs.
I'd like to share a read-only folder.
How can I do it?
Several ideas I can think of (I'm new to Kubernetes):
Ephemeral volumes seem like a good choice, but I've read that they cannot be shared with another pod.
I think NFS is overkill for my needs.
Maybe I could build a data-only Docker image, but this is a deprecated feature of Docker.
Use kubectl cp to copy the data from the base pod to the pod in the job.
What would be the best solution for this?

You can use a PersistentVolume and mount it as a read-only volume inside the pod via a PersistentVolumeClaim. To mount a volume read-only, set .spec.containers[*].volumeMounts[*].readOnly to true.
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
    - name: myfrontend
      image: nginx
      volumeMounts:
        - mountPath: "/var/www/html"
          name: mypd
          readOnly: true
  volumes:
    - name: mypd
      persistentVolumeClaim:
        claimName: myclaim
Check out these links:
https://kubernetes.io/docs/concepts/storage/persistent-volumes/#persistentvolumeclaims
https://kubernetes.io/docs/concepts/storage/persistent-volumes/#persistent-volumes
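Since the question is about Jobs, the same read-only mount can also go into a Job's Pod template. This is only a minimal sketch: it assumes the claim myclaim from the example above already exists and is backed by storage that can be read from every node the Jobs may land on. The Job name, image, command and mount path are placeholders.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: read-shared-data        # hypothetical name
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: worker
          image: busybox        # placeholder image
          # Placeholder command: just list the shared folder
          command: ["sh", "-c", "ls -l /data"]
          volumeMounts:
            - name: shared
              mountPath: /data
              readOnly: true    # container sees the folder read-only
      volumes:
        - name: shared
          persistentVolumeClaim:
            claimName: myclaim  # claim from the example above
            readOnly: true      # volume itself is attached read-only
```

Every Job created from this template mounts the same claim, so the data only has to be written once.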

Related

How can I mount the same persistent volume on multiple pods?

I have a three node GCE cluster and a single-pod GKE deployment with three replicas. I created the PV and PVC like so:
# Create a persistent volume for web content
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nginx-content
  labels:
    type: local
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadOnlyMany
  hostPath:
    path: "/usr/share/nginx/html"
---
# Request a persistent volume for web content
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: nginx-content-claim
  annotations:
    volume.alpha.kubernetes.io/storage-class: default
spec:
  accessModes: [ReadOnlyMany]
  resources:
    requests:
      storage: 5Gi
They are referenced in the container spec like so:
spec:
  containers:
    - image: launcher.gcr.io/google/nginx1
      name: nginx-container
      volumeMounts:
        - name: nginx-content
          mountPath: /usr/share/nginx/html
      ports:
        - containerPort: 80
  volumes:
    - name: nginx-content
      persistentVolumeClaim:
        claimName: nginx-content-claim
Even though I created the volumes as ReadOnlyMany, only one pod can mount the volume at any given time. The rest give "Error 400: RESOURCE_IN_USE_BY_ANOTHER_RESOURCE". How can I make it so all three replicas read the same web content from the same volume?
First, I'd like to point out one fundamental discrepancy in your configuration. Note that when you use your PersistentVolumeClaim as defined in your example, you don't use your nginx-content PersistentVolume at all. You can easily verify this by running:
kubectl get pv
on your GKE cluster. You'll notice that apart from your manually created nginx-content PV, there is another one, which was automatically provisioned based on the PVC that you applied.
Note that in your PersistentVolumeClaim definition you're explicitly referring to the default storage class, which has nothing to do with your manually created PV. Actually, even if you completely omit the annotation:
  annotations:
    volume.alpha.kubernetes.io/storage-class: default
it will work exactly the same way, namely the default storage class will be used anyway. Using the default storage class on GKE means that GCE Persistent Disk will be used as your volume provisioner. You can read more about it here:
Volume implementations such as gcePersistentDisk are configured
through StorageClass resources. GKE creates a default StorageClass for
you which uses the standard persistent disk type (ext4). The default
StorageClass is used when a PersistentVolumeClaim doesn't specify a
StorageClassName. You can replace the provided default StorageClass
with your own.
But let's move on to the solution of the problem you're facing.
Solution:
First, I'd like to emphasize that you don't have to use any NFS-like filesystem to achieve your goal.
If you need your PersistentVolume to be available in ReadOnlyMany mode, GCE Persistent Disk is a perfect solution that entirely meets your requirements.
It can be mounted in read-only (ro) mode by many Pods at the same time and, what is even more important, by many Pods scheduled on different GKE nodes. Furthermore, it's really simple to configure and it works on GKE out of the box.
If you want to use your storage in ReadWriteMany mode, I agree that something like NFS may be the only solution, as GCE Persistent Disk doesn't provide that capability.
Let's take a closer look at how we can configure it.
We need to start by defining our PVC. You had actually already done this step, but you got a bit lost in the further steps. Let me explain how it works.
The following configuration is correct (as I mentioned, the annotations section can be omitted):
# Request a persistent volume for web content
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: nginx-content-claim
spec:
  accessModes: [ReadOnlyMany]
  resources:
    requests:
      storage: 5Gi
However I'd like to add one important comment to this. You said:
Even though I created the volumes as ReadOnlyMany, only one pod can
mount the volume at any given time.
Well, actually you didn't. I know it may seem a bit tricky and somewhat surprising, but this is not how defining accessModes really works. In fact, it's a widely misunderstood concept. First of all, you cannot define access modes in a PVC in the sense of putting constraints there. Supported access modes are an inherent feature of a particular storage type. They are already defined by the storage provider.
What you actually do in a PVC definition is request a PV that supports a particular access mode or modes. Note that accessModes is a list, which means you may name several access modes that you want your PV to support.
Basically it's like saying: "Hey! Storage provider! Give me a volume that supports ReadOnlyMany mode." This way you're asking for storage that will satisfy your requirements. Keep in mind, however, that you can be given more than you ask for. And this is exactly our scenario when asking for a PV that supports ReadOnlyMany mode in GCP. It creates a PersistentVolume that meets the requirements we listed in the accessModes section, but it also supports ReadWriteOnce mode. Although we didn't ask for something that also supports ReadWriteOnce, you will probably agree that storage with built-in support for both modes fully satisfies our request for something that supports ReadOnlyMany. So basically this is the way it works.
The PV that was automatically provisioned by GCP in response to your PVC supports both of those access modes, and if you don't explicitly specify in the Pod or Deployment definition that you want to mount it in read-only mode, it is mounted in read-write mode by default.
You can easily verify it by attaching to the Pod that was able to successfully mount the PersistentVolume:
kubectl exec -ti pod-name -- /bin/bash
and trying to write something on the mounted filesystem.
The error message you get:
"Error 400: RESOURCE_IN_USE_BY_ANOTHER_RESOURCE"
concerns specifically the GCE Persistent Disk that is already mounted by one GKE node in ReadWriteOnce mode, so it cannot be mounted by the other nodes on which the rest of your Pods were scheduled.
If you want it to be mounted in ReadOnlyMany mode, you need to specify that explicitly in your Deployment definition by adding readOnly: true under persistentVolumeClaim in the volumes section of the Pod template specification, like below:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.14.2
          ports:
            - containerPort: 80
          volumeMounts:
            - mountPath: "/usr/share/nginx/html"
              name: nginx-content
      volumes:
        - name: nginx-content
          persistentVolumeClaim:
            claimName: nginx-content-claim
            readOnly: true
Keep in mind, however, that to be able to mount it in readOnly mode, we first need to pre-populate the volume with data. Otherwise you'll see another error message, saying that an unformatted volume cannot be mounted in read-only mode.
The easiest way to do it is to create a single Pod that serves only to copy data, already uploaded to one of our GKE nodes, to our destination PV.
Note that pre-populating a PersistentVolume with data can be done in many different ways. You can mount in such a Pod only the PersistentVolume that you will be using in your Deployment, and fetch your data with curl or wget from some external location, saving it directly on your destination PV. It's up to you.
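As a rough sketch of the curl/wget variant just mentioned: a one-off Pod that mounts only the destination claim (read-write) and downloads the content into it. The Pod name and the URL are placeholders and not part of the original setup; busybox is assumed here only because it ships a wget applet.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: populate-content            # hypothetical name
spec:
  restartPolicy: Never              # run once, do not restart after completion
  containers:
    - name: fetch
      image: busybox
      # Placeholder URL: replace it with the real location of your content
      command: ["sh", "-c", "wget -O /mnt/destination/index.html http://example.com/index.html"]
      volumeMounts:
        - name: nginx-content
          mountPath: /mnt/destination
  volumes:
    - name: nginx-content
      persistentVolumeClaim:
        claimName: nginx-content-claim   # same claim the Deployment uses later
```

Once the Pod has completed, delete it before applying the Deployment, so that the disk is no longer attached in read-write mode.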
In my example I'm showing how to do it using an additional local volume, which allows us to mount into our Pod a directory, partition or disk available on one of our Kubernetes nodes (in my example I use the directory /var/tmp/test located on one of my GKE nodes). It's a much more flexible solution than hostPath, as we don't have to care about scheduling such a Pod onto the particular node that contains the data. The specific node affinity rule is already defined in the PersistentVolume, and the Pod is automatically scheduled on that node.
To create it we need 3 things:
StorageClass:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
PersistentVolume definition:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: local-storage
  local:
    path: /var/tmp/test
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - <gke-node-name>
and finally PersistentVolumeClaim:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myclaim
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 10Gi
  storageClassName: local-storage
Then we can create our temporary Pod which will serve only for copying data from our GKE node to our GCE Persistent Disk.
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
    - name: myfrontend
      image: nginx
      volumeMounts:
        - mountPath: "/mnt/source"
          name: mypd
        - mountPath: "/mnt/destination"
          name: nginx-content
  volumes:
    - name: mypd
      persistentVolumeClaim:
        claimName: myclaim
    - name: nginx-content
      persistentVolumeClaim:
        claimName: nginx-content-claim
Paths you can see above are not really important. The task of this Pod is only to allow us to copy our data to the destination PV. Eventually our PV will be mounted in completely different path.
Once the Pod is created and both volumes are successfully mounted, we can attach to it by running:
kubectl exec -ti mypod -- /bin/bash
Within the Pod simply run:
cp /mnt/source/* /mnt/destination/
That's all. Now we can exit and delete our temporary Pod:
kubectl delete pod mypod
Once it is gone, we can apply our Deployment and our PersistentVolume finally can be mounted in readOnly mode by all the Pods located on various GKE nodes:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.14.2
          ports:
            - containerPort: 80
          volumeMounts:
            - mountPath: "/usr/share/nginx/html"
              name: nginx-content
      volumes:
        - name: nginx-content
          persistentVolumeClaim:
            claimName: nginx-content-claim
            readOnly: true
Btw. if you are ok with the fact that your Pods will be scheduled only on one particular node, you can give up on using GCE Persistent Disk at all and switch to the above mentioned local volume. This way all your Pods will be able not only to read from it but also to write to it at the same time. The only caveat is that all those Pods will be running on a single node.
You can achieve this with an NFS-like file system. On Google Cloud, Filestore is the right product for this (managed NFS). You have a tutorial here for achieving your configuration
You will need to use a shared volume claim of the ReadWriteMany (RWX) type if you want to share the volume across different nodes and provide a highly scalable solution, for example by using an NFS server.
You can find out how to deploy an NFS server here:
https://www.shebanglabs.io/run-nfs-server-on-ubuntu-20-04/
And then you can mount volumes (directories from NFS server) as follows:
https://www.shebanglabs.io/how-to-set-up-read-write-many-rwx-persistent-volumes-with-nfs-on-kubernetes/
I've used this approach to deliver shared static content between 8+ k8s deployments (200+ pods) serving 1 billion requests a month over Nginx, and it worked perfectly with that NFS setup :)
Google provides an NFS-like filesystem called Google Cloud Filestore. You can mount it on multiple pods.
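To make the NFS / Filestore suggestion concrete, here is a minimal sketch of a statically defined PV and a matching PVC. The object names, server address, export path and size are all placeholders (assumptions, not values from the answers above); for Filestore you would put the instance's IP address and file share name there.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-content            # hypothetical name
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  storageClassName: ""         # static PV, not tied to any storage class
  nfs:
    server: nfs.example.com    # placeholder: your NFS server or Filestore IP
    path: /share1              # placeholder: exported path / file share
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-content-claim      # hypothetical name
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""         # empty string keeps the default class from provisioning a new disk
  volumeName: nfs-content      # bind to the PV defined above
  resources:
    requests:
      storage: 5Gi
```

Pods on any node can then reference nfs-content-claim, and can add readOnly: true on the mount if they only need to read.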

Share filesystem across containers in a pod

Is there a way to share the filesystem of two containers in a multi-container pod without using shared volumes?
I have the following pod manifest:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: pod
  name: pod
spec:
  replicas: 1
  selector:
    matchLabels:
      app: pod
  template:
    metadata:
      labels:
        app: pod
    spec:
      containers:
        - image: nginx:latest
          name: nginx
        - image: jenkins
          name: jenkins
I want to access /var/jenkins_home path which is available in jenkins container from nginx container.
This is just for experimental purposes, I am trying to learn ways to share filesystem/things in general across containers in a pod.
You can't share files between containers without some sort of shared volume.
Part of the goal of a containerized system is that the container filesystems are isolated from each other. There are a huge number of practical problems with sharing files specifically (what if the containers are on different nodes? what if you have three replicas each of Jenkins and Nginx? what if they're all trying to write the same files?) and in general it's better to just avoid sharing files altogether if that's a possibility.
In the specific example you've shown, the lifecycle of a Jenkins CI system and an Nginx server will just be fundamentally different; whenever Jenkins builds something you don't want to restart it to also restart the Web server, and you could very easily want to scale up the Web tier without adding Jenkins workers. A better approach here would be to have Jenkins generate custom Docker images, push them to a registry, and then use the Kubernetes API to create a separate Nginx Deployment.
In most cases (especially because of the scaling considerations) you should avoid multi-container pods altogether.
(A more specific example of a case where this setup does make sense is if you're storing credentials somewhere like a Hashicorp Vault server. You would need an init container to connect to Vault, retrieve the credentials, and deposit them in an emptyDir volume, and then the main container can start up having gotten those credentials. As far as the main server container is concerned it's the only important part of this pod, and logically the pod is nothing more than the server container with some auxiliary stuff.)
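The init-container pattern described in the parenthesis could look roughly like the sketch below. It does not talk to a real Vault server: the echo command is only a stand-in for whatever client call would fetch the credentials, and all names and paths are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: server-with-credentials     # hypothetical name
spec:
  volumes:
    - name: credentials
      emptyDir:
        medium: Memory              # tmpfs-backed, so the secret is not written to the node's disk
  initContainers:
    - name: fetch-credentials
      image: busybox
      # Stand-in for a real Vault client call; writes a dummy token
      command: ["sh", "-c", "echo placeholder-token > /credentials/token"]
      volumeMounts:
        - name: credentials
          mountPath: /credentials
  containers:
    - name: server
      image: nginx                  # placeholder for the main server image
      volumeMounts:
        - name: credentials
          mountPath: /etc/credentials
          readOnly: true            # the server only needs to read the token
```

The main container starts only after the init container has exited successfully, so the file is already in place when the server comes up.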
The sample below shows how to share a volume between containers:
apiVersion: v1
kind: Pod
metadata:
  name: two-containers
spec:
  restartPolicy: Never
  volumes:
    - name: shared-data
      emptyDir: {}
  containers:
    - name: nginx-container
      image: nginx
      volumeMounts:
        - name: shared-data
          mountPath: /usr/share/nginx/html
    - name: debian-container
      image: debian
      volumeMounts:
        - name: shared-data
          mountPath: /pod-data
      command: ["/bin/sh"]
      args: ["-c", "echo Hello from the debian container > /pod-data/index.html"]

Is it possible to mount a PV directly without PVC?

So far I was convinced that one needs a PVC to access a PV, like in this example from the k8s docs:
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
    - name: myfrontend
      image: nginx
      volumeMounts:
        - mountPath: "/var/www/html"
          name: mypd
  volumes:
    - name: mypd
      persistentVolumeClaim:
        claimName: myclaim
But then I saw in Docker doc that one can use the following syntax (example using nfs):
kind: Pod
apiVersion: v1
metadata:
  name: nfs-in-a-pod
spec:
  containers:
    - name: app
      image: alpine
      volumeMounts:
        - name: nfs-volume
          mountPath: /var/nfs # Please change this to the destination you would like the share to be mounted to
      command: ["/bin/sh"]
      args: ["-c", "sleep 500000"]
  volumes:
    - name: nfs-volume
      nfs:
        server: nfs.example.com # Please change this to your NFS server
        path: /share1 # Please change this to the relevant share
I am confused:
Is this syntax creating a PVC under the hood?
Or is any PV matching the spec mounted without a PVC?
Or perhaps the spec selects an existing PVC?
The various kinds of things you can mount are part of the Volume object in the Kubernetes API (which is part of a PodSpec, which is part of a Pod). None of these are an option to mount a specific PersistentVolume by name.
(There are some special cases you can see there for things like NFS and various clustered storage systems. Those mostly predate persistent volumes.)
The best you can do here is to create a PVC that's very tightly bound to a single persistent volume, and then reference that in the pod spec.
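A PVC that is tightly bound to one PV can be sketched as follows. It assumes a PV named my-pv (a hypothetical name) already exists, has no storage class, supports ReadWriteOnce and has at least the requested capacity.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myclaim
spec:
  storageClassName: ""    # empty string: do not fall back to the default class
  volumeName: my-pv       # hypothetical name of the existing PV to bind to
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi        # must not exceed the PV's capacity
```

The pod then references claimName: myclaim exactly as in the first example of the question.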
An emptyDir volume is first created when a Pod is assigned to a Node, and exists as long as that Pod is running on that node.
You don't need a PV and PVC for an emptyDir volume.
Note that when a Pod is removed from a node for any reason, the data in the emptyDir is deleted forever.
If you want to retain the data even if the pod crashes, restarts, is deleted or is undeployed, then you need to use a PV and PVC.
Look at another example below, using hostPath, where you don't need a PV and PVC:
apiVersion: v1
kind: Pod
metadata:
  name: test-pd
spec:
  containers:
    - image: k8s.gcr.io/test-webserver
      name: test-container
      volumeMounts:
        - mountPath: /test-pd
          name: test-volume
  volumes:
    - name: test-volume
      hostPath:
        # directory location on host
        path: /data
        # this field is optional
        type: Directory
If you need to store the data on external storage solutions like NFS, Azure File storage, AWS EBS, Google Persistent Disk etc., then you need to create a PV and PVC.
Mounting a PV directly to a pod is not allowed and is against the Kubernetes design principles. It would cause tight coupling between the pod volume and the underlying storage.
A PVC enables loose coupling between the pod and the persistent volume. The pod doesn't know what underlying storage is used to store the container data, and it is not necessary for the pod to know that.
PV and PVC are required for static and dynamic provisioning of storage volumes for workloads in a Kubernetes cluster.

Local Persistent Volume in its own directory

I have got the local persistent volumes to work, using local directories as mount points, storage class, PVC etc, all using standard documentation.
However, when I use this PVC in a Pod, all the files are getting created in the base of the mount point, i.e if /data is my mount point, all my application files are stored in the /data folder. I see this creating conflicts in the future, with more than one application writing to the same folder.
Looking for any suggestions or advice to make each PVC or even application files of a Pod into separate directories in the PV.
If you store your data in different directories on your volume, you can use subPath to separate your data into different directories using multiple mount points.
E.g.
apiVersion: v1
kind: Pod
metadata:
  name: podname
spec:
  containers:
    - name: containername
      image: imagename
      volumeMounts:
        - mountPath: /path/to/mount/point
          name: volumename
          subPath: volume_subpath
        - mountPath: /path/to/mount/point2
          name: volumename
          subPath: volume_subpath2
  volumes:
    - name: volumename
      persistentVolumeClaim:
        claimName: pvcname
Another approach is using subPathExpr.
Note:
The subPath and subPathExpr properties are mutually exclusive
apiVersion: v1
kind: Pod
metadata:
  name: pod3
spec:
  containers:
    - name: pod3
      env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
      image: busybox
      command: [ "sh", "-c", "while [ true ]; do echo 'Hello'; sleep 10; done | tee -a /logs/hello.txt" ]
      volumeMounts:
        - name: workdir1
          mountPath: /logs
          subPathExpr: $(POD_NAME)
  restartPolicy: Never
  volumes:
    - name: workdir1
      persistentVolumeClaim:
        claimName: pvc1
As described here.
In addition, please read "Fixing the Subpath Volume Vulnerability in Kubernetes".
You can also simply change the mount path and give each application its own mount path, so that the files of each Pod end up in separate directories.

Creating Google persistent disks from snapshots in Kubernetes

I need to run pods on multiple nodes with a very large (700GB) read-only dataset in Kubernetes. I tried using ReadOnlyMany, but it fails in a multi-node setup, and in general was very unstable.
Is there a way for pods to create a new persistent disk from a snapshot, attach it to the pod, and destroy it when pod is destroyed? This would allow me to update snapshots once in a while with the new data.
You can manually provision a persistent disk from an existing snapshot on GCP:
gcloud beta compute disks create --size=500GB --source-snapshot=<snapshot-name> my-data-disk
Then use it on your pod:
apiVersion: v1
kind: Pod
metadata:
  name: test-pd
spec:
  containers:
    - image: k8s.gcr.io/test-webserver
      name: test-container
      volumeMounts:
        - mountPath: /test-pd
          name: test-volume
  volumes:
    - name: test-volume
      # This GCE PD must already exist.
      gcePersistentDisk:
        pdName: my-data-disk
        fsType: ext4
The GCE storage class doesn't support snapshots so unfortunately, you can't do it with PVCs. More info here
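Since the dataset is read-only and the pods run on several nodes, the disk most likely has to be attached read-only as well. A hedged variation of the Pod above, assuming my-data-disk already exists and is not attached read-write anywhere; the Pod name is hypothetical. As far as I know, the in-tree gcePersistentDisk volume type is deprecated in newer Kubernetes releases in favor of the CSI driver, so treat this as a sketch for clusters that still support it.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-pd-readonly            # hypothetical name
spec:
  containers:
    - image: k8s.gcr.io/test-webserver
      name: test-container
      volumeMounts:
        - mountPath: /test-pd
          name: test-volume
          readOnly: true            # container-level read-only mount
  volumes:
    - name: test-volume
      # This GCE PD must already exist.
      gcePersistentDisk:
        pdName: my-data-disk
        fsType: ext4
        readOnly: true              # attach the disk read-only so that several nodes can share it
```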
Hope it helps.