ReadWriteMany PVC in Kubernetes with external/local access

I need some advice on my problem, as I cannot find a suitable solution.
I have a k8s cluster with several nodes and Directus deployed in several pods. My developers want to extend Directus with custom extensions. This is done by uploading the source files into the /extensions folder.
This means every pod needs to share the /extensions folder to access the files. So my thought was to use a shared PVC.
Basically I can set up an NFS PVC with RWX access to be shared between pods and mounted as /extensions.
BUT: How can I deploy the source code and folder structure onto this PVC? I would need to access the filesystem externally, either via a local mount OR via GitHub Actions, to deploy code changes. But NFS does not support any auth method, so I would open the gates of hell if I exposed the NFS port outside the private network.
Is there any other RWX PVC storage solution that could also be used with at least local access options?
I could create the PVC, access it via kubectl, build the folder structure as Directus needs it, and push code via kubectl cp. But this seems like a mess in production.

In the meantime I proceeded with the following stack:
An NFS pod mounts a block-storage PV (RWO) and provides an NFS PVC to the cluster
Directus mounts the NFS PVC at /directus/extensions
Filebrowser mounts the NFS PVC at /srv
So basically I used the filebrowser/filebrowser (GitHub) container to serve the NFS PVC content (= the Directus extensions folder) to developers over an HTTPS interface. This enables them to upload new files manually, directly onto the NFS mount, where they are picked up by the app.
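For reference, roughly what that stack looks like as manifests. This is a sketch only: the claim name, sizes, replica counts, images and the "nfs" storage class are assumptions, not my actual deployment.
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: directus-extensions
spec:
  accessModes: ["ReadWriteMany"]
  storageClassName: nfs              # served by the in-cluster NFS pod
  resources:
    requests:
      storage: 5Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: directus
spec:
  replicas: 3
  selector:
    matchLabels: {app: directus}
  template:
    metadata:
      labels: {app: directus}
    spec:
      containers:
        - name: directus
          image: directus/directus
          volumeMounts:
            - name: extensions
              mountPath: /directus/extensions
      volumes:
        - name: extensions
          persistentVolumeClaim:
            claimName: directus-extensions
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: filebrowser
spec:
  replicas: 1
  selector:
    matchLabels: {app: filebrowser}
  template:
    metadata:
      labels: {app: filebrowser}
    spec:
      containers:
        - name: filebrowser
          image: filebrowser/filebrowser
          volumeMounts:
            - name: extensions
              mountPath: /srv          # same claim, exposed to developers over HTTPS
      volumes:
        - name: extensions
          persistentVolumeClaim:
            claimName: directus-extensions
```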
This seems appropriate for the development phase, but I doubt it could work in production. Reasons for this:
No integration into CI/CD is possible
Restarting the Filebrowser container requires manual interaction to secure the pod, as the project doesn't provide an .env config for a proper k8s deployment
So I am still looking for solutions to push code changes onto a Kubernetes NFS mount. Any dockerized service in mind?!

Related

How to pre-populate EFS with required files for starting containers in my pods in EKS

I am moving my stateful Flask application from a single-node server to EKS. I have decided to go with AWS EFS as my persistent volume option. The application needs a few files (configs, trained ML models, etc.) to start, so my questions are:
Can I pre-populate EFS with the required files before I apply the K8s deployment YAML with the PV and PVC?
If yes, how do I place the files exactly at the mountPath where my container can access them? This is a concern because I haven't applied the deployment YAML yet.
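One pattern that is often used here, sketched below with assumed names rather than a verified EFS setup: because the PV/PVC exist independently of any Deployment, you can create them first and run a one-off Job that mounts the claim and copies the files onto it. The files then live on the volume itself, so your Deployment can later mount the same claim at whatever mountPath it likes and will see them there.
```yaml
# Hypothetical seeding Job for an EFS-backed PVC; "efs-claim" and the created
# paths are placeholders for illustration only.
apiVersion: batch/v1
kind: Job
metadata:
  name: seed-efs
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: seed
          image: busybox
          # In practice you would pull configs/models from S3, git, or a
          # container image that already bundles them.
          command: ["sh", "-c", "mkdir -p /data/config /data/models && touch /data/.seeded"]
          volumeMounts:
            - name: efs
              mountPath: /data
      volumes:
        - name: efs
          persistentVolumeClaim:
            claimName: efs-claim   # the same claim your Deployment will mount
```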

Helm + Kubernetes upload large file ~30-80 MB to cluster and mount it to pods

I have a Helm + Kubernetes setup. I need to store a large file (~30-80 MB) in the cluster and mount it to pods. How do I achieve this so that I don't have to manually upload the file to every environment?
You can share common files using NFS. There are many ways to use NFS with K8s, such as this one. If your cluster is managed by a cloud provider such as AWS, you can consider EFS, which is NFS-compatible. NFS-compatible solutions on cloud platforms are very common today. This way you never need to manually upload files to worker nodes. Your Helm chart then only needs to create the necessary PersistentVolumeClaim/PersistentVolume and the volume mounts to access the shared files.
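As a sketch of what such a chart would template (the server IP, export path and sizes are placeholders, not values from the question):
```yaml
# Statically provisioned NFS PV plus a claim that binds to it.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: shared-files-pv
spec:
  capacity:
    storage: 1Gi
  accessModes: ["ReadWriteMany"]
  nfs:
    server: 10.0.0.10          # NFS server or EFS mount target (placeholder)
    path: /exports/shared
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-files
spec:
  accessModes: ["ReadWriteMany"]
  storageClassName: ""         # bind to the static PV above, skip dynamic provisioning
  volumeName: shared-files-pv
  resources:
    requests:
      storage: 1Gi
```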
One way to do this is to use a Helm install+upgrade hook AND an init container.
Set a Helm hook to create a Kubernetes Job that downloads the file to the mounted volume.
The init container on the pod waits indefinitely until the download is complete.
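A rough sketch of that combination, shown here as a post-install/post-upgrade hook so the PVC already exists when the Job runs; the claim name, download URL and marker file are all assumed for illustration.
```yaml
# Hook Job that downloads the large file onto the shared PVC.
apiVersion: batch/v1
kind: Job
metadata:
  name: fetch-large-file
  annotations:
    "helm.sh/hook": post-install,post-upgrade
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: fetch
          image: curlimages/curl
          command: ["sh", "-c", "curl -fSL -o /data/model.bin https://example.com/model.bin && touch /data/.done"]
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: shared-files
---
# Relevant fragment of the application pod: an init container waits for the
# marker file before the main container starts.
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  initContainers:
    - name: wait-for-file
      image: busybox
      command: ["sh", "-c", "until [ -f /data/.done ]; do sleep 5; done"]
      volumeMounts:
        - name: data
          mountPath: /data
  containers:
    - name: app
      image: my-app            # placeholder image
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: shared-files
```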

Why should I use Kubernetes Persistent Volumes instead of Volumes

To use storage inside Kubernetes Pods I can use volumes and persistent volumes. While volumes like emptyDir are ephemeral, I could use hostPath and many other cloud-based volume plugins, which would provide a persistent solution within volumes themselves.
In that case, why should I be using PersistentVolumes at all?
It is very important to understand the main differences between Volumes and PersistentVolumes. Both are Kubernetes resources which provide an abstraction of a data storage facility.
Volumes: let your pod write to a filesystem that exists as long as the pod exists. They also let you share data between containers in the same pod, but the data in such a volume is destroyed when the pod is removed. A volume decouples the storage from the container, but its lifecycle is coupled to the pod.
PersistentVolumes: serve as long-term storage in your Kubernetes cluster. They exist beyond containers, pods, and nodes. A pod uses a PersistentVolumeClaim to get read and write access to the PersistentVolume. A PersistentVolume decouples the storage from the pod; its lifecycle is independent. It enables safe pod restarts and sharing data between pods.
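To make that difference concrete, a minimal sketch (names and images are illustrative): the first pod's emptyDir data disappears when the pod is deleted, while the second pod's data lives in whatever storage backs the claim and survives pod restarts.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: scratch-pod
spec:
  containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]
      volumeMounts:
        - name: scratch
          mountPath: /scratch
  volumes:
    - name: scratch
      emptyDir: {}               # gone once the pod is deleted
---
apiVersion: v1
kind: Pod
metadata:
  name: durable-pod
spec:
  containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: my-claim      # data outlives this pod
```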
When it comes to hostPath:
A hostPath volume mounts a file or directory from the host node's filesystem into your Pod.
hostPath has its usage scenarios, but in general it is not recommended, for several reasons:
Pods with identical configuration (such as those created from a PodTemplate) may behave differently on different nodes because of different files on those nodes
The files or directories created on the underlying hosts are only writable by root. You either need to run your process as root in a privileged container or modify the file permissions on the host to be able to write to a hostPath volume
You don't always directly control which node your pods will run on, so you're not guaranteed that a pod will actually be scheduled on the node that has the data volume
If a node goes down, the pod needs to be scheduled on another node, where your locally provisioned volume will not be available
hostPath would be a good fit if, for example, you would like to use it for a log collector running in a DaemonSet.
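A sketch of that DaemonSet case (image and paths are illustrative):
```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector
spec:
  selector:
    matchLabels: {app: log-collector}
  template:
    metadata:
      labels: {app: log-collector}
    spec:
      containers:
        - name: collector
          image: fluent/fluent-bit
          volumeMounts:
            - name: varlog
              mountPath: /var/log
              readOnly: true
      volumes:
        - name: varlog
          hostPath:
            path: /var/log       # node-local logs; one collector per node
            type: Directory
```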
I recommend the Kubernetes Volumes Guide as a nice supplement to this topic.
PersistentVolumes are cluster-wide storage and allow you to manage storage more centrally.
When you configure a volume (either using hostPath or any of the cloud-based volume plugins), you have to do this configuration within the Pod definition file. Every piece of configuration information required to set up storage for the volume goes into the Pod definition file.
When you have a large environment with a lot of users and a large number of Pods, users have to configure storage every time for each Pod they deploy. Whatever storage solution is used, the user who deploys the Pod has to configure that storage in all of their Pod definition files. If a change needs to be made, the user has to make it in all of their Pods. Beyond a certain scale, this is not the most optimal way to manage storage.
Instead, you would like to manage this centrally: an administrator creates a large pool of storage, and users carve out parts of it as required. This is exactly what you can do using PersistentVolumes and PersistentVolumeClaims.
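A minimal sketch of that split, with all names and the NFS backend chosen only for illustration: the administrator publishes the PersistentVolume, the user writes a small claim, and the Pod spec references nothing but the claim name.
```yaml
# Administrator side: a volume from the storage pool.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pool-vol-01
spec:
  capacity:
    storage: 100Gi
  accessModes: ["ReadWriteOnce"]
  nfs:
    server: 10.0.0.10
    path: /exports/pool/vol01
---
# User side: carve out what the application needs.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 20Gi
---
# Pod definition: no storage details, only the claim name.
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: app-data
```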
Use PersistentVolumes when you need to set up a database like MongoDB, Redis, Postgres or MySQL, because they provide long-term storage that is not deeply coupled to your pods. Perfect for database applications, because the data will not die with the pods.
Avoid plain Volumes when you need long-term storage, because they die with the pods!
In my case, when I have to store something, I will always go for persistent volumes!

Mounting a shared folder to all StatefulSet replicas in k8s

Context:
We have an Apache NiFi cluster deployed in Kubernetes as a StatefulSet, and a volumeClaimTemplate is used for the NiFi repositories.
NiFi Helm charts we are using
There is a use case where file processing is done by NiFi. The file feeds are put into a shared folder and NiFi reads them from that shared folder. When multiple NiFi nodes are present, all of them read from the shared folder.
In a non-Kubernetes environment we use an NFS file share.
In AWS we use AWS S3 for storage, and NiFi has processors to read from S3.
Problem:
NiFi is already deployed as a StatefulSet and uses a volumeClaimTemplate for the storage repository. How can we mount this NFS share for the file feed into all NiFi replicas?
Or, to put the question in a generic manner:
How can we mount a single NFS shared folder into all StatefulSet replicas?
Solutions tried
We tried linking separately claimed PVC folders to the NFS share, but that looks like a workaround.
Can somebody please help? Any hints would be highly appreciated.
Put it in the pod template like normal. NFS is a "ReadWriteMany" volume type, so you can create one PVC and then use it on every pod simultaneously. You can also configure NFS volumes directly in the pod spec, but using a PVC is probably better.
It sounds like what you have is correct :)
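For illustration, a fragment of what that looks like in a StatefulSet (names, paths and sizes are assumptions, not taken from the NiFi chart): the per-replica repositories stay in volumeClaimTemplates, while the shared feed folder is an ordinary ReadWriteMany PVC added to the pod template and therefore mounted by every replica.
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nifi
spec:
  serviceName: nifi
  replicas: 3
  selector:
    matchLabels: {app: nifi}
  template:
    metadata:
      labels: {app: nifi}
    spec:
      containers:
        - name: nifi
          image: apache/nifi
          volumeMounts:
            - name: data               # per-replica repository volume
              mountPath: /opt/nifi/data
            - name: file-feed          # the same NFS share in every replica
              mountPath: /opt/nifi/file-feed
      volumes:
        - name: file-feed
          persistentVolumeClaim:
            claimName: nifi-file-feed  # one ReadWriteMany claim, shared
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```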

Kubernetes: strategy for out-of-cluster persistent storage

I need a piece of advice / recommendation / link to tutorial.
I am designing a Kubernetes cluster, and one of the projects is a WordPress site with lots of pictures (a photo blog).
I want to be able to tear down and re-create my cluster within an hour, so all "persistent" pieces need to be hosted outside of the cluster (say, on a separate Linux instance).
It is doable with the database - I will just have a MySQL server running on that machine and will update the WP configs accordingly.
It is not trivial with filesystem storage. I am looking at Kubernetes volume providers, specifically NFS. I want to set up an NFS server on a separate machine and have each WP pod use that NFS share through the volume mechanism. In that case, I can rebuild my cluster at any time and the data will be preserved. Almost like database access, but for the filesystem.
The questions are the following. Does this solution seem feasible? Is there any better way to achieve the same goal? Does the Kubernetes NFS plugin support the functionality I need? What about authorization?
So, I am using a very similar strategy for my cluster: all my PVCs are placed on a standalone VM instance with a static IP, which has an NFS server running, and a simple nfs-client-provisioner Helm chart runs on my cluster.
So here is what I did:
Created a server (Ubuntu) and installed the NFS server on it. Reference here
Installed a Helm chart/app for the nfs-client-provisioner with these parameters:
nfs.path = /srv (the path on the server which is allocated to NFS and shared)
nfs.server = xx.yy.zz.ww (IP of my NFS server configured above)
And that's it: the chart creates an nfs-client storage class which you can use to create a PVC and attach it to your pods.
Note - Make sure to configure the /etc/exports file on the NFS server to look like this, as mentioned in the referenced DigitalOcean document:
/srv kubernetes_node_1_IP(rw,sync,no_root_squash,no_subtree_check)
/srv kubernetes_node_2_IP(rw,sync,no_root_squash,no_subtree_check)
... and so on.
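For completeness, a claim that uses the storage class created by the chart might look like this (the class is typically named nfs-client by default; the claim name and size are illustrative):
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: wordpress-uploads
spec:
  storageClassName: nfs-client   # created by the nfs-client-provisioner chart
  accessModes: ["ReadWriteMany"]
  resources:
    requests:
      storage: 20Gi
```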
I am using the PVCs for some PHP and Laravel applications; they seem to work well without any considerable delays. Although you will have to check against your specific requirements. HTH.