Kubernetes: How to find installed storage provisioners

With Kubernetes, one can define storage classes with provisioners. How does one find out which provisioners are installed and available in the cluster?
Inspecting the storage classes will reveal which provisioners are already in use, but not whether there are more available.

A provisioner does not necessarily need to run in the cluster. For example, the provisioner for an external storage appliance just connects to the cluster API server and watches for new persistent volume requests created with a storage class bound to its provisioner name. This is why, as of Kubernetes 1.7, there is no intended universal way to see whether a storage class's provisioner is actually available.
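What can be listed is the set of provisioners that existing storage classes already reference. The following is a minimal sketch, assuming the JSON comes from `kubectl get storageclass -o json`; the sample data and the function name `provisioners_in_use` are made up for illustration, and the result says nothing about whether a provisioner is actually running.

```python
import json
from collections import defaultdict

# Trimmed, made-up sample shaped like `kubectl get storageclass -o json` output.
SAMPLE = """
{
  "kind": "List",
  "items": [
    {"metadata": {"name": "standard"}, "provisioner": "kubernetes.io/gce-pd"},
    {"metadata": {"name": "fast"}, "provisioner": "kubernetes.io/gce-pd"},
    {"metadata": {"name": "nfs"}, "provisioner": "example.com/nfs"}
  ]
}
"""


def provisioners_in_use(storage_class_list):
    """Map each provisioner name to the sorted storage classes that reference it."""
    by_provisioner = defaultdict(list)
    for item in storage_class_list.get("items", []):
        by_provisioner[item["provisioner"]].append(item["metadata"]["name"])
    return {name: sorted(classes) for name, classes in by_provisioner.items()}


if __name__ == "__main__":
    result = provisioners_in_use(json.loads(SAMPLE))
    for provisioner, classes in sorted(result.items()):
        print(f"{provisioner}: {', '.join(classes)}")
```

In practice the sample string would be replaced by the real command output, for instance piped in on standard input.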

Related

How to make the Kubernetes volumes cloud agnostic?

Currently, my Kubernetes cluster is provisioned via GKE.
I use GCE Persistent Disks to persist my data.
In GCE, persistent storage is provided via GCE Persistent Disks. Kubernetes supports adding them to Pods, PersistentVolumes, or StorageClasses via the gcePersistentDisk volume/provisioner type.
What if I would like to transfer my cluster from Google to, let's say, Azure or AWS?
Then I would have to change the volume type to azureFile or awsElasticBlockStore respectively in every occurrence in the manifest files.
I hoped CSI drivers would solve that problem. Unfortunately, they also use a different volume type for each cloud provider, for example pd.csi.storage.gke.io for GCP or disk.csi.azure.com for Azure.
Is there any convenient way to make Kubernetes volumes cloud agnostic, so that I wouldn't have to change the manifest files before migrating the K8s cluster?
You cannot have cloud-agnostic storage by using the CSI drivers or the native volume claims in Kubernetes. These APIs are the upstream way of provisioning storage, and each cloud provider has to integrate with them to translate requests into its cloud-specific API (PD for Google, EBS for AWS, and so on).
The exception is self-managed storage that you access through an NFS driver or one of the drivers mentioned above. Even then, the self-managed storage itself will sit on a cloud-provider-specific volume, so you only shift the problem to a different place.
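To make concrete where the provider-specific piece sits, here is a minimal sketch that builds the manifests as plain Python dicts. The class name `app-storage` and the helper names are hypothetical; the GKE and Azure provisioner names come from the question, and the AWS EBS CSI name is added for illustration. A claim refers to a storage class only by name, so the provisioner string lives in the StorageClass object, which still has to be defined once per cloud. This does not make the underlying disks or their data portable.

```python
import json

# Provisioner per cloud. The GKE and Azure names are taken from the question;
# the AWS EBS CSI driver name is added here for illustration.
PROVISIONERS = {
    "gcp": "pd.csi.storage.gke.io",
    "azure": "disk.csi.azure.com",
    "aws": "ebs.csi.aws.com",
}


def storage_class(cloud, name="app-storage"):
    """Provider-specific object: one definition per cloud, same (hypothetical) name."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": PROVISIONERS[cloud],
    }


def claim(name, size="10Gi", storage_class_name="app-storage"):
    """The claim names only the storage class, not the provisioner behind it."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class_name,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    print(json.dumps(storage_class("azure"), indent=2))
    print(json.dumps(claim("data"), indent=2))
```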

Dynamic storages in Kubernetes

I have an application running on Kubernetes that needs to access SMB shares that are configured dynamically (host, credentials, etc) within said application. I am struggling to achieve this (cleanly) with Kubernetes.
I am facing several difficulties:
I do not want "a" storage, I want explicitly specified SMB shares
These shares are dynamically defined within the application and not known beforehand
I have a variable amount of shares and a single pod needs to be able to access all of them
We currently have a solution where, on each Kubernetes worker node, all shares are mounted to mountpoints in a common folder. This folder is then given as a HostPath volume to the containers that need access to those storages. Finally, each of those containers has logic to access the subfolder(s) matching the storage(s) it needs.
The downside, and the reason why I'm looking for a cleaner alternative, is:
HostPath volumes present security risks
For this solution, I need something outside Kubernetes that mounts the SMB shares automatically on each Kubernetes node
Is there a better solution that I am missing?
The Kubernetes object that seems to match this approach the most closely is the Projected Volume, since it "maps existing volume sources into the same directory". However, it doesn't support the type of volume source I need and I don't think it is possible to add/remove volume sources dynamically without restarting the pods that use this Projected Volume.
Your current solution using HostPath on the nodes is neither flexible nor secure, so it is not a good practice.
I think you should consider using one of the custom drivers for your SMB shares:
CIFS FlexVolume Plugin - older solution, not maintained
SMB CSI Driver - actively developed (recommended)
CIFS FlexVolume Plugin:
This solution is older and has been replaced by the CSI driver. Its advantage compared to CSI is that you can specify SMB shares directly in the pod definition (including credentials as a Kubernetes secret), which is what you prefer.
Here you can find instructions on how to install this plugin on your cluster.
SMB CSI Driver:
This driver automatically takes care of mounting SMB shares on all nodes by using a DaemonSet.
You can install the SMB CSI Driver either with a bash script or with a Helm chart.
Assuming you have your SMB server ready, you can use one of the following solutions to access it from your pod:
Storage class
PV/PVC
In both cases you have to use a previously created secret with the credentials.
In your case, for every SMB share you should create a Storage class / PV and mount it to the pod.
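A minimal sketch of the PV/PVC variant, generating one statically bound pair per share as plain Python dicts. The helper name `smb_volume`, the `pv-smb-`/`pvc-smb-` naming scheme, and the example server and secret names are hypothetical. The driver name `smb.csi.k8s.io` and the `source` and `nodeStageSecretRef` fields follow the SMB CSI driver's published examples as far as I recall, so check them against the driver version you install.

```python
import json


def smb_volume(share_id, source, secret_name, secret_namespace="default", size="100Gi"):
    """Build a PersistentVolume and a matching claim for one SMB share."""
    pv_name = f"pv-smb-{share_id}"
    pv = {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": pv_name},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteMany"],
            "persistentVolumeReclaimPolicy": "Retain",
            "storageClassName": "",  # empty: static binding, no dynamic provisioning
            "csi": {
                "driver": "smb.csi.k8s.io",
                "volumeHandle": pv_name,  # must be unique per volume
                "volumeAttributes": {"source": source},
                "nodeStageSecretRef": {
                    "name": secret_name,
                    "namespace": secret_namespace,
                },
            },
        },
    }
    pvc = {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": f"pvc-smb-{share_id}", "namespace": secret_namespace},
        "spec": {
            "accessModes": ["ReadWriteMany"],
            "storageClassName": "",
            "volumeName": pv_name,  # bind to exactly this volume
            "resources": {"requests": {"storage": size}},
        },
    }
    return pv, pvc


if __name__ == "__main__":
    # Hypothetical share and secret names.
    for manifest in smb_volume("media", "//smb.example.com/media", "smbcreds"):
        print(json.dumps(manifest, indent=2))
```

Each pair still has to be referenced in the pod spec, so adding a share means changing and restarting the pod, which is the limitation the question already points out.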
The advantage of the CSI driver is that it is the newer, actively maintained solution that replaced FlexVolume.
[Figure: how a CSI plugin operates]
Also check:
Kubernetes volume plugins evolution from FlexVolume to CSI
Introducing Container Storage Interface (CSI) Alpha for Kubernetes

Is Kubernetes local/csi PV content synced into a new node?

According to the documentation:
A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned ... It is a resource in the cluster just like a node is a cluster resource...
So I was reading about all currently available plugins for PVs, and I understand that for 3rd-party / out-of-cluster storage this doesn't matter (e.g. storing data in EBS, Azure or GCE disks) because adding or removing nodes from a cluster has few or no implications. However, there are different ones such as (ignoring hostPath as that works only for single-node clusters):
csi
local
which (at least from what I've read in the docs) don't require 3rd-party vendors/software.
But also:
... local volumes are subject to the availability of the underlying node and are not suitable for all applications. If a node becomes unhealthy, then the local volume becomes inaccessible by the pod. The pod using this volume is unable to run. Applications using local volumes must be able to tolerate this reduced availability, as well as potential data loss, depending on the durability characteristics of the underlying disk.
The local PersistentVolume requires manual cleanup and deletion by the user if the external static provisioner is not used to manage the volume lifecycle.
Use-case
Let's say I have a single-node cluster with a single local PV and I want to add a new node to the cluster, so I have 2-node cluster (small numbers for simplicity).
Will the data from an already existing local PV be 1:1 replicated into the new node as in having one PV with 2 nodes of redundancy or is it strictly bound to the existing node only?
If the already existing PV can't be adjusted from 1 to 2 nodes, can a new PV (created from scratch) be created so it's 1:1 replicated between 2+ nodes on the cluster?
Alternatively if not, what would be the correct approach without using a 3rd-party out-of-cluster solution? Will using csi cause any change to the overall approach or is it the same with redundancy, just different "engine" under the hood?
Can a new PV be created so it's 1:1 replicated between 2+ nodes on the cluster?
None of the standard volume types are replicated at all. If you can use a volume type that supports ReadWriteMany access (most readily NFS) then multiple pods can use it simultaneously, but you would have to run the matching NFS server.
Of the volume types you reference:
hostPath is a directory on the node the pod happens to be running on. It's not a directory on any specific node, so if the pod gets recreated on a different node, it will refer to the same directory but on the new node, presumably with different content. Aside from basic test scenarios I'm not sure when a hostPath PersistentVolume would be useful.
local is a directory on a specific node, or at least following a node-affinity constraint. Kubernetes knows that not all storage can be mounted on every node, so this automatically constrains the pod to run on the node that has the directory (assuming the node still exists).
csi is an extremely generic extension mechanism, so that you can run storage drivers that aren't on the list you link to. There are some features that might be better supported by the CSI version of a storage backend than the in-tree version. (I'm familiar with AWS: the EBS CSI driver supports snapshots and resizing; the EFS CSI driver can dynamically provision NFS directories.)
In the specific case of a local test cluster (say, using kind) using a local volume will constrain pods to run on the node that has the data, which is more robust than using a hostPath volume. It won't replicate the data, though, so if the node with the data is deleted, the data goes away with it.
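A minimal sketch of what such a local volume looks like, built as a plain Python dict; the helper name `local_pv`, the class name `local-storage`, and the example path and node name are hypothetical. The node-affinity block is what pins the volume, and therefore any pod using it, to a single node. Nothing in the object describes a second copy of the data.

```python
import json


def local_pv(name, path, node_name, size="10Gi", storage_class_name="local-storage"):
    """Build a `local` PersistentVolume tied to one node by hostname."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},
            "volumeMode": "Filesystem",
            "accessModes": ["ReadWriteOnce"],
            "persistentVolumeReclaimPolicy": "Retain",
            "storageClassName": storage_class_name,
            "local": {"path": path},  # directory or disk on that node
            "nodeAffinity": {
                "required": {
                    "nodeSelectorTerms": [
                        {
                            "matchExpressions": [
                                {
                                    "key": "kubernetes.io/hostname",
                                    "operator": "In",
                                    "values": [node_name],
                                }
                            ]
                        }
                    ]
                }
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical path and node name.
    print(json.dumps(local_pv("data-node-1", "/mnt/disks/ssd1", "node-1"), indent=2))
```

Adding a second node to the cluster does not change this object: the volume stays on `node-1`, and a volume for the new node would be a separate PV with its own, initially empty, directory.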

how to use dynamic persistent volume provisioning for multitenancy environment

I developed a web application for our students and I would now like to run it in a Kubernetes container environment. Every user (who could be seen as a tenant) gets their own application environment (1:1 relation).
The application environment consists of 2 pods (1x webserver, 1x database), defined by a deployment and a service.
I am using Kubernetes v1.17.2 and I would like to use dynamic PersistentVolumeClaims while keeping the data of a specific user (tenant) between the deletion and re-creation of a pod (e.g. when updating to a new application version or after a hardware reboot).
I thought about using an environment variable at pod creation (e.g. user-1, user-2, user-x,...) and using this information to allow reuse of a dynamically created PersistentVolume.
Is there any best practice or concept for how this can be achieved?
best regards
shane
The outcome you want depends heavily on the platform you are running on.
It will differ between Kubernetes clusters provisioned in the cloud (for example GKE) and clusters on premises (for example those built with kubeadm or kubespray).
To retain user data, please refer to the official documentation: Kubernetes.io: Persistent volumes reclaiming. It shows how to retain the data behind a PVC.
Be aware that the local static provisioner does not support dynamic provisioning:
The local volume static provisioner manages the PersistentVolume lifecycle for pre-allocated disks by detecting and creating PVs for each local disk on the host, and cleaning up the disks when released. It does not support dynamic provisioning.
Github.com: Storage local static provisioner
In contrast, VMware vSphere supports dynamic provisioning. If you are using it, please refer to this documentation.
Your question does not explain what the users are in your environment. Do they exist inside your application or outside it? Does the application authenticate users? One solution would be to represent users in Kubernetes as service accounts and limit each one's view to a namespace created specifically for it.
For service account creation please refer to: Kubernetes.io: Configure service account.
Additionally, you could also look at StatefulSets.
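A minimal sketch of the retain idea, built as plain Python dicts. The names `tenant-retain`, `data-<user>`, and `tenant-<user>` are hypothetical, and the provisioner is a placeholder that depends on your platform. The point is that a claim with a stable, per-user name outlives the pods that mount it, so a re-created pod that references the same claim name sees the same data; the `Retain` reclaim policy additionally keeps the volume if the claim itself is deleted.

```python
import json


def retaining_storage_class(provisioner, name="tenant-retain"):
    """Storage class whose dynamically created volumes are kept after claim deletion."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": provisioner,  # platform-specific, supplied by the caller
        "reclaimPolicy": "Retain",
    }


def tenant_claim(user, size="5Gi", storage_class_name="tenant-retain"):
    """Claim with a deterministic per-user name in a per-user namespace."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {
            "name": f"data-{user}",
            "namespace": f"tenant-{user}",
            "labels": {"tenant": user},
        },
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class_name,
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # "example.com/placeholder" stands in for whatever provisioner your cluster has.
    print(json.dumps(retaining_storage_class("example.com/placeholder"), indent=2))
    print(json.dumps(tenant_claim("user-1"), indent=2))
```

The database pod for `user-1` would then reference the claim `data-user-1` in its volumes section; deleting and re-creating the pod or deployment leaves the claim, and so the data, in place.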

Share large block device among multiple Kubernetes pods via NFS keeping exports isolated per namespace

I host multiple projects on a Kubernetes cluster. Disk usage for media files is growing fast. My hosting provider allows me to create large block storage spaces, but these spaces can only be attached to a node (VPS) as a block device. For now I don’t consider switching to an object storage.
I want to use a cheap small VPS with a large block device attached to it as a NFS server for several projects (pods).
I've read some tutorials about using NFS as persistent volumes. The approaches are:
External NFS service. What about security? How to expose an export to one and only one pod inside the cluster?
i.e., on the NFS server machine:
/share/
  project1/
  project2/
  ...
  projectN/
where each /share/project{i} must be available only to pods in the project{i} namespace.
Multiple dockerized NFS services, using the affinity value to attach the nfs services to nfs server node.
I don't know whether having many NFS server pods on the same node is a good practice.
Maybe there are other approaches I'm not aware of. What's the best Kubernetes approach for this use case?
There is no single answer to your questions.
It depends on your solution (architecture), requirements, security, and many other factors.
External NFS service. What about security?
In this case all the considerations are on your side (my advice is to choose a solution supported by your cloud provider); please refer to Considerations when choosing a right solution.
As one example, please read about security in NFS Volume Security. In your case the administrator is fully responsible for sharing volumes and providing appropriate security settings.
As for the second question:
You can use PVs, PVCs, namespaces, and storage classes to achieve your goals.
Please refer to pv with nfs server and storage classes.
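A minimal sketch of the PV/PVC-per-namespace idea, built as plain Python dicts. The helper name, the claim name `media`, and the server address are hypothetical. Each project gets one PV pointing at its own export directory, and the `claimRef` field reserves that PV for a single claim in the project's namespace. This controls binding inside Kubernetes only; the export rules on the NFS server itself remain the administrator's responsibility.

```python
import json


def project_nfs_volume(project, server, base_path="/share", size="50Gi", claim_name="media"):
    """Build a PV for /share/<project> reserved for one claim in namespace <project>."""
    pv_name = f"nfs-{project}"
    pv = {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": pv_name},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteMany"],
            "persistentVolumeReclaimPolicy": "Retain",
            "storageClassName": "",
            # Reserve this volume for one specific claim in one namespace.
            "claimRef": {"namespace": project, "name": claim_name},
            "nfs": {"server": server, "path": f"{base_path}/{project}"},
        },
    }
    pvc = {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": claim_name, "namespace": project},
        "spec": {
            "accessModes": ["ReadWriteMany"],
            "storageClassName": "",
            "volumeName": pv_name,
            "resources": {"requests": {"storage": size}},
        },
    }
    return pv, pvc


if __name__ == "__main__":
    # Hypothetical NFS server address.
    for project in ("project1", "project2"):
        for manifest in project_nfs_volume(project, "10.0.0.5"):
            print(json.dumps(manifest, indent=2))
```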
Note:
For example, NFS doesn’t provide an internal provisioner, but an external provisioner can be used. Some external provisioners are listed under the repository kubernetes-incubator/external-storage. There are also cases when 3rd party storage vendors provide their own external provisioner.
For affinity rules, please also refer to Allowed Topologies if the topology of provisioned volumes needs to be restricted to specific zones.
Additional resources:
Kubernetes NFS-Client Provisioner
NFS Server Provisioner
Hope this helps.