How to use dynamic persistent volume provisioning in a multi-tenancy environment - Kubernetes

I developed a web application for our students and I would like to run it in a Kubernetes container environment. Every user (who could be seen as a tenant) gets their own application environment (1:1 relation).
The application environment consists of 2 pods (1x webserver, 1x database), defined by a deployment and a service.
I am using Kubernetes v1.17.2 and I would like to use dynamic PersistentVolumeClaims together with the possibility to keep the data of a specific user (tenant) between the deletion and re-creation of a pod (e.g. when updating to a new application version or after a hardware reboot).
I thought about setting an environment variable at pod creation (e.g. user-1, user-2, user-x, ...) and using this information to allow reuse of a dynamically created PersistentVolume.
Is there any best practice or concept for how this can be achieved?
Best regards
Shane

The outcome that you wish to achieve will be strongly connected to the solution that you are currently using.
It will differ between Kubernetes instances provisioned in the cloud (for example GKE) and Kubernetes instances on premises (for example kubeadm, kubespray).
Talking about the possibility to retain user data, please refer to the official documentation: Kubernetes.io: Persistent volumes reclaiming. It shows how to retain the data backing a dynamically provisioned PersistentVolume after its claim is deleted.
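For example, this can be set per storage class. A minimal sketch, with a made-up class name and a provisioner that depends entirely on your storage backend:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: retained-storage                   # hypothetical name
provisioner: kubernetes.io/vsphere-volume  # placeholder - use the provisioner of your actual backend
reclaimPolicy: Retain                      # keep the PV and its data when the PVC is deleted
```

An existing PV can be switched the same way with kubectl patch pv <pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'. Note that a Released volume has to be cleaned up or re-bound manually before a new claim can use it.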
Be aware that the local volume static provisioner does not support dynamic provisioning.
The local volume static provisioner manages the PersistentVolume lifecycle for pre-allocated disks by detecting and creating PVs for each local disk on the host, and cleaning up the disks when released. It does not support dynamic provisioning.
Github.com: Storage local static provisioner
In contrast, VMware vSphere supports dynamic provisioning. If you are using this solution, please refer to this documentation.
Your question lacks a specific explanation of the users in your environment. Are they inside your application or outside of it? Is the application authenticating the users? One solution would be to create users inside Kubernetes as service accounts and limit their view to a namespace created specifically for them.
For service account creation please refer to: Kubernetes.io: Configure service account.
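A minimal sketch of that idea, with all names (user-1, tenant-view) made up for illustration: one namespace per tenant, a ServiceAccount inside it, and a RoleBinding that restricts the account to that namespace:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: user-1                       # hypothetical per-tenant namespace
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: user-1
  namespace: user-1
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: tenant-view                  # read-only access within the tenant namespace
  namespace: user-1
rules:
  - apiGroups: ["", "apps"]
    resources: ["pods", "services", "deployments", "persistentvolumeclaims"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tenant-view
  namespace: user-1
subjects:
  - kind: ServiceAccount
    name: user-1
    namespace: user-1
roleRef:
  kind: Role
  name: tenant-view
  apiGroup: rbac.authorization.k8s.io
```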
Additionally, you could also look at StatefulSets.
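A StatefulSet fits your case well because its volumeClaimTemplates dynamically create one PVC per replica, and that PVC (and the PV behind it) survives pod deletion and re-creation, e.g. during an application update or after a node reboot. A rough sketch, with the tenant name, image and storage class chosen purely for illustration (the webserver container is omitted for brevity):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: webapp-user-1                # hypothetical: one StatefulSet per tenant
  namespace: user-1
spec:
  serviceName: webapp-user-1
  replicas: 1
  selector:
    matchLabels:
      app: webapp
      tenant: user-1
  template:
    metadata:
      labels:
        app: webapp
        tenant: user-1
    spec:
      containers:
        - name: database
          image: mariadb:10.4        # placeholder image
          volumeMounts:
            - name: data
              mountPath: /var/lib/mysql
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: standard   # placeholder - any dynamically provisioning class
        resources:
          requests:
            storage: 5Gi
```

The pod webapp-user-1-0 will always re-attach to the claim data-webapp-user-1-0, so the tenant's data is kept across updates and restarts; deleting or re-creating the StatefulSet does not delete its PVCs.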

Related

Dynamic storages in Kubernetes

I have an application running on Kubernetes that needs to access SMB shares that are configured dynamically (host, credentials, etc.) within said application. I am struggling to achieve this (cleanly) with Kubernetes.
I am facing several difficulties:
I do not want "a" storage, I want explicitly specified SMB shares
These shares are dynamically defined within the application and not known beforehand
I have a variable amount of shares and a single pod needs to be able to access all of them
We currently have a solution where, on each Kubernetes worker node, all shares are mounted to mountpoints in a common folder. This folder is then given as a HostPath volume to the containers that need access to those storages. Finally, each of those containers has logic to access the subfolder(s) matching the storage(s) it needs.
The downside, and the reason why I'm looking for a cleaner alternative, is:
HostPath volumes present security risks
For this solution, I need something outside Kubernetes that mounts the SMB shares automatically on each Kubernetes node
Is there a better solution that I am missing?
The Kubernetes object that seems to match this approach the most closely is the Projected Volume, since it "maps existing volume sources into the same directory". However, it doesn't support the type of volume source I need and I don't think it is possible to add/remove volume sources dynamically without restarting the pods that use this Projected Volume.
Indeed, your current solution using HostPath on the nodes is neither flexible nor secure, so it is not a good practice.
I think you should consider using one of the custom drivers for your SMB shares:
CIFS FlexVolume Plugin - older solution, not maintained
SMB CSI Driver - actively developed (recommended)
CIFS FlexVolume Plugin:
This solution is older and has been replaced by the CSI driver. Its advantage compared to CSI is that you can specify SMB shares directly in the pod definition (including the credentials as a Kubernetes Secret), as you prefer.
Here you can find instructions on how to install this plugin on your cluster.
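As an illustration only, a pod using that plugin might look roughly like the sketch below. The driver name and the networkPath/mountOptions keys are taken from the fstab/cifs plugin's README as I recall it, so treat them (and the secret type) as assumptions to verify against the plugin's documentation:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: cifs-secret
type: fstab/cifs                     # secret type the plugin expects (assumption - check its README)
stringData:
  username: shareuser                # placeholder credentials
  password: sharepassword
---
apiVersion: v1
kind: Pod
metadata:
  name: smb-consumer
spec:
  containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]
      volumeMounts:
        - name: share1
          mountPath: /mnt/share1
  volumes:
    - name: share1
      flexVolume:
        driver: "fstab/cifs"         # plugin name (assumption)
        fsType: "cifs"
        secretRef:
          name: cifs-secret
        options:                     # option keys per the plugin's README (assumption)
          networkPath: "//smb-server.example.com/share1"
          mountOptions: "dir_mode=0755,file_mode=0644,noperm"
```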
SMB CSI Driver:
This driver will automatically take care of mounting SMB shares on all nodes by using a DaemonSet.
You can install the SMB CSI Driver either with a bash script or by using a Helm chart.
Assuming you have your SMB server ready, you can use one of the following solutions to access it from your pod:
Storage class
PV/PVC
In both cases you have to use a previously created secret with the credentials.
In your case, for every SMB share you should create a Storage class / PV and mount it to the pod.
The advantage of the CSI Driver is that it is a newer, currently maintained solution and it has replaced FlexVolume.
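A rough sketch of the PV/PVC variant with the SMB CSI driver; the server address, share name and credentials are placeholders, and the exact csi fields should be checked against the csi-driver-smb examples for your driver version:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: smb-creds
  namespace: default
type: Opaque
stringData:
  username: shareuser                        # placeholder credentials
  password: sharepassword
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-smb-share1
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  csi:
    driver: smb.csi.k8s.io
    volumeHandle: smb-server.example.com/share1    # must be unique across the cluster
    volumeAttributes:
      source: //smb-server.example.com/share1      # placeholder SMB server and share
    nodeStageSecretRef:
      name: smb-creds
      namespace: default
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-smb-share1
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  volumeName: pv-smb-share1                  # pin the claim to this specific PV
  resources:
    requests:
      storage: 100Gi
```

The pod then mounts pvc-smb-share1 like any other claim; repeat the Secret/PV/PVC triple for each share.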
Also check:
Kubernetes volume plugins evolution from FlexVolume to CSI
Introducing Container Storage Interface (CSI) Alpha for Kubernetes

Kubernetes cluster Mysql Nodes Storage

We have started setting up a Kubernetes cluster. In production we have 4 MySQL nodes (2 active masters, 2 active slaves). All servers are on premises; there is no cloud provider involved.
Now how do I configure storage? Should I use PVs/PVCs? How will that work? Should I use local PVs? Can someone explain this to me?
You need to use PersistentVolumes and PersistentVolumeClaims in order to achieve that.
A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes.
A PersistentVolumeClaim (PVC) is a request for storage by a user. Claims can request specific size and access modes (e.g., they can be mounted once read/write or many times read-only).
Containers are ephemeral: when a container is restarted, all changes made inside it are lost. Databases, however, expect their data to be persistent, so you need persistent volumes. You have to create a storage claim, and the pod must be configured to mount the claimed storage.
Here you will find a simple guide showing how to deploy MySQL with a PersistentVolume. However, I strongly recommend getting familiar with the official docs that I have linked in order to fully understand the concept and adjust the access mode, class, size, etc. according to your needs.
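Since you are fully on premises, one common pattern is a local PV per MySQL node, served through a no-provisioner StorageClass. A minimal sketch (node name, path and size are placeholders):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner   # no dynamic provisioning - PVs are created by hand
volumeBindingMode: WaitForFirstConsumer     # bind only once the pod is scheduled
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mysql-pv-node1
spec:
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/mysql                  # pre-created directory or mounted disk on the node
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - worker-node-1             # hypothetical node name
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-data
spec:
  storageClassName: local-storage
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
```

The MySQL pod then mounts the mysql-data claim at /var/lib/mysql; WaitForFirstConsumer makes sure the pod lands on the node that actually owns the disk.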
Please let me know if that helped.

Share large block device among multiple Kubernetes pods via NFS keeping exports isolated per namespace

I host multiple projects on a Kubernetes cluster. Disk usage for media files is growing fast. My hosting provider allows me to create large block storage spaces, but these spaces can only be attached to a node (VPS) as a block device. For now I am not considering switching to object storage.
I want to use a cheap small VPS with a large block device attached to it as a NFS server for several projects (pods).
I've read some tutorials about using NFS as persistent volumes. The approaches are:
External NFS service. What about security? How to expose an export to one and only one pod inside the cluster?
i.e., on the NFS server machine:
/share/
project1/
project2/
...
projectN/
Where each /share/project{i} must be only available to pods in project{i} namespace.
Multiple dockerized NFS services, using the affinity value to attach the NFS services to the NFS server node.
I don't know if it's a good practice having many NFS server pods on the same node.
Maybe there are other approaches I'm not aware of. What's the best Kubernetes approach for this use case?
There is no single answer to your questions.
It depends on your solution (architecture), requirements, security and many other factors.
External NFS service. What about security?
In this case all the considerations are on your side (my advice is to choose a solution supported by your cloud provider); please refer to Considerations when choosing the right solution.
As one example, please read about NFS Volume Security. In your case, all the responsibility for sharing volumes and providing appropriate security settings lies with the administrator.
Regarding the second question:
You can use PVs, PVCs, namespaces and storage classes to achieve your goals.
Please refer to pv with nfs server and storage classes.
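To keep each export visible to a single namespace only, you can pre-bind one NFS-backed PV per project to a claim in that project's namespace using claimRef. A sketch with placeholder server address, path and size:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-project1
spec:
  capacity:
    storage: 500Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 10.0.0.5              # placeholder: the small VPS acting as NFS server
    path: /share/project1
  claimRef:                       # reserve this PV for one claim in one namespace
    namespace: project1
    name: media
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: media
  namespace: project1
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""            # prevent dynamic provisioning from intercepting the claim
  resources:
    requests:
      storage: 500Gi
```

Because the PV's claimRef names a specific namespace and claim, a PVC from any other namespace cannot bind to it; repeat the pair for project2 ... projectN.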
Note:
For example, NFS doesn't provide an internal provisioner, but an external provisioner can be used. Some external provisioners are listed in the kubernetes-incubator/external-storage repository. There are also cases where third-party storage vendors provide their own external provisioner.
Regarding affinity rules, please also refer to Allowed Topologies in case the topology of provisioned volumes needs to be restricted to specific zones.
Additional resources:
Kubernetes NFS-Client Provisioner
NFS Server Provisioner
Hope this helps.

Is Kubernetes' ETCD exposed for us to use?

We are working on provisioning our service using Kubernetes, and the service needs to register/unregister some data for scaling purposes. Let's say the service handles long-held transactions, so when it starts/scales out, it needs to store the starting and ending transaction IDs somewhere. When it scales out further, it will need to find the next transaction ID and save it with the ending transaction ID that is covered. When it scales in, it needs to delete the transaction IDs, etc. etcd seems to make the cut: it is used by Kubernetes to store deployment data, and it is not only close to Kubernetes, it is actually inside of it and maintained by it; thus we'd like to find out whether it is open for our use. I'd like to ask the question for EKS, AKS, and self-installed clusters alike. Any advice welcome. Thanks.
Do not use the kubernetes etcd directly for an application.
Read/write access to the Kubernetes etcd store amounts to root access on every node in your cluster. Even if you are well versed in etcd v3's role-based security model, avoid sharing that specific etcd instance so you don't increase your cluster's attack surface.
For EKS and GKE, the etcd cluster is hidden inside the managed cluster service so you can't break things. I would assume AKS takes a similar approach, unless they expose to you the instances that run the management nodes.
If the data is small and not heavily updated, you might be able to reuse the kubernetes etcd store via the kubernetes API. Create a ConfigMap or a custom resource definition for your data and edit it via the easily securable and namespaced functionality in the kubernetes API.
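For example (all names are made up), the transaction-ID ranges could live in a ConfigMap that your replicas read and update through the API server, so access is governed by ordinary RBAC instead of direct etcd access:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: txn-ranges                # hypothetical name
  namespace: my-service           # hypothetical namespace
data:
  replica-0: "1000-1999"          # starting-ending transaction IDs covered by this replica
  replica-1: "2000-2999"
```

Updates then go through kubectl patch or your language's Kubernetes client library rather than through etcd itself.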
For most application uses, run your own etcd cluster (or whatever service fits) to keep Kubernetes free to do its workload scheduling. The CoreOS etcd operator will let you define and create new etcd clusters easily.
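If you go the dedicated-etcd route, that operator lets you declare a cluster as a custom resource. A sketch assuming the CoreOS etcd operator is installed; the apiVersion and fields follow its published example and may differ between releases:

```yaml
apiVersion: etcd.database.coreos.com/v1beta2
kind: EtcdCluster
metadata:
  name: app-etcd                  # hypothetical name
spec:
  size: 3                         # three-member etcd cluster for your application data
  version: "3.2.13"               # etcd version taken from the operator's example
```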

Role of openstack running kubernetes

I want to understand the role of OpenStack when Kubernetes is deployed on top of it. Will the user be able to access the underlying OpenStack layer in this case? (I mean to ask whether the user can create instances, networks, and access any other OpenStack resource.) Or will the user only be provided with the Kubernetes offerings? Any link or answer would help.
I can't seem to find this functionality described in any guide.
OpenStack's role in the k8s world is to provide k8s with instances and storage to do its job, just like GCE and Azure.
Kubernetes tries to abstract underlying cloud infrastructure so applications can be ported from one cloud provider to another transparently.
k8s achieves this by defining abstractions like persistent volumes and persistent volume claims, allowing a pod to declare a requirement for storage without needing to state that it requires a Cinder volume directly.
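For example, an application running on top of OpenStack just asks for storage with a claim like the sketch below (the class name is a placeholder); the OpenStack cloud provider integration maps it to a Cinder volume behind the scenes, and the same manifest works unchanged on GCE or Azure with a different class:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data                  # hypothetical claim used by the application pod
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard      # placeholder class backed by Cinder on OpenStack
  resources:
    requests:
      storage: 10Gi
```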
There should be no need to access OpenStack directly from your Kubernetes-based app, unless your app actually needs to manage an OpenStack cluster, in which case you can provide your OpenStack credentials to your app and access the OpenStack API.