How to customize storage path for Kubernetes pods using GlusterFS

On GlusterFS I create a volume and set auth.allow for all Kubernetes nodes.
Then I can use a Kubernetes Endpoints object to consume the GlusterFS volume.
But if I create many RCs or pods using the GlusterFS endpoint, all their data ends up in the same Gluster volume under the root path /.
I know I could create more GlusterFS volumes and Kubernetes endpoints for each RC or pod to use, but I don't think that is best practice.

If you want different data sets, you need to create different volumes. I'm not sure what other answer you're looking for?
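For reference, a rough sketch of the Endpoints-plus-pod wiring described in the question; the names (glusterfs-cluster, app-volume) and the IP are placeholders, not values from the question. Each distinct data set still maps to its own Gluster volume, referenced via the path field:

apiVersion: v1
kind: Endpoints
metadata:
  name: glusterfs-cluster          # hypothetical name; must match the pod's endpoints field
subsets:
  - addresses:
      - ip: 10.0.0.11              # IP of one Gluster node (placeholder)
    ports:
      - port: 1
---
apiVersion: v1
kind: Pod
metadata:
  name: gluster-client
spec:
  containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]
      volumeMounts:
        - name: glusterfsvol
          mountPath: /mnt/glusterfs
  volumes:
    - name: glusterfsvol
      glusterfs:
        endpoints: glusterfs-cluster
        path: app-volume           # the Gluster volume name; a different data set needs a different volume here
        readOnly: false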

Related

Can Kubernetes volumes be used for Deployments? If so, what happens if each pod is on a different host?

Can we use Kubernetes volumes for Deployments? If yes, does that mean multiple pods sharing the same volume?
If that is possible, then what happens when the pods of the Deployment are on different host machines?
Especially when using Amazon EBS, where an EBS volume cannot be shared across multiple hosts.
Yes, you can use a persistent volume for Deployments.
Such a volume will be mounted at your desired location in all the pods.
If you use EBS block storage, all your pods will need to be scheduled on the same node where you have attached your volume. This may not work if you have many replicas.
You will have to use network file storage, such as EFS, GlusterFS, Portworx, etc., with ReadWriteMany if you want your pods to be spun up on different nodes.
EBS will give you the best performance, with the aforementioned single-node limitation.
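As an illustration, a minimal sketch of a ReadWriteMany claim mounted by a Deployment; the storage class name efs-sc and the other names are assumptions, and any RWX-capable backend (EFS, NFS, GlusterFS, Portworx) works the same way:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data
spec:
  accessModes:
    - ReadWriteMany            # lets pods on different nodes mount the same volume
  storageClassName: efs-sc     # hypothetical RWX-capable storage class
  resources:
    requests:
      storage: 5Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx
          volumeMounts:
            - name: data
              mountPath: /usr/share/nginx/html
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: shared-data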

Kubernetes: hostPath Static Storage with PV vs hard coded hostPath in Pod Volume

I'm learning Kubernetes and there is something I don't get well.
There are 3 ways of setting up static storage:
Pods with volumes that you attach the storage to directly
Pods with a PVC attached to its volume
StatefulSets with also PVC inside
I can understand the power of PVC when working together with StorageClass, but not when working with static storage and local storage like hostPath
To me, it sounds very similar:
In the first case I have a volume directly attached to a pod.
In the second case I have a volume statically attached to a PVC, which is also manually attached to a Pod. In the end, the volume will be statically attached to the Pod.
In both cases, the data will remain when the Pod terminates and will be adopted by the next Pod with the corresponding definition, right?
The only benefit I see of using PVCs over plain Pod volumes is that you can define the access mode. Apart from that, is there a difference when working with hostPath?
On the other hand, the advantage of using a StatefulSet instead of a PVC is (if I understood properly) that it gets a headless service, and that the rollout and rollback mechanisms work differently. Is that the point?
Thank you in advance!
Extracted from this blog:
The biggest difference is that the Kubernetes scheduler understands
which node a Local Persistent Volume belongs to. With HostPath
volumes, a pod referencing a HostPath volume may be moved by the
scheduler to a different node resulting in data loss. But with Local
Persistent Volumes, the Kubernetes scheduler ensures that a pod using
a Local Persistent Volume is always scheduled to the same node.
Using hostPath does not guarantee that a pod will restart on the same node. So your pod can attach /tmp/storage on k8s-node-1, and then, if you delete and re-create the pod, it may attach /tmp/storage on k8s-node-[2-n].
On the contrary, if you use PVC/PV with the local persistent storage class, then if you delete and re-create a pod, it will stick to the node that holds the local persistent storage.
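For comparison, a rough sketch of a local PersistentVolume pinned to a node (the node name k8s-node-1, the path and the storage class name are placeholders); the nodeAffinity block is what lets the scheduler keep the pod on that node, which a plain hostPath volume does not have:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage      # a StorageClass with volumeBindingMode: WaitForFirstConsumer
  local:
    path: /mnt/disks/vol1              # directory or disk that already exists on the node
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - k8s-node-1           # the node that owns this storage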
A StatefulSet creates pods and has a volumeClaimTemplates field, which creates a dedicated PVC for each pod. So each pod created by the StatefulSet will have its own dedicated storage, linked via Pod -> PVC -> PV -> storage. So StatefulSets also use the PVC/PV mechanism.
More details are available here.
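To make the volumeClaimTemplates point concrete, a minimal sketch with illustrative names; each replica gets its own PVC named data-db-0, data-db-1, and so on:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db                # the headless Service the StatefulSet requires
  replicas: 3
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: db
          image: postgres:15
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: data               # yields PVCs data-db-0, data-db-1, data-db-2
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi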

How to share persistent volume of a StatefulSet with another StatefulSet?

I have StatefulSet-1 running with 3 replicas, with each pod writing logs to its own persistent volume, say pv1, pv2, pv3 (achieved using volumeClaimTemplates:).
I have another StatefulSet-2 running with 3 replicas, and I want each pod of StatefulSet-2 to access StatefulSet-1's already created volumes, i.e. pv1, pv2 and pv3, to process the separate logs written by each pod of StatefulSet-1.
So pv1, pv2, pv3 should be used by both StatefulSet-1 and StatefulSet-2, since pv1, pv2, pv3 were created as part of the StatefulSet-1 deployment. Of course their names carry the pod names of StatefulSet-1, which is OK for StatefulSet-2.
How do I configure StatefulSet-2 to achieve the above scenario? Please help!
Thanks & Regards,
Sudhir
This won't work.
1. PVs backed by GCE disks are ReadWriteOnce, so one PVC per pod.
2. You are creating the StatefulSet pods' PVCs using PVC templates, which rely on dynamic volume provisioning to create the appropriate PVs and PVCs.
If you need these pods to share a PVC, your best bet is to use a ReadWriteMany PV, such as one backed by NFS. You will also need to create the pods of StatefulSet-2 manually to have them mount the appropriate PVCs; you could achieve this by creating a single-pod Deployment for each one.
Something else to consider: can you have the containers of each StatefulSet run together in the same pods? Normally this is not recommended, but it would allow them both to share the same volumes (as long as they are not using the same ports).
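If you go the NFS route, a rough sketch of a statically provisioned ReadWriteMany PV and the claim both workloads would mount (the server address, export path and names are placeholders); the pods of StatefulSet-1 and your manually created pods would both reference claimName: logs-0:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: logs-pv-0
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany              # required so pods of both workloads can mount it
  nfs:
    server: 10.0.0.20            # placeholder NFS server
    path: /exports/logs-0
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: logs-0
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""           # bind to the pre-created PV, not a dynamic provisioner
  volumeName: logs-pv-0
  resources:
    requests:
      storage: 10Gi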

How does Kubernetes know on which node to schedule its POD when PVs are backed by logical volumes?

If I implement a CSI driver that creates logical volumes via the lvcreate command and gives those volumes to Kubernetes to make PVs from, how will Kubernetes know the volume/node association so that it can schedule a Pod which uses such a PV on the node where my newly created logical volume resides? Does it just automagically happen?
The Kubernetes scheduler can be influenced using volume topology.
Here is the design proposal, which walks through the whole mechanism:
Allow topology to be specified for both pre-provisioned and dynamic provisioned PersistentVolumes so that the Kubernetes scheduler can correctly place a Pod using such a volume to an appropriate node.
Volume Topology-aware Scheduling
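It does not happen automatically: the driver has to report topology and the volumes have to carry it. For dynamic provisioning, a rough sketch of a StorageClass that delays binding until the scheduler has chosen a node (the driver name lvm.csi.example.com is hypothetical); for pre-provisioned PVs you would instead set spec.nodeAffinity on the PV, as in the local-volume case:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: lvm-local
provisioner: lvm.csi.example.com         # hypothetical CSI driver that wraps lvcreate
volumeBindingMode: WaitForFirstConsumer  # provision only after the scheduler has picked a node;
                                         # the driver then creates the LV on that node and the
                                         # resulting PV gets node affinity from the reported topology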

Where can I find a Kubernetes PV on the host filesystem?

I am trying to understand how Kubernetes handles persistent volumes on the node's filesystem.
For example, if I have minikube as my Kubernetes cluster node, and I create multiple PVs with PVCs for my pods, where can I find the PVs on minikube's filesystem after I ssh into it?
If I type
lsblk
I get
sda 8:0 0 19.5G 0 disk
but no PV disks are listed.
Thank you for your answers.
You will not see it there, because a PersistentVolume is an API object inside the Kubernetes API; it is not necessarily a separate block device on the node.
I recommend reading Kubernetes documentation regarding Persistent Volumes.
A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned by an administrator. It is a resource in the cluster just like a node is a cluster resource. PVs are volume plugins like Volumes, but have a lifecycle independent of any individual pod that uses the PV. This API object captures the details of the implementation of the storage, be that NFS, iSCSI, or a cloud-provider-specific storage system.
A PersistentVolumeClaim (PVC) is a request for storage by a user. It is similar to a pod. Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and Memory). Claims can request specific size and access modes (e.g., can be mounted once read/write or many times read-only).
While PersistentVolumeClaims allow a user to consume abstract storage resources, it is common that users need PersistentVolumes with varying properties, such as performance, for different problems. Cluster administrators need to be able to offer a variety of PersistentVolumes that differ in more ways than just size and access modes, without exposing users to the details of how those volumes are implemented. For these needs there is the StorageClass resource.
Please see the detailed walkthrough with working examples.
You can also have a look at the Kubernetes Volumes Guide which explains the types of storage, how long do they last and how to use them in examples.
Because they are hostPath volumes, you will not see them in lsblk. Use "kubectl describe pv PV_NAME" to see where they are located.
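For example, on minikube (the exact PV name and path will differ on your cluster; /tmp/hostpath-provisioner is just the usual default location of minikube's built-in provisioner, so check the Path that describe prints):

kubectl get pv
kubectl describe pv <pv-name>     # the Source section shows Type: HostPath and the Path on the node
minikube ssh
ls /tmp/hostpath-provisioner/     # then list that path inside the minikube VM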