Mounting GCE Persistent Disk on my local machine - kubernetes

I am trying to mount a GCE persistent disk that was created by a Kubernetes PersistentVolumeClaim resource (on GKE) to my local machine.
I created a PersistentVolumeClaim (that creates a persistent volume in GCE):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: profiler-disk
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
I tried to use gcsfuse to mount the disk as is written in the documentation:
You can use the Google Cloud Storage FUSE tool to mount a Cloud Storage bucket to your Compute Engine instance. The mounted bucket behaves similarly to a persistent disk even though Cloud Storage buckets are object storage.
with the command:
gcsfuse profiler-disk hello
but I am getting:
daemonize.Run: readFromProcess: sub-process: mountWithArgs: mountWithConn: setUpBucket: OpenBucket: Unknown bucket "profiler-disk"
I was able to load an actual bucket, so this is not an authorization/authentication issue.
Does anyone know how to achieve this?

I was able to copy the data using kubectl cp
kubectl cp <pod-name>:/path <local-path> -c <container-name>
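For context, a PersistentVolumeClaim on GKE provisions a GCE persistent disk (block storage), not a Cloud Storage bucket, which is why gcsfuse reports Unknown bucket. If you want to work with the underlying disk directly, a rough sketch is to look up the backing disk and attach it to a Compute Engine VM (a GCE persistent disk cannot be attached to a machine outside GCP); which field holds the disk name depends on whether the in-tree provisioner or the CSI driver created the volume:
# Find the PersistentVolume bound to the claim, then the GCE disk behind it.
PV_NAME=$(kubectl get pvc profiler-disk -o jsonpath='{.spec.volumeName}')
kubectl get pv "$PV_NAME" -o jsonpath='{.spec.gcePersistentDisk.pdName}'   # in-tree provisioner
kubectl get pv "$PV_NAME" -o jsonpath='{.spec.csi.volumeHandle}'           # pd.csi.storage.gke.io

# Inspect the disk and attach it to a Compute Engine VM (zone and VM name are placeholders).
gcloud compute disks describe <pd-name> --zone <zone>
gcloud compute instances attach-disk <vm-name> --disk <pd-name> --zone <zone>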

Related

Initializing a dynamically provisioned shared volume with ReadOnlyMany access mode

My GKE deployment consists of N pods (possibly on different nodes) and a shared volume, which is dynamically provisioned by pd.csi.storage.gke.io and is a Persistent Disk in GCP. I need to initialize this disk with data before the pods go live.
My problem is that I need to set accessModes to ReadOnlyMany and be able to mount it on all pods across different nodes in read-only mode, which I assume effectively makes it impossible to mount it in write mode in the initContainer.
Is there a solution to this issue? The answer to this question suggests a good solution for the case where each pod has its own disk mounted, but I need one disk shared among all pods since my data is quite large.
With GKE 1.21 and later, you can enable the managed Filestore CSI driver in your clusters. You can enable the driver for new clusters
gcloud container clusters create CLUSTER_NAME \
--addons=GcpFilestoreCsiDriver ...
or update existing clusters:
gcloud container clusters update CLUSTER_NAME \
--update-addons=GcpFilestoreCsiDriver=ENABLED
Once you've done that, create a storage class (or have your platform admin do it):
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: filestore-example
provisioner: filestore.csi.storage.gke.io
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
parameters:
  tier: standard
  network: default
After that, you can use PVCs and dynamic provisioning:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: podpvc
spec:
  accessModes:
  - ReadWriteMany
  storageClassName: filestore-example
  resources:
    requests:
      storage: 1Ti
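With the claim bound, every replica can mount the same volume even when the pods land on different nodes, because Filestore supports ReadWriteMany. A minimal sketch of a Deployment using the claim in read-only mode (the name and image are hypothetical):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: reader                     # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: reader
  template:
    metadata:
      labels:
        app: reader
    spec:
      containers:
      - name: app
        image: busybox
        command: ["sh", "-c", "sleep 3600"]
        volumeMounts:
        - name: shared-data
          mountPath: /data
          readOnly: true           # pods only read the pre-initialized data
      volumes:
      - name: shared-data
        persistentVolumeClaim:
          claimName: podpvc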
...I need to have one disk shared among all pods
You can try Filestore. First, create a Filestore instance and save your data on a Filestore file share. Then install the Filestore CSI driver on your cluster. Finally, share the data with the pods that need to read it by creating a PersistentVolume that refers to the Filestore instance and share above.
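Since a Filestore instance exposes its file share over NFS, one hedged way to do the static wiring is a PersistentVolume with an NFS source pointing at the instance; the IP address, share name, and size below are placeholders:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: filestore-data             # hypothetical name
spec:
  capacity:
    storage: 1Ti
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 10.0.0.2               # IP address of the Filestore instance
    path: /vol1                    # name of the Filestore file share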

kubernetes PersistentVolume over google cloud storage bucket

I have created a persistent volume claim where I will store some ml model weights as follows:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: models-storage
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: model-storage-bucket
  resources:
    requests:
      storage: 8Gi
However this configuration will provision a disk on Compute Engine, and it is a bit cumbersome to copy stuff there and to upload/update any data. It would be so much more convenient if I could create a PersistentVolume abstracting a Google Cloud Storage bucket. However, I couldn't find a way to do this anywhere, including the Google documentation. I am baffled because I would expect this to be a very common use case. Does anyone know how I can do that?
I was expecting to find something along the lines of
apiVersion: v1
kind: PersistentVolume
metadata:
  name: test-volume
spec:
  storageBucketPersistentDisk:
    pdName: gs://my-gs-bucket
To mount a Cloud Storage bucket you need to install the Google Cloud Storage FUSE CSI driver (NOT the Persistent Disk or Filestore driver) on your cluster, create the StorageClass, and then provision the bucket-backed storage either dynamically or statically, just as you would with the Persistent Disk or Filestore CSI driver. Check out the link for detailed steps.
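For illustration only, a Pod consuming a bucket through the Cloud Storage FUSE CSI driver might look roughly like the sketch below; the names are hypothetical, and the Kubernetes service account has to be bound (via Workload Identity) to a Google service account that can access the bucket:
apiVersion: v1
kind: Pod
metadata:
  name: gcs-fuse-example              # hypothetical name
  annotations:
    gke-gcsfuse/volumes: "true"       # asks GKE to inject the gcsfuse sidecar
spec:
  serviceAccountName: my-ksa          # hypothetical KSA bound to a GCP SA with bucket access
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "sleep 3600"]
    volumeMounts:
    - name: models
      mountPath: /models
  volumes:
  - name: models
    csi:
      driver: gcsfuse.csi.storage.gke.io
      volumeAttributes:
        bucketName: my-gs-bucket      # bucket name only, without the gs:// prefix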

Azure Kubernetes Persistent Volume Azure Disk

I am using Azure Kubernetes and I have created Persistent Volume, Claims and Storage Class.
I want to deploy pods on the Persistent Volume so we can increase the volume anytime as required. Right now our pods are deployed on the virtual machine's OS disk. Since we are using the default pod deployment on the VM disk, when we run out of disk space the whole cluster has to be destroyed and created again.
Please let me know how I can configure pods to deploy on an Azure (Managed) Disk.
Thanks,
Mrugesh
You don't have to create a PersistentVolume manually; if you want an Azure Disk, it can be created dynamically for you.
From Azure Built-in storage classes:
The default storage class provisions a standard SSD Azure disk.
Standard storage is backed by Standard SSDs and delivers cost-effective storage while still delivering reliable performance.
You only have to create the PersistentVolumeClaim with the storage class you want to use, e.g.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: azure-managed-disk
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: default
  resources:
    requests:
      storage: 5Gi
and then refer to that PVC in your Deployment or Pods.
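For example, a minimal Pod referencing that claim could look like this (the Pod name and image are placeholders):
apiVersion: v1
kind: Pod
metadata:
  name: app                         # hypothetical name
spec:
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - name: data
      mountPath: /mnt/azure         # the disk is mounted here inside the container
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: azure-managed-disk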

Kubernetes pod scheduling with replication and local persistent volumes?

Hi, I have this problem in Kubernetes. I install a deployment with Helm that consists of 2 pods. I have to put them on two nodes, but I want the persistent volumes used by the pods to be on the same nodes as the pods are deployed. Is this feasible in Helm? Thanks
I think you can use Local Persistent Volume
See more: local-pv, local-pv-comparision.
Usage of Local Persistent Volumes:
The local volumes must still first be set up and mounted on the local node by an administrator. The administrator needs to mount the local volume into a configurable "discovery directory" that the local volume manager recognizes. Directories on a shared file system are supported, but they must be bind-mounted into the discovery directory.
This local volume manager monitors the discovery directory, looking for any new mount points. The manager creates a PersistentVolume object with the appropriate storageClassName, path, nodeAffinity, and capacity for any new mount point that it detects. These PersistentVolume objects can eventually be claimed by PersistentVolumeClaims, and then mounted in Pods.
After a Pod is done using the volume and deletes the PersistentVolumeClaim for it, the local volume manager cleans up the local mount by deleting all files from it, then deleting the PersistentVolume object. This triggers the discovery cycle: a new PersistentVolume is created for the volume and can be reused by a new PersistentVolumeClaim.
Local volume can be requested in exactly the same way as any other PersistentVolume type: through a PersistentVolumeClaim. Just specify the appropriate StorageClassName for local volumes in the PersistentVolumeClaim object, and the system takes care of the rest!
In your case I would manually create the StorageClass (together with the PersistentVolume below) and use it during chart installation.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-local-pv
spec:
  capacity:
    storage: 50Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: example-local-storage
  local:
    path: /mnt/disks/v1
  nodeAffinity:                        # required for local volumes: pins the PV to the node holding the disk
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - node-1                     # placeholder: the node where /mnt/disks/v1 exists
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: example-local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer  # delays binding until a pod is scheduled, so scheduling respects the PV's node
and override the default PVC storageClassName configuration during helm install like this:
$ helm install --name chart-name --set persistence.storageClass=example-local-storage
Take a look: using-local-pv, pod-local-pv, kubernetes-1.19-lv.

Kubernetes PVC with ReadWriteMany on AWS

I want to set up a PVC on AWS, where I need ReadWriteMany as access mode. Unfortunately, EBS only supports ReadWriteOnce.
How could I solve this?
I have seen that there is a beta provider for AWS EFS which supports ReadWriteMany, but as said, this is still beta, and its installation looks somewhat flaky.
I could use node affinity to force all pods that rely on the EBS volume to a single node, and stay with ReadWriteOnce, but this limits scalability.
Are there any other ways of how to solve this? Basically, what I need is a way to store data in a persistent way to share it across pods that are independent of each other.
Using EFS without automatic provisioning
The EFS provisioner may be beta, but EFS itself is not. Since EFS volumes can be mounted via NFS, you can simply create a PersistentVolume with an NFS volume source manually -- assuming that automatic provisioning is not a hard requirement on your side:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-efs-volume
spec:
  capacity:
    storage: 100Gi # Doesn't really matter, as EFS does not enforce it anyway
  volumeMode: Filesystem
  accessModes:
  - ReadWriteMany
  mountOptions:
  - hard
  - nfsvers=4.1
  - rsize=1048576
  - wsize=1048576
  - timeo=600
  - retrans=2
  nfs:
    path: /
    server: fs-XXXXXXXX.efs.eu-central-1.amazonaws.com
You can then claim this volume using a PersistentVolumeClaim and use it in a Pod (or multiple Pods) as usual.
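For completeness, a claim that binds to the PersistentVolume above could look roughly like this; the claim name is hypothetical, and the empty storageClassName keeps dynamic provisioning from kicking in:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-efs-claim                # hypothetical name
spec:
  accessModes:
  - ReadWriteMany
  storageClassName: ""              # explicitly disable dynamic provisioning
  volumeName: my-efs-volume         # bind to the PV defined above
  resources:
    requests:
      storage: 100Gi                # must not exceed the PV's declared capacity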
Alternative solutions
If automatic provisioning is a hard requirement for you, there are alternative solutions you might look at: several distributed filesystems can be rolled out on your cluster that offer ReadWriteMany storage on top of Kubernetes and/or AWS. For example, you might take a look at Rook (which is basically a Kubernetes operator for Ceph). It's also officially still in a pre-release phase, but I've already worked with it a bit and it runs reasonably well.
There's also the GlusterFS operator, which already seems to have a few stable releases.
You can use Amazon EFS to create a PersistentVolume with ReadWriteMany access mode.
Amazon EKS announced support for the Amazon EFS CSI Driver on Sep 19, 2019, which makes it simple to configure elastic file storage for both EKS and self-managed Kubernetes clusters running on AWS using standard Kubernetes interfaces.
Applications running in Kubernetes can use EFS file systems to share data between pods in a scale-out group, or with other applications running within or outside of Kubernetes. EFS can also help Kubernetes applications be highly available because all data written to EFS is written to multiple AWS Availability Zones. If a Kubernetes pod is terminated and relaunched, the CSI driver will reconnect the EFS file system, even if the pod is relaunched in a different AWS Availability Zone.
You can deploy the Amazon EFS CSI Driver to an Amazon EKS cluster following the EKS-EFS-CSI user guide, basically like this:
Step 1: Deploy the Amazon EFS CSI Driver
kubectl apply -k "github.com/kubernetes-sigs/aws-efs-csi-driver/deploy/kubernetes/overlays/stable/?ref=master"
Note: This command requires version 1.14 or greater of kubectl.
Step 2: Create an Amazon EFS file system for your Amazon EKS cluster
Step 2.1: Create a security group that allows inbound NFS traffic for your Amazon EFS mount points.
Step 2.2: Add a rule to your security group to allow inbound NFS traffic from your VPC CIDR range.
Step 2.3: Create the Amazon EFS file system configured with the security group you just created.
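If you prefer the CLI, a rough sketch of steps 2.1 to 2.3 with the AWS CLI might look like this; the cluster name, security group name, and creation token are placeholders, and you still have to create mount targets in your subnets with that security group afterwards:
# Look up the cluster's VPC and its CIDR range.
VPC_ID=$(aws eks describe-cluster --name CLUSTER_NAME \
  --query "cluster.resourcesVpcConfig.vpcId" --output text)
CIDR=$(aws ec2 describe-vpcs --vpc-ids "$VPC_ID" \
  --query "Vpcs[].CidrBlock" --output text)

# Steps 2.1 and 2.2: security group allowing inbound NFS (TCP 2049) from the VPC.
SG_ID=$(aws ec2 create-security-group --group-name efs-nfs-sg \
  --description "Allow NFS for EFS" --vpc-id "$VPC_ID" \
  --query "GroupId" --output text)
aws ec2 authorize-security-group-ingress --group-id "$SG_ID" \
  --protocol tcp --port 2049 --cidr "$CIDR"

# Step 2.3: the EFS file system itself.
aws efs create-file-system --creation-token eks-efs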
Now you are good to use EFS with ReadWriteMany access mode in your EKS Kubernetes project with the following sample manifest files:
1. efs-storage-class.yaml: Create the storage class
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
kubectl apply -f efs-storage-class.yaml
2. efs-pv.yaml: Create PersistentVolume
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ftp-efs-pv
spec:
  storageClassName: efs-sc
  persistentVolumeReclaimPolicy: Retain
  capacity:
    storage: 10Gi # Doesn't really matter, as EFS does not enforce it anyway
  volumeMode: Filesystem
  accessModes:
  - ReadWriteMany
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-642da695
Note: you need to replace the volumeHandle value with your Amazon EFS file system ID.
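If you don't have the file system ID at hand, it can be listed with the AWS CLI, for example:
aws efs describe-file-systems --query "FileSystems[*].FileSystemId" --output text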
3. efs-pvc.yaml: Create PersistentVolumeClaim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ftp-pv-claim
  labels:
    app: ftp-storage-claim
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: efs-sc
That should be it. Refer to the aforementioned official user guide for a detailed explanation, where you can also find an example app to verify your setup.
As you mention, an EBS volume with affinity & node selector will stop scalability; however, with EBS only ReadWriteOnce will work.
Sharing my experience: if you are doing many operations on the file system and frequently pushing & fetching files, it can be slow with EFS, which can degrade application performance. The operation rate on EFS is slow.
However, you can use GlusterFS, which in the back will be provisioning EBS volumes. GlusterFS also supports ReadWriteMany and it will be faster compared to EFS, as it's block storage (SSD).