How can I use GCP NFS Filestore on a K8s cluster with TPUs?

I'm using GKE to run K8s workloads and want to add TPU support. From the GCP docs, I "need" to attach a GCS bucket so the Job can read models and store logs. However, we already create shared NFS mounts for our K8s clusters. How hard a requirement is the "need" for GCS to use TPUs? Can shared Filestore NFS mounts work just fine? What about using GCS FUSE?
I'm trying to avoid having the cluster user know about the back-end file system (NFS vs GCS), and just know that the files they provide will be available at "/home/job". Since the linked docs show passing a gs://mybucket/some/path value for file system parameters, I'm not sure a /home/job value will still work. Does the TPU access the filesystem directly, and is it only compatible with GCS? Or do the nodes access the filesystem (preferring GCS) and then share the data (in memory) with the TPUs?
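For concreteness, the kind of shared NFS mount we already use looks roughly like this (server IP, image, and names are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: tpu-job                      # placeholder name
spec:
  containers:
  - name: trainer
    image: gcr.io/my-project/trainer # placeholder image
    volumeMounts:
    - name: shared-home
      mountPath: /home/job           # the only path the cluster user sees
  volumes:
  - name: shared-home
    nfs:
      server: 10.0.0.2               # Filestore instance IP (placeholder)
      path: /shared                  # Filestore export (placeholder)
```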
I'll try it out to learn the hard way (and report back), but curious if others have experience with this already.

Related

What storage to use for passing data between pods?

I am working with Kubernetes and I need to pass parquet files containing datasets between pods, but I don't know which option will work best.
As far as I know, a persistent disk lets me mount a shared volume on my pods, but Cloud Storage would also let me share these files.
The whole process is hosted on Google Cloud.
If you want to persist the data, use Google Filestore, which supports ReadWriteMany.
Persistent Volumes in GKE are backed by Persistent Disks. The problem with these disks is that they only support the ReadWriteOnce (RWO) access mode (the volume can be mounted as read-write by a single node) and the ReadOnlyMany (ROX) access mode (the volume can be mounted read-only by many nodes).
Read more at: https://medium.com/@Sushil_Kumar/readwritemany-persistent-volumes-in-google-kubernetes-engine-a0b93e203180
With a persistent disk it won't be possible to share data between pods on different nodes, as it only supports ReadWriteOnce: a single disk attaches to a single node at a time.
If you are looking at mounting storage like a cloud bucket behind the pod using a CSI driver, your file-write IO will be very slow; Cloud Storage gives much better performance when accessed through its API.
You can also run an NFS server inside Kubernetes, which again gives you ReadWriteMany support.
GlusterFS and MinIO are options too; however, if you are looking for managed NFS, use Google Filestore.
I would say go with a local persistent volume when you need to pass large datasets; it is cost-effective and efficient.
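A minimal local PersistentVolume sketch for that suggestion (node name, capacity, and path are placeholders; note that a local PV only serves pods scheduled onto that node, and it assumes a matching no-provisioner StorageClass):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: dataset-pv                    # placeholder name
spec:
  capacity:
    storage: 500Gi                    # size to fit your parquet datasets
  accessModes:
  - ReadWriteOnce
  storageClassName: local-storage     # assumes a StorageClass with no-provisioner
  local:
    path: /mnt/disks/ssd0             # pre-provisioned path on the node
  nodeAffinity:                       # required for local volumes
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - worker-node-1             # placeholder node name
```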
You should use Google Filestore as a file share. Then you need to:
create a PersistentVolume (PV)
create a PersistentVolumeClaim (PVC)
use the PVC with your pods
More details are in the GKE documentation for Filestore.
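A minimal sketch of those three steps, assuming a Filestore instance already exists (its IP, share name, and all resource names below are placeholders):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: filestore-pv
spec:
  capacity:
    storage: 1Ti                 # Filestore tiers start large
  accessModes:
  - ReadWriteMany                # what Filestore (NFS) gives you over PDs
  nfs:
    server: 10.0.0.2             # Filestore instance IP (placeholder)
    path: /vol1                  # Filestore file share name (placeholder)
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: filestore-pvc
spec:
  accessModes:
  - ReadWriteMany
  storageClassName: ""           # bind to the pre-created PV, not a dynamic class
  resources:
    requests:
      storage: 1Ti
---
apiVersion: v1
kind: Pod
metadata:
  name: consumer
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "ls /data && sleep 3600"]
    volumeMounts:
    - name: shared
      mountPath: /data           # every pod claiming the PVC sees the same files
  volumes:
  - name: shared
    persistentVolumeClaim:
      claimName: filestore-pvc
```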

Is it Appropriate to Store Database in a Kubernetes Persistent Volume (And how to back up?)

I have a web application running on a Google Kubernetes cluster. My web app also uses persistent volumes for multiple MongoDB databases to store user and application data.
(1) Thus I am wondering if it is practical to store all data inside those persistent volumes in the long-run?
(2) Are there any methods for safely backing up the persistent volumes e.g. on a weekly basis (automatically)?
(3) I am also planning to integrate some kind of file upload into the application. Are persistent volumes capable of storing many GB/TB of data, or should I opt for something like Google Cloud Storage in this case?
Deploying stateful apps on K8s is a bit painful, as is well known in the K8s community. Usually, if you need HA for a DB, you deploy it in cluster mode; in K8s that means looking at the StatefulSet concept. Anyway, I'm pasting links for your questions so you can start from there.
(1) Thus I am wondering if it is practical to store all data inside those persistent volumes in the long run?
Running MongoDB on Kubernetes with StatefulSets
(2) Are there any methods for safely backing up the persistent volumes, e.g. on a weekly basis (automatically)?
Persistent Volume Snapshots
Volume Snapshot (beta, from the K8s docs)
You can google even more docs.
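A hedged sketch of what one such snapshot object looks like (the class and PVC names are placeholders; the weekly cadence would come from whatever creates these objects, e.g. a CronJob or an external backup tool):

```yaml
apiVersion: snapshot.storage.k8s.io/v1   # v1beta1 on older clusters
kind: VolumeSnapshot
metadata:
  name: mongo-data-snap-weekly           # placeholder; one object per snapshot
spec:
  volumeSnapshotClassName: my-snapshotclass   # placeholder class backed by the PD CSI driver
  source:
    persistentVolumeClaimName: mongo-data-pvc # placeholder PVC holding the DB files
```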
(3) I am also planning to integrate some kind of file upload into the application. Are persistent volumes capable of storing many GB/TB of data, or should I opt for something like Google Cloud Storage in this case?
I'm not sure a PV is the right place for TBs of data, but definitely, if you are on the cloud anyway, consider using cloud storage for that.
Yes, you can use a PVC in Kubernetes to store the data. However, it depends on your application's use case and size.
In Kubernetes you can deploy MongoDB as a cluster that stores its data inside a PVC. A MongoDB Helm chart is available for HA; you can also look at that.
Helm chart : https://github.com/helm/charts/tree/master/stable/mongodb
The usual suggestion is to run MongoDB as a single pod or a StatefulSet on Kubernetes.
Backup:
For backing up the MongoDB database, you can take a weekly snapshot of the disk storage (PVC); along with that you can also use MongoDB's own snapshot tooling.
Most people choose a managed service, but it still depends on your organization.
Backup methods:
MongoDB snapshot
Disk storage snapshot
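As a sketch of the application-level option, a weekly mongodump CronJob (the image tag, MongoDB host, and claim names are placeholders, not anything prescribed above):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: mongo-weekly-backup
spec:
  schedule: "0 3 * * 0"                # every Sunday at 03:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: mongodump
            image: mongo:6.0           # placeholder tag
            command: ["/bin/sh", "-c"]
            args:                      # dump over the network, gzip to the backup volume
            - >
              mongodump --host mongodb-svc --gzip
              --archive=/backup/dump-$(date +%F).gz
            volumeMounts:
            - name: backup
              mountPath: /backup
          volumes:
          - name: backup
            persistentVolumeClaim:
              claimName: mongo-backup-pvc   # placeholder claim for backup storage
```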
Filesystem:
Yes, it can handle TBs of data, as it is ultimately a disk volume or file system.
You can use a PVC as a file system, but you may hit scaling issues later: a disk-backed PVC is ReadWriteOnce, so if you want to scale the application along with its storage you have to move to ReadWriteMany.
There are several ways to achieve this; you can also mount a network file system into the pod directly, like AWS EFS, but you may find it slow for file operations.
For shared file systems there are various options in Kubernetes, such as CSI drivers, GlusterFS, MinIO, or EFS.

Does Kubernetes support persistent volumes shared between multiple nodes in a cluster?

I need to build an application that has many bare-metal nodes joined in a Kubernetes cluster and I need a shared persistent file system between those nodes. The nodes should be able to read-write in this file system simultaneously.
Bonus: is there a way to keep the file system alive even if the cluster crashes?
I read this article but can't find answers to this question.
This problem is very important to me because it is a requirement to my undergraduate paper.
Yes it does. What you're looking for is to set your AccessMode to ReadWriteMany.
Note that not all Volume Plugins provide ReadWriteMany.
Multiple pods might be reading/writing to the Volume plugin at the same time. If a node/pod were to restart, you would still have access to the volume.
For a full list of which volume plugins support ReadWriteMany, refer to the official documentation.
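A minimal ReadWriteMany sketch using the NFS volume plugin, which works on bare metal if you run or have access to an NFS server (the server address and export path are placeholders):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: shared-pv
spec:
  capacity:
    storage: 100Gi
  accessModes:
  - ReadWriteMany                         # many nodes may mount it read-write
  persistentVolumeReclaimPolicy: Retain   # data is kept even if the claim is deleted
  nfs:
    server: 192.168.1.10                  # placeholder NFS server
    path: /exports/shared
```

Pointing the PV at an NFS server that lives outside the cluster also addresses the bonus question: the files survive even if the cluster itself goes down.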

How to manage file uploads with GKE?

I'm trying to run an API (based on Symfony) on Kubernetes via Google Container Engine (GKE).
This API also allows users to store and download files, which are supposed to be saved somewhere.
I tried running it with 1 replica and noticed downtime of the service while the new container was being created. It looks like at least 2 replicas are needed to avoid downtime.
Taking that into consideration, I'm interested in these options:
A volume based on a Google Persistent Disk. Would this mean that all my replicas would be on the same node (ReadWriteOnce access mode)? If so, in case of a node failure, my service would not be available.
A volume based on Flocker (with a Persistent Disk backend). What is the recommended way to install it on GKE?
Is there another interesting option? What would you recommend?
Using GCS (like tex mentioned) is probably the simplest solution (and will be very fast from a GKE cluster). Here is an answer that may help.
If you have a specific need for local persistent storage, you can use Google Persistent Disks, but they can only be mounted as writable in one place.
PetSets (alpha at the time, later renamed StatefulSets) will provide better support for distributed persistent in-cluster storage, so you can also look into that if GCS doesn't work for you.

How can I use Google Cloud Storage in a container deployed to the Google Container Engine?

Background
I have a Java servlet application that runs in Tomcat, which runs in a Docker container, which runs on Google Container Engine. It is no big deal to extend the Docker image so that it also fetches and refreshes the certificates (there is only a single pod per domain, so no inter-pod communication is required). However, certbot needs to save its credentials and certificates somewhere, and the pod's filesystem seems like a bad idea because it is ephemeral and won't survive a pod restart. According to the table of storage options, Google Cloud Storage seems like a good idea: it is very cheap, the volume is auto-sized, and I can access it from multiple locations (I don't need to create one disk for each individual pod, which would be pretty much empty), including the web UI (the latter may be useful for debugging); throughput and latency are really no issue for this use case.
Question
I created a bucket and now I want to access that bucket from a container. Google describes here and yet again here that I can mount buckets using FUSE. What they don't mention is that you need to make the container privileged to use FUSE, which does not feel quite right to me. Additionally, I need to install the whole Google Cloud SDK and set up authentication (which I am going to store... where?). But actually I don't really need FUSE access. Just downloading the config on startup and uploading it after each refresh would be enough. So something that works similarly to SCP would do...
There is gcloud, which can access files from the command line without the need for FUSE, but it still needs to be initialized with credentials somehow.
Here user326502 mentions
It won't work with zero configuration if the App Engine SDK is installed [..] As long as the container lives on a Google Compute Engine instance you can access any bucket in the same project.
He explains further that I magically don't need any credentials when I just use the client library. I guess I could write my own copy application with those libraries, but the fact that I did not find anything like this from anyone on the net makes me feel that I am completely on the wrong track.
So how would one actually access a Google Cloud Storage bucket from within a container, as simply as possible?
You can use gsutil to copy from the bucket to the local disk when the container starts up.
If you are running in Google Container Engine, gsutil will use the service account of the cluster's nodes (for this, you'll need to specify the storage-ro scope when you create your cluster).
Alternatively, you can create a new service account and generate a JSON key for it. In Container Engine, you can store that key as a Kubernetes Secret and mount the secret in the pod that needs it. From that pod, configure gsutil to use the service account by calling gcloud auth activate-service-account --key-file /path/to/my/mounted/secret-key.json
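A hedged sketch of that secret-based variant (the secret, bucket, and path names are placeholders):

```yaml
# First create the secret from the downloaded JSON key, e.g.:
#   kubectl create secret generic gcs-key --from-file=key.json=./my-key.json
apiVersion: v1
kind: Pod
metadata:
  name: certbot-sync                  # placeholder name
spec:
  containers:
  - name: sync
    image: google/cloud-sdk:slim      # ships gcloud and gsutil
    command: ["/bin/sh", "-c"]
    args:                             # placeholder bucket and paths
    - >
      gcloud auth activate-service-account --key-file /secrets/key.json &&
      gsutil cp -r gs://my-bucket/letsencrypt /etc/letsencrypt
    volumeMounts:
    - name: gcs-key
      mountPath: /secrets
      readOnly: true
  volumes:
  - name: gcs-key
    secret:
      secretName: gcs-key             # the secret created above
```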
Alternatively, you can create a new service account, generating a JSON key. In Container Engine, you can store that key as a Kubernetes secret, and then mount the secret in the pod that needs to use it. From that pod, you'd configure gsutil to use the service account by calling gcloud auth activate-service-account--key-file /path/to/my/mounted/secret-key.json