Open-source Storage Options for a Kubernetes Cluster Running on Bare Metal

I have a 3-node cluster running on bare metal; it was set up using kubeadm.
Each node in the cluster has 100GB of disk space, adding up to 300GB in total.
I would like to utilize the 300GB of disk space available on them to run stateful pods like MySQL, PostgreSQL, MongoDB, Cassandra, etc. What are the different open-source options available for creating persistent volumes?
I haven't yet used Kubernetes v1.14, which offers local persistent volumes out of the box; that would be one option.
A second option is to run an NFS server on each node and use the NFS share from the respective machine to create PVs.
Apart from these, what other options can be looked at? Please suggest.
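For reference, here is roughly what a local persistent volume from the first option would look like; the node name, path, and size below are placeholders:

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: local-pv-node1            # placeholder name
    spec:
      capacity:
        storage: 50Gi
      accessModes:
        - ReadWriteOnce
      persistentVolumeReclaimPolicy: Retain
      storageClassName: local-storage
      local:
        path: /mnt/disks/vol1         # directory or mount on the node's 100GB disk
      nodeAffinity:                   # pins the PV to the node that owns the disk
        required:
          nodeSelectorTerms:
            - matchExpressions:
                - key: kubernetes.io/hostname
                  operator: In
                  values:
                    - node1           # placeholder node name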

Related

How to have data in a database with FastAPI persist across multiple nodes?

If I use the https://github.com/tiangolo/full-stack-fastapi-postgresql project generator, how would one be able to persist data across multiple nodes (either with Docker Swarm or Kubernetes)?
As I understand it, any PostgreSQL data in a volumes directory would be different for every node (e.g. every DigitalOcean droplet). In this case, a user may ask for their data, get directed by Traefik to a node with a different volumes directory, and get back different information than if they had been directed to another node. Is this correct?
If so, what would be the best approach to have multiple servers running a database work together and serve the same data?
On Kubernetes, persistent volumes associate storage with pods wherever they get scheduled in the cluster. They are managed by providing the cluster with storage classes, which map to drivers, which in turn map to some kind of SAN storage.
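As a sketch of that mapping (the class and provisioner names here are hypothetical; substitute whatever driver your cluster actually exposes):

    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: fast-san                      # hypothetical class name
    provisioner: example.com/san-driver   # placeholder for your SAN/CSI driver
    ---
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: postgres-data
    spec:
      storageClassName: fast-san          # the claim names the class...
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi                   # ...and the driver provisions the volume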
Docker / Docker Swarm has similar support via Docker volume plugins, but with the ascendancy of K8s there are virtually no active open-source projects, and most of the former commercial SAN driver vendors have migrated to K8s instead.
Nonetheless, depending on your tolerance, you can use a mix of direct NFS / FUSE mounts; there are also some not-entirely-abandoned Docker volume drivers available in the NFS / GlusterFS space.
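For example, Docker's built-in local driver can mount NFS directly, with no plugin at all; a minimal Compose sketch, assuming a reachable NFS server (address and export path are placeholders):

    version: "3.8"
    services:
      db:
        image: postgres:13
        environment:
          POSTGRES_PASSWORD: example              # placeholder credential
        volumes:
          - pgdata:/var/lib/postgresql/data
    volumes:
      pgdata:
        driver: local                             # built-in driver, no plugin required
        driver_opts:
          type: nfs
          o: "addr=192.168.1.10,rw,nfsvers=4"     # placeholder NFS server
          device: ":/exports/pgdata"              # placeholder export path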
The issue moby/moby #39624 tracks CSI support, which will hopefully land in 2021 and bring Swarm back in line with K8s.

MinIO for MariaDB in Kubernetes

I'm running a k3s single-node cluster and have the k3s local-path-provisioner as storage. Since I want to be able to add nodes in the future, I looked at MinIO to use on top of local-path as storage. But I'm not sure it's the right choice, because my workloads primarily use MariaDB for data, and I've read that an S3-compatible bucket isn't the best fit for database applications.
I hope you can help me figure this out.
If you don't want to use object storage, here are your options for running a local storage provisioner:
GlusterFS StorageClass
There isn't a lot of documentation on how to set it up, but if you know your way around GlusterFS it'll be a good option.
local-path-provisioner
It provides a way for Kubernetes users to utilize the local storage on each node (see the sketch after this list).
OpenEBS
It has a local volume storage engine, but I think this is not designed to work on a shared volume mount, and it ends up tying a pod to a specific node since the data "doesn't exist" on the other nodes.
Longhorn [recommended]
It creates a dedicated storage controller for each block device volume and synchronously replicates the volume across multiple replicas stored on multiple nodes.
Rook
Rook is a storage operator for Kubernetes that supports multiple storage backends. Don't use the NFS one, though; we hit a wall when using it with our DBs.
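As an illustration of the local-path-provisioner option, a claim is just a normal PVC against the local-path class (the class name below is what k3s ships by default):

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: mariadb-data
    spec:
      storageClassName: local-path     # default class shipped with k3s
      accessModes:
        - ReadWriteOnce                # local-path volumes are single-node
      resources:
        requests:
          storage: 5Gi

Note that, like the OpenEBS local engine above, this ties the consuming pod to the node where the volume was provisioned.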

HA PostgreSQL on Kubernetes

I wanted to deploy PostgreSQL as a database in my Kubernetes cluster. So far I've followed this tutorial.
By reading the whole thing, I understood that we claim static storage before starting PostgreSQL so that we keep the data if the pod fails, and that we can recover by pointing a replacement at the same storage.
What happens if we use two worker nodes and the pods containing the database migrate to another node? I don't think local storage will work then.
hostPath volumes are not recommended for production use because they are node-bound: if the pod is rescheduled to another node, the storage does not migrate with it, and if the node is lost, the data is lost with it.
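To make the failure mode concrete, this is the kind of node-bound mount being warned against (paths and credentials are placeholders):

    apiVersion: v1
    kind: Pod
    metadata:
      name: postgres-hostpath            # anti-pattern for production
    spec:
      containers:
        - name: postgres
          image: postgres:13
          env:
            - name: POSTGRES_PASSWORD
              value: example             # placeholder credential
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
      volumes:
        - name: data
          hostPath:
            path: /mnt/pgdata            # lives only on whichever node runs the pod
            type: DirectoryOrCreate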
For durable storage, use external block or file storage systems mounted on the nodes via a supported CSI driver.
For HA Postgres, I suggest you explore the Postgres Operator, which delivers easy-to-run, highly available PostgreSQL clusters on Kubernetes (K8s), powered by Patroni. It is configured only through Postgres manifests (CRDs), which eases integration into automated CI/CD pipelines without direct access to the Kubernetes API, promoting infrastructure as code over manual operations.
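A minimal cluster manifest for that operator looks roughly like this (team, names, sizes, and version are placeholders; check the operator docs for the current schema):

    apiVersion: "acid.zalan.do/v1"
    kind: postgresql
    metadata:
      name: acid-minimal-cluster
    spec:
      teamId: "acid"
      numberOfInstances: 2        # one primary plus one Patroni-managed replica
      volume:
        size: 10Gi
      users:
        app_user: []              # placeholder role
      databases:
        app_db: app_user          # database -> owning role
      postgresql:
        version: "13"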

Is adding disks beforehand a must before deploying OpenEBS?

I have a 3-node k8s cluster and a remote storage box with additional disks connected to it, and I want to utilize these disks. Is this use case supported by OpenEBS? Also, do I have to attach the disks to the nodes before deploying OpenEBS? Is this a prerequisite?
Sure, it's supported, and you need the disks attached when you set up OpenEBS as your block storage.
After you set it up, you can create volumes (PVCs, PVs) for Kubernetes and mount them on your pods for consumption.
You can set up OpenEBS on the Kubernetes cluster where you run your workloads using either Helm or kubectl.
Yes, OpenEBS supports storage with additional disks connected. As of 0.7 it has a feature called NDM (Node Disk Manager), which monitors the disks attached to the nodes. Once the disks are attached, you can create a pool on top of them and use it. For more details, see the document link.
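Once NDM has discovered the disks and a pool exists on top of them, consuming the pool is an ordinary claim against the matching storage class; the class name below is a placeholder for whatever your pool's class is called:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: cassandra-data
    spec:
      storageClassName: openebs-cstor-pool   # placeholder; match your pool's class
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 20Gi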

Kubernetes - Persistent storage for PostgreSQL

We currently have a 2-node Kubernetes environment running on bare-metal machines (no GCE), and now we wish to set up a PostgreSQL instance on top of this.
Our plan was to map a data volume for the PostgreSQL data directory onto the node using the volumeMounts option in Kubernetes. However, this would be a problem, because if the pod ever gets stopped, Kubernetes will re-launch it at random on one of the other nodes, so we have no guarantee that it will use the correct data directory on re-launch.
So what is the best approach for maintaining a consistent and persistent PostgreSQL data directory across a Kubernetes cluster?
One solution is to deploy HA PostgreSQL, for example https://github.com/sorintlab/stolon.
Another is to have some network storage attached to all nodes (NFS, GlusterFS) and use volumeMounts in the pods.
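A sketch of the second approach, assuming an NFS export reachable from every node (server address and path are placeholders):

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: pg-nfs-pv
    spec:
      capacity:
        storage: 10Gi
      accessModes:
        - ReadWriteMany              # NFS can be mounted from any node
      persistentVolumeReclaimPolicy: Retain
      nfs:
        server: 192.168.1.100        # placeholder NFS server
        path: /exports/pgdata        # placeholder export

A pod that claims this PV gets the same data directory no matter which node it is rescheduled onto.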