MongoDB Community Kubernetes Operator and Custom Persistent Volumes

I'm trying to deploy a MongoDB replica set by using the MongoDB Community Kubernetes Operator in Minikube.
I followed the instructions on the official GitHub, so:
Install the CRD
Install the necessary roles and role bindings
Install the Operator
Deploy the replica set
By default, the operator creates three pods, each automatically linked to a new persistent volume claim bound to a new persistent volume, also created by the operator (so far so good).
However, I would like the data to be saved in specific volumes, each mounted on a specific host path. So I would need to create three persistent volumes, each mounted on its own host path, and then configure the replica set so that each pod connects to its respective persistent volume (perhaps using a matchLabels selector).
So I created three volumes by applying the following file:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mongodb-pv-00
  namespace: $NAMESPACE
  labels:
    type: local
    service: mongo
spec:
  storageClassName: manual
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/mongodata/00"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mongodb-pv-01
  namespace: $NAMESPACE
  labels:
    type: local
    service: mongo
spec:
  storageClassName: manual
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/mongodata/01"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mongodb-pv-02
  namespace: $NAMESPACE
  labels:
    type: local
    service: mongo
spec:
  storageClassName: manual
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/mongodata/02"
and then I set up the replica set configuration file in the following way, but it still fails to connect the pods to the volumes:
apiVersion: mongodbcommunity.mongodb.com/v1
kind: MongoDBCommunity
metadata:
  name: mongo-rs
  namespace: $NAMESPACE
spec:
  members: 3
  type: ReplicaSet
  version: "4.4.0"
  persistent: true
  podSpec:
    persistence:
      single:
        labelSelector:
          matchLabels:
            type: local
            service: mongo
        storage: 5Gi
        storageClass: manual
  statefulSet:
    spec:
      volumeClaimTemplates:
        - metadata:
            name: data-volume
          spec:
            accessModes: [ "ReadWriteOnce", "ReadWriteMany" ]
            resources:
              requests:
                storage: 5Gi
            selector:
              matchLabels:
                type: local
                service: mongo
            storageClassName: manual
  security:
    authentication:
      modes: ["SCRAM"]
  users:
    - ...
  additionalMongodConfig:
    storage.wiredTiger.engineConfig.journalCompressor: zlib
I can't find any documentation online, except mongodb.com_v1_custom_volume_cr.yaml. Has anyone faced this problem before? How could I make it work?

I think you could be interested in using volumes of type local. They work like this:
First, you create a storage class for the local volumes. Something like the following:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
Since it uses no-provisioner, it is usable only if you manually create local PVs. WaitForFirstConsumer, instead, prevents binding a PV to the PVC of a Pod that cannot be scheduled on the host where the PV is available.
Second, you create the local PVs. Similarly to how you created them in your example, something like this:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /path/on/the/host
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - the-node-hostname-on-which-the-storage-is-located
Notice the definition: it specifies the path on the host and the capacity, and then it states on which node of the cluster this PV can be used (via nodeAffinity). It also links the PV to the storage class we created earlier, so that if something (a claim template) requests storage with that class, it can now find this PV.
You can create 3 PVs on 3 different nodes, or 3 PVs on the same node at different paths; you can organize things as you wish.
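For instance, a second PV on the same node could differ only in its name and path (a sketch with illustrative names and paths):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv-1
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /path/on/the/host/1    # a different directory on the same node
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - the-node-hostname-on-which-the-storage-is-located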
Third, you can now use the local-storage class in a claim template. The claim template could be something similar to this:
volumeClaimTemplates:
  - metadata:
      name: the-name-of-the-pvc
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "local-storage"
      resources:
        requests:
          storage: 5Gi
And each Pod of the StatefulSet will try to be scheduled on a node with a local-storage PV available.
Remember that with local storage or, in general, with volumes that use host paths, you may want to spread the Pods of your app across different nodes, so that the app can survive the failure of a single node.
In case you want to decide which Pod binds to which volume, the easiest way is to create one PV at a time and wait for its Pod to bind to it before creating the next one. It's not optimal, but it's the simplest approach.
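Applied to the MongoDBCommunity resource from the question, a minimal sketch of the statefulSet override could look like the following (I'm assuming the operator's data claim template is named data-volume, as in the sample custom-volume manifest; verify the claim names the operator generates in your cluster; security and users are omitted for brevity):
apiVersion: mongodbcommunity.mongodb.com/v1
kind: MongoDBCommunity
metadata:
  name: mongo-rs
spec:
  members: 3
  type: ReplicaSet
  version: "4.4.0"
  persistent: true
  statefulSet:
    spec:
      volumeClaimTemplates:
        - metadata:
            name: data-volume          # assumed claim template name used by the operator
          spec:
            accessModes: [ "ReadWriteOnce" ]
            storageClassName: local-storage
            resources:
              requests:
                storage: 5Gi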

Related

kubernetes ignoring persistentvolume

I have created a persistent volume:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-volume
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "C:/Users/xxx/Desktop/pv"
I want my MySQL StatefulSet pods to save their data on it.
So I wrote the volumeClaimTemplates:
volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
I thought this would request the persistent storage from the only persistent volume I have. Instead, this is what happens:
StatefulSets require you to use storage classes in order to bind the correct PVs with the correct PVCs.
The correct way to make StatefulSets mount local storage is to use volumes of type local; take a look at the procedure below.
First, you create a storage class for the local volumes. Something like the following:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
It uses no-provisioner, so it will not automatically provision PVs; you'll need to create them manually, but that's exactly what you want for local storage.
Second, you create your local PV, something like the following:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-volume
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: "C:/Users/xxx/Desktop/pv"
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - the-node-hostname-on-which-the-storage-is-located
This definition specifies the local path on the node, but it also forces the PV to be used on a specific node (the one matching the nodeSelectorTerms).
It also links this PV to the storage class created earlier. This means that if a StatefulSet requires storage with that storage class, it will receive this disk (provided the requested space is less than or equal to its capacity, of course).
Third, you can now link the StatefulSet:
volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "local-storage"
      resources:
        requests:
          storage: 5Gi
When the StatefulSet Pod needs to be scheduled for the first time, the following will happen:
A PVC will be created and it will be bound to the PV you just created
The Pod will be scheduled to run on the node to which the bound PV is restricted
UPDATE:
In case you want to use hostPath storage instead of local storage (for example because you are on minikube, where it is supported out of the box and therefore easier), you need to change the PV declaration a bit, to something like the following:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-volume
spec:
  storageClassName: local-storage
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 5Gi
  hostPath:
    path: /data/pv0001/
Now, the /data directory and all its content is persisted on the host (so if minikube gets restarted, it's still there), but if you want to mount specific directories of your host, you need to use minikube mount, for example:
minikube mount <source directory>:<target directory>
For example, you could do:
minikube mount C:/Users/xxx/Desktop/pv:/host/my-special-pv
and then you could use /host/my-special-pv as the hostPath inside the PV declaration.
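Such a PV could then look like the following sketch (the path /host/my-special-pv is just the example target from the mount command above):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-volume
spec:
  storageClassName: local-storage
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 5Gi
  hostPath:
    path: /host/my-special-pv    # the target directory of the minikube mount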
More info can be read in the docs.

Unable to setup couchbase operator 1.2 with persistent volume on local storage class

I am trying to set up Couchbase Operator 1.2 on my local system.
I followed these steps:
Install the Couchbase Admission Controller.
Deploy the Couchbase Autonomous Operator.
Deploy the Couchbase Cluster.
Access CouchBase from UI.
But the problem with this is that as soon as the system or Docker resets, or the pod restarts, the cluster's data is lost.
To fix this, I tried adding a persistent volume with a local storage class as mentioned in the docs, but the result was still the same: the pod still gets reset, and I am unable to find the reason.
Can anyone advise on how to do this with a persistent volume on a local storage class? I have successfully created a storage class; I'm just having problems getting the cluster up while keeping its data consistent.
Here are the YAMLs that I used to create the storage class, the PV and the PV claim:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: myssd
provisioner: local
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: couchbase-data-2
  labels:
    type: local
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: myssd
  hostPath:
    path: "/home/<user>/cb-storage/"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-test-claim-2
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: myssd
  resources:
    requests:
      storage: 1Gi
Thanks in advance
A persistent volume using hostPath is not durable. Use a local volume instead. Compared to hostPath volumes, local volumes can be used in a durable and portable manner without manually scheduling Pods to nodes, because the system is aware of the volume's node constraints by looking at the node affinity on the PersistentVolume.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolume
metadata:
name: couchbase-data
spec:
capacity:
storage: 10Gi
accessModes:
- ReadWriteOnce
persistentVolumeReclaimPolicy: Retain
storageClassName: local-storage
local:
path: /home/<User>/cb-storage/
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- node1
- node2
- node3
- node4
Alternatively, you don't need to create PersistentVolumes manually if you configure the local volume provisioner as discussed here, so that dynamic provisioning with the local storage class happens.
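For completeness, a claim using the local-storage class (in place of the original myssd claim) could look like the following sketch, which reuses the claim name my-test-claim-2 from the question:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-test-claim-2    # name reused from the question
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-storage
  resources:
    requests:
      storage: 1Gi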

pod has unbound immediate persistentvolumeclaims after deleting namespace

I have configured a Postgres pod with static provisioning of a persistent volume in my local environment. It works fine the first time, but when I delete the namespace and rerun the pod, its status is Pending and it gives me this error:
pod has unbound immediate persistentvolumeclaims
I tried to remove the storageClassName from the PersistentVolumeClaim, but that didn't work.
I also tried to change the storage class from manual to block storage, but I had the same problem.
My YAML file:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: task-pv-volume
  namespace: manhattan
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/opt/manhattan/current/pgdata"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: task-pv-claim
  namespace: manhattan
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: postgres
  namespace: manhattan
spec:
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: task-pv-claim
  containers:
    - name: dbr-postgres
      image: postgres-custome
      tty: true
      volumeMounts:
        - mountPath: "/var/lib/pgsql/9.3/data"
          name: task-pv-storage
  nodeSelector:
    kubernetes.io/hostname: k8s-master
I want my pod to be running even when I delete the namespace and re-apply the pod YAML file.
Data will be kept on the Kubernetes node because hostPath uses the node's filesystem to store the data. The problem is that if you have multiple nodes, your pod can start on any other node. To solve this, you can either specify the node on which you want your pod to start or set up NFS or GlusterFS on your Kubernetes nodes. This might be the cause of your problem.
There is one more thing I can think of that might be your issue. When you remove a namespace, all the namespaced Kubernetes resources inside it (including the PVC and the Pod) are removed as well, and there is no easy way to recover them. This means that you have to recreate the PVC and the Pod in the new namespace; the PV itself is cluster-scoped, but once its claim is deleted it stays in the Released state and will not bind to a new claim until it is made available again.
I solved this issue by setting persistentVolumeReclaimPolicy to Recycle. Now I can re-bind the persistent volume even after deleting the namespace and recreating it:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: task-pv-volume
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  hostPath:
    path: "/opt/manhattan/current/pgdata"

Kubernetes, how to link a PersistentVolume to a volumeClaim

I'm a newbie in the Kubernetes world and I'm trying to figure out how a volumeClaim or volumeClaimTemplates defined in a StatefulSet can be linked to a specific PersistentVolume.
I've followed some tutorials to understand and set up a local PersistentVolume. If I take Elasticsearch as an example, when the StatefulSet starts, the PersistentVolumeClaim is bound to the PersistentVolume.
As you know, for a local PersistentVolume we must define the local path to the storage destination.
For Elasticsearch I've defined something like this:
local:
  path: /mnt/kube_data/elasticsearch
But in a real project there is more than one persistent volume, so I will have more than one folder under /mnt/kube_data. How does Kubernetes select the right persistent volume for a persistent volume claim?
I don't want Kubernetes to put database data in a persistent volume created for another service.
Here is the configuration for Elasticsearch:
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: elasticsearch-sts
spec:
  serviceName: elasticsearch
  replicas: 1
  [...]
      containers:
        - name: elasticsearch
          image: docker.elastic.co/elasticsearch/elasticsearch:6.4.2
          volumeMounts:
            - name: elasticsearch-data
              mountPath: /usr/share/elasticsearch/data
  volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: local-storage
        resources:
          requests:
            storage: 10Gi
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-elasticsearch
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/elasticsearch
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: node-role.kubernetes.io/master
              operator: Exists
---
You need a claimRef in the PersistentVolume definition containing the name of the PVC to which you want to bind your PV. The claimRef should also contain the namespace in which the PVC resides, because PVs are cluster-scoped while PVCs are namespaced: a PVC with the same name can exist in two different namespaces, so it is mandatory to provide the namespace along with the PVC name, even when the PVC resides in the default namespace.
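As a sketch, the PV from the question could be pre-bound like this (assuming the StatefulSet runs in the default namespace, so the claim generated by the volumeClaimTemplates is named elasticsearch-data-elasticsearch-sts-0, i.e. <template name>-<pod name>):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-elasticsearch
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  claimRef:
    name: elasticsearch-data-elasticsearch-sts-0    # <claim template name>-<pod name>
    namespace: default                              # assumed namespace
  local:
    path: /mnt/elasticsearch
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: node-role.kubernetes.io/master
              operator: Exists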
You can refer to the following answer for PV, PVC and StatefulSet YAML files for local storage:
Is it possible to mount different pods to the same portion of a local persistent volume?
Hope this helps.

Snapshotting on google cloud/Kubernetes when using storageClass persistent volumes

StorageClasses are the new method of specifying dynamic persistent volume claim (PVC) dependencies within Kubernetes. This avoids the need to explicitly provision one directly with the cloud provider (in my case Google Container Engine (GKE)).
Definition for the StorageClasses (GKE already has a default for standard class)
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: fast
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
  zone: europe-west1-b
Definition for the actual PVC
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-server-pvc
  namespace: staging
spec:
  accessModes: [ "ReadWriteOnce" ]
  resources:
    requests:
      storage: 100Gi
  storageClassName: "standard"
Here is the result of kubectl get storageclass:
NAME                 TYPE
fast                 kubernetes.io/gce-pd
standard (default)   kubernetes.io/gce-pd
Here is the result of kubectl get pvc:
NAME             STATUS    VOLUME                                     CAPACITY   ACCESSMODES   STORAGECLASS   AGE
nfs-pvc          Bound     nfs                                        1Mi        RWX                          119d
nfs-server-pvc   Bound     pvc-905a810b-3f13-11e7-82f9-42010a840072   100Gi      RWO           standard       81d
I would like to continue taking snapshots of the volumes, but the dynamic nature of the volume names created (in this case pvc-905a810b-3f13-11e7-82f9-42010a840072) means I cannot continue with the following command that I had been using via cron (note that the "nfs" name is now incorrect):
gcloud compute --project "XXX-XXX" disks snapshot "nfs" --zone "europe-west1-b" --snapshot-names "nfs-${DATE}"
I guess this boils down to whether Kubernetes allows explicit volume naming through StorageClass-based PVCs. The docs don't seem to allow this. Any ideas?
One approach is to manually create the PV and give it a stable name that you can use in your scripts. You can use gcloud commands to create the underlying PD disks. When you create the PV, give it a label:
apiVersion: "v1"
kind: "PersistentVolume"
metadata:
name: my-pv-0
labels:
pdName: my-pv-0
spec:
capacity:
storage: "10Gi"
accessModes:
- "ReadWriteOnce"
storageClassName: fast
gcePersistentDisk:
fsType: "ext4"
pdName: "my-pd-0"
Then attach it to the PVC using a selector:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: my-pvc-0
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: fast
  selector:
    matchLabels:
      pdName: my-pv-0