Handling PersistentVolumeClaim in DaemonSet - kubernetes

I have a DaemonSet that creates Flink task manager pods, one per node.
Nodes
Say I have two nodes
node-A
node-B
Pods
the DaemonSet would create
pod-A on node-A
pod-B on node-B
Persistent Volume Claim
I am on AKS and want to use azure-disk for Persistent Storage
According to the docs (https://learn.microsoft.com/en-us/azure/aks/azure-disks-dynamic-pv),
an Azure disk can only be attached to a single node at a time.
say I create
pvc-A for pv-A attached to node-A
pvc-B for pv-B attached to node-B
Question
How can I associate pod-A on node-A so that it uses pvc-A?
UPDATE:
After much googling, I found that it might be better/cleaner to use a StatefulSet instead. This does mean that you won't get DaemonSet features such as the guarantee of one pod per node.
https://medium.com/#zhimin.wen/persistent-volume-claim-for-statefulset-8050e396cc51
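For anyone else landing here, a rough sketch of the StatefulSet approach (the names, image, replica count and the managed-csi storage class below are illustrative assumptions, not taken from my setup):
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: flink-taskmanager
spec:
  serviceName: flink-taskmanager
  replicas: 2                        # roughly one per node, but not enforced like a DaemonSet
  selector:
    matchLabels:
      app: flink-taskmanager
  template:
    metadata:
      labels:
        app: flink-taskmanager
    spec:
      containers:
      - name: taskmanager
        image: flink:latest          # placeholder image
        volumeMounts:
        - name: data
          mountPath: /data
  volumeClaimTemplates:              # one PVC (and thus one Azure disk) is created per replica
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: managed-csi  # assumption: an AKS disk-backed storage class
      resources:
        requests:
          storage: 10Gi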

If you use a persistentVolumeClaim in your DaemonSet definition, and that claim is satisfied by a PV of type hostPath, your daemon pods will read and write to the local path defined by hostPath on whichever node they run on. This behavior helps you keep the storage separated per node while using only one PVC.
This might not directly apply to your situation, but I hope it helps whoever is searching for something like a "volumeClaimTemplate for DaemonSet" in the future.
Using the same example as cookiedough (thank you!)
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: x
  namespace: x
  labels:
    k8s-app: x
spec:
  selector:
    matchLabels:
      name: x
  template:
    metadata:
      labels:
        name: x
    spec:
      ...
      containers:
      - name: x
        ...
        volumeMounts:
        - name: volume
          mountPath: /var/log
      volumes:
      - name: volume
        persistentVolumeClaim:
          claimName: my-pvc
And that PVC is bound to a PV (Note that there is only one PVC and one PV!)
apiVersion: v1
kind: PersistentVolume
metadata:
  creationTimestamp: null
  labels:
    type: local
  name: mem
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 1Gi
  hostPath:
    path: /tmp/mem
    type: Directory
  storageClassName: standard
status: {}
Your daemon pods will actually use /tmp/mem on each node. (There's at most 1 daemon pod on each node so that's fine.)

The way to attach a PVC to your DaemonSet pod is no different from how you do it with other types of pods. Create your PVC and mount it as a volume onto the pod.
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: my-pvc
  namespace: x
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
This is what the DaemonSet manifest would look like:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: x
  namespace: x
  labels:
    k8s-app: x
spec:
  selector:
    matchLabels:
      name: x
  template:
    metadata:
      labels:
        name: x
    spec:
      ...
      containers:
      - name: x
        ...
        volumeMounts:
        - name: volume
          mountPath: /var/log
      volumes:
      - name: volume
        persistentVolumeClaim:
          claimName: my-pvc

Related

Kubernetes Persistent Volume: MountPath directory created but empty

I have 2 pods, one that writes files to a persistent volume and another that is supposed to read those files to make some calculations.
The first pod writes the files successfully and when I display the content of the persistent volume using print(os.listdir(persistent_volume_path)) I get all the expected files. However, the same command on the second pod shows an empty directory. (The mountPath directory /data is created but empty.)
This is the TFJob yaml file:
apiVersion: kubeflow.org/v1
kind: TFJob
metadata:
  name: pod1
  namespace: my-namespace
spec:
  cleanPodPolicy: None
  tfReplicaSpecs:
    Worker:
      replicas: 1
      restartPolicy: Never
      template:
        spec:
          containers:
          - name: tensorflow
            image: my-image:latest
            imagePullPolicy: Always
            command:
            - "python"
            - "./program1.py"
            - "--data_path=./dataset.csv"
            - "--persistent_volume_path=/data"
            volumeMounts:
            - mountPath: "/data"
              name: my-pv
          volumes:
          - name: my-pv
            persistentVolumeClaim:
              claimName: my-pvc
(respectively pod2 and program2.py for the second pod)
And this is the volume configuration:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
  namespace: my-namespace
  labels:
    type: local
    app: tfjob
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
  namespace: my-namespace
  labels:
    type: local
    app: tfjob
spec:
  storageClassName: manual
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: "/data"
Does anyone have any idea where's the problem exactly and how to fix it?
When two pods need to access a shared Persistent Volume with access mode ReadWriteOnce concurrently, they must be running on the same node, since a volume with this access mode can only be mounted on a single node at a time.
To achieve this, some form of Pod Affinity must be applied so that the two pods are scheduled to the same node.
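A minimal sketch of such an affinity rule, added under the second pod's template spec (the app: writer label and its value are assumptions about how the first, writing pod is labelled):
affinity:
  podAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: writer              # assumed label carried by the first (writer) pod
      topologyKey: kubernetes.io/hostname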

Does GKE Autopilot sometimes kill Pods and is there a way to prevent it for Critical Services?

I've been debugging a 10min downtime of our service for some hours now, and I seem to have found the cause, but not the reason for it. Our redis deployment in kubernetes was down for quite a while, causing neither django nor redis to be able to reach it. This caused a bunch of jobs to be lost.
There are no events for the redis deployment, but here are the first logs before and after the reboot:
before:
after:
I'm also attaching the complete redis.yml at the bottom. We're using GKE Autopilot, so I guess something caused the pod to restart? Resource usage is a lot lower than requested, at about 1% for both CPU and memory. Not sure what's going on here. I also couldn't find an annotation to tell Autopilot to leave a specific deployment alone.
redis.yml:
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: redis-disk
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: gce-ssd
  resources:
    requests:
      storage: "2Gi"
---
apiVersion: v1
kind: Service
metadata:
  name: redis
  labels:
    app: redis
spec:
  ports:
  - port: 6379
    name: redis
  clusterIP: None
  selector:
    app: redis
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
  labels:
    app: redis
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      volumes:
      - name: redis-volume
        persistentVolumeClaim:
          claimName: redis-disk
          readOnly: false
      terminationGracePeriodSeconds: 5
      containers:
      - name: redis
        image: redis:6-alpine
        command: ["sh"]
        args: ["-c", 'exec redis-server --requirepass "$REDIS_PASSWORD"']
        resources:
          requests:
            memory: "512Mi"
            cpu: "500m"
            ephemeral-storage: "1Gi"
        envFrom:
        - secretRef:
            name: env-secrets
        volumeMounts:
        - name: redis-volume
          mountPath: /data
          subPath: data
PersistentVolumeClaim is a Kubernetes object that decouples a storage resource request from the actual provisioning done by its associated PersistentVolume.
Given:
no declared PersistentVolume object
and Dynamic Provisioning being enabled on your cluster
Kubernetes will try to dynamically provision a persistent disk suitable for the underlying infrastructure, in your case a Google Compute Engine Persistent Disk, based on the requested storage class (gce-ssd).
The claim then results in an SSD-backed Persistent Disk being automatically provisioned for you, and once the claim is deleted (because the requesting pod is deleted during a downscale), the volume is destroyed.
To overcome this issue and avoid losing precious data, you have two alternatives:
At the PersistentVolume level
The reclaim policy is a property of the PersistentVolume, not of the claim. For a dynamically provisioned volume you can change it after the PV has been created by patching it, for example:
kubectl patch pv <pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
With Retain set, deleting the claim leaves the PersistentVolume in the Released state instead of destroying it, and the underlying data can be manually backed up.
At the StorageClass level
As a general recommendation, you should set the reclaimPolicy parameter to Retain (the default is Delete) on the StorageClass you use:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ssd
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
  replication-type: regional-pd
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
Additional parameters are recommended:
replication-type: set to regional-pd to provision a regional persistent disk that is replicated across two zones
volumeBindingMode: set to WaitForFirstConsumer so that the first consuming pod dictates the zonal topology
You can read more on all above StorageClass parameters in the kubernetes documentation.
A PersistentVolume with the same storage class name is then declared:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ssd-volume
spec:
  storageClassName: "ssd"
  capacity:
    storage: 2Gi
  accessModes:
  - ReadWriteOnce
  gcePersistentDisk:
    pdName: redis-disk
And the PersistentVolumeClaim would only declare the requested StorageClass name:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ssd-volume-claim
spec:
  storageClassName: "ssd"
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: "2Gi"
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
  labels:
    app: redis
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      volumes:
      - name: redis-volume
        persistentVolumeClaim:
          claimName: ssd-volume-claim
          readOnly: false
Declaring these objects prevents failures or scale-down operations from destroying the created PV, whether it was created manually by a cluster administrator or dynamically via Dynamic Provisioning.
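To verify the policy actually took effect, the RECLAIM POLICY column of kubectl get pv can be checked:
kubectl get pv
# NAME   CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM   STORAGECLASS   ...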

How to use claims as Volumes

Can anybody please tell me how to use a claim as a volume in Kubernetes?
Does a volume need to be created first?
The documentation does not give much information about it:
https://kubernetes.io/docs/concepts/storage/persistent-volumes/#claims-as-volumes
A PersistentVolumeClaim allows you to bind to an existing PersistentVolume. A PersistentVolume is a representation of a "real" storage device.
The detailed lookup algorithm is described on the following page, in the section Matching and binding: https://github.com/kubernetes/community/design-proposals/storage/persistent-storage.md
Since it is not very practical to declare each PersistentVolume manually, there is the option to use a StorageClass, which allows PersistentVolumes to be created dynamically.
You can either set the StorageClass in the PersistentVolumeClaim or define a default StorageClass for your cluster.
So when a Pod uses a PersistentVolumeClaim as a volume, a matching PersistentVolume is searched for first. If no matching PV can be found and a StorageClass is defined in the claim (or a default StorageClass exists), then a volume is dynamically created.
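As a small self-contained illustration (all names below are placeholders), a claim and a Pod that uses it as a volume look like this:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-claim
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  # storageClassName: standard   # optional; omit to fall back to the cluster's default StorageClass
---
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - name: data
      mountPath: /usr/share/nginx/html
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: example-claim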
You need to create a PersistentVolumeClaim, which helps you retain the data even if the pod gets deleted: the volume data is preserved on your server at a particular location, and the location where you want to keep that data is given in deployment.yaml. With a PersistentVolumeClaim, when the pod is recreated the data is still intact, i.e. it is fetched again from that location on your server.
Example For Mysql database with persistent volume claim on Kubernetes
PVC.yaml
---
apiVersion: "v1"
kind: "PersistentVolumeClaim"
metadata:
  name: "mysqldb-pvc-development"
  namespace: "development"
  labels:
    app: "mysqldb-development"
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: gp2
deployment.yaml
---
apiVersion: "apps/v1"
kind: "Deployment"
metadata:
  name: "mysqldb-development"
  namespace: "development"
spec:
  selector:
    matchLabels:
      app: "mysqldb-development"
  replicas: 1
  strategy:
    type: "RollingUpdate"
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
  minReadySeconds: 5
  template:
    metadata:
      labels:
        app: "mysqldb-development"
        tier: "mysql"
    spec:
      containers:
      - name: "mysqldb-development"
        image: "mysql_image_name"
        imagePullPolicy: "Always"
        env:
        - name: "MYSQL_ROOT_PASSWORD"
          value: "mysql_password"
        ports:
        - containerPort: 3306
          name: "mysql"
        volumeMounts:
        - name: "mysql-persistent-storage"
          mountPath: "/var/lib/mysql"
      volumes:
      - name: "mysql-persistent-storage"
        persistentVolumeClaim:
          claimName: "mysqldb-pvc-development"
      imagePullSecrets:
      - name: "mysqldb"
Note: the claimName in the deployment.yaml file and the name in the pvc.yaml file must be the same.
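Before rolling out the Deployment, you can confirm that the claim was actually bound (assuming the gp2 StorageClass exists in your cluster):
kubectl get pvc mysqldb-pvc-development -n development
# the STATUS column should show Bound once a volume has been provisioned for the claim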

How to persist latest queues after pod recreation

I am trying to run ActiveMQ in Kubernetes. I want to keep the queues even after the pod is terminated and recreated. So far I got the queues to stay even after pod deletion and recreation. But there is a catch: it seems to store the list of queues one step behind.
For example: I create 3 queues a, b, and c. I delete the pod and it is recreated. The queue list is empty. I then go ahead and create queues x and y. When I delete the pod and it gets recreated, it loads queues a, b, and c. If I add a queue d to it and the pod is recreated, it shows x and y.
I have created a ConfigMap as below, and I'm using it in my YAML file as well.
kubectl create configmap amq-config-map --from-file=/opt/apache-activemq-5.15.6/data
apiVersion: apps/v1
kind: Deployment
metadata:
  name: activemq-deployment-local
  labels:
    app: activemq
spec:
  replicas: 1
  selector:
    matchLabels:
      app: activemq
  template:
    metadata:
      labels:
        app: activemq
    spec:
      containers:
      - name: activemq
        image: activemq:1.0
        ports:
        - containerPort: 8161
        volumeMounts:
        - name: activemq-data-local
          mountPath: /opt/apache-activemq-5.15.6/data
          readOnly: false
      volumes:
      - name: activemq-data-local
        persistentVolumeClaim:
          claimName: amq-pv-claim-local
      - name: config-vol
        configMap:
          name: amq-config-map
---
apiVersion: v1
kind: Service
metadata:
  name: my-service-local
spec:
  selector:
    app: activemq
  ports:
  - port: 8161
    targetPort: 8161
  type: NodePort
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: amq-pv-claim-local
spec:
  storageClassName: manual
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
---
kind: PersistentVolume
apiVersion: v1
metadata:
  name: amq-pv-claim-local
  labels:
    type: local
spec:
  storageClassName: manual
  capacity:
    storage: 3Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: /tmp
When the pod is recreated, I want the queues to stay the same. I'm almost there, but I need some help.
You might be missing a setting in your PersistentVolume:
kind: PersistentVolume
apiVersion: v1
metadata:
  name: amq-pv-claim-local
  labels:
    type: local
spec:
  storageClassName: manual
  persistentVolumeReclaimPolicy: Retain
  capacity:
    storage: 3Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: /tmp
Also, there is still a good chance that this does not work due to the use of hostPath: hostPath means the data is stored on the node where the volume was first used. It does not migrate when the pod restarts on another node, which can lead to very odd behavior with a PV. Look at using NFS, Gluster, or any other cluster file system to store your data in a generically accessible path.
If you use a cloud provider, you can also have Kubernetes mount disks automatically, so GCP, AWS, Azure, etc. provide the storage for you and Kubernetes mounts it wherever it needs it.
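As a rough sketch of the NFS route (the server address and export path are placeholders), such a volume can be mounted from any node, so the data survives the pod being rescheduled elsewhere:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: amq-nfs-pv
spec:
  capacity:
    storage: 3Gi
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 10.0.0.10          # placeholder NFS server address
    path: /exports/activemq    # placeholder export path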
With this deployment plan, I'm able to have activemq working in a Kubernetes cluster running in AWS. However, I'm still trying to figure out why it does not work for mysql in the same way.
Simply running
kubectl create -f activemq.yaml
does the trick. Queues are persistent, and even terminating and restarting the pod brings the queues back up. They remain until the PersistentVolume and claim are removed. With this template, I don't even need to explicitly create a volume.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: activemq-deployment
  labels:
    app: activemq
spec:
  replicas: 1
  selector:
    matchLabels:
      app: activemq
  template:
    metadata:
      labels:
        app: activemq
    spec:
      securityContext:
        fsGroup: 2000
      containers:
      - name: activemq
        image: activemq:1.0
        ports:
        - containerPort: 8161
        volumeMounts:
        - name: activemq-data
          mountPath: /opt/apache-activemq-5.15.6/data
          readOnly: false
      volumes:
      - name: activemq-data
        persistentVolumeClaim:
          claimName: amq-pv-claim
---
apiVersion: v1
kind: Service
metadata:
  name: amq-nodeport-service
spec:
  selector:
    app: activemq
  ports:
  - port: 8161
    targetPort: 8161
  type: NodePort
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: amq-pv-claim
spec:
  #storageClassName: manual
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
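This works without declaring a PersistentVolume because the cluster (presumably with an EBS-backed default StorageClass, since this is AWS) provisions the disk dynamically; which class is the default can be checked with:
kubectl get storageclass
# the default class is marked "(default)" next to its name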

Bind several Persistent Volume Claims to one mount path

I am working on an application on Kubernetes in GCP and I need really huge SSD storage for it.
So I created a StorageClass resource, a PersistentVolumeClaim that requests 500Gi of space, and then a Deployment resource.
StorageClass.yaml:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: faster
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
PVC.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mongo-volume
spec:
  storageClassName: faster
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 500Gi
Deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mongo-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: mongo
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: mongo
    spec:
      containers:
      - image: mongo
        name: mongo
        ports:
        - containerPort: 27017
        volumeMounts:
        - mountPath: /data/db
          name: mongo-volume
      volumes:
      - name: mongo-volume
        persistentVolumeClaim:
          claimName: mongo-volume
When I applied the PVC, it was stuck in the Pending state for hours. I found out experimentally that it binds correctly with a maximum of 200Gi of requested storage space.
However, I can create several 200Gi PVCs. Is there a way to mount them at one path so they work as one big PVC in Deployment.yaml? Or maybe the 200Gi limit can be expanded?
I have just tested it in my own environment and it works perfectly, so the problem is with your quotas.
To check this, go to:
IAM & admin -> Quotas -> Compute Engine API: Local SSD (GB), for "your region"
and look at the amount you have already used.
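The same information is available from the CLI, assuming the gcloud SDK is configured for your project (the region name is a placeholder):
gcloud compute regions describe <your-region>
# the output lists the region's quotas (e.g. SSD_TOTAL_GB) with their limit and current usage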
I recreated the situation where I ran out of quota, and the PVC got stuck in Pending status just like yours.
It happens because you create a 500GB PVC for each pod.