Monitoring Kubernetes PVC disk usage - kubernetes

I'm trying to monitor Kubernetes PVC disk usage. I need the amount of disk space currently in use on a PersistentVolumeClaim. I found this command:
kubectl get --raw /api/v1/persistentvolumeclaims
It returns:
"status": {
  "phase": "Bound",
  "accessModes": [
    "ReadWriteOnce"
  ],
  "capacity": {
    "storage": "1Gi"
  }
}
But it only brings me the full capacity of the disk, and as I said I need the used one
Does anyone know which command could return this information to me?

I don't have a definitive answer, but I hope this will help you. Also, I would be interested if someone has a better answer.
Get current usage
The PersistentVolume subsystem provides an API for users and administrators that abstracts details of how storage is provided from how it is consumed.
-- Persistent Volume | Kubernetes
As stated in the Kubernetes documentation, PV (PersistentVolume) and PVC (PersistentVolumeClaim) are abstractions over storage. As such, I do not think you can inspect PV or PVC, but you can inspect the storage medium.
To get the usage, create a debugging pod which will use your PVC, from which you will check the usage. This should work depending on your storage provider.
# volume-size-debugger.yaml
kind: Pod
apiVersion: v1
metadata:
  name: volume-size-debugger
spec:
  volumes:
  - name: debug-pv
    persistentVolumeClaim:
      claimName: <pvc-name>
  containers:
  - name: debugger
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - mountPath: "/data"
      name: debug-pv
Apply the above manifest with kubectl apply -f volume-size-debugger.yaml, and open a shell inside it with kubectl exec -it volume-size-debugger -- sh. Inside the shell, run du -sh /data to get the usage of the mounted volume in a human-readable format.
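If the debugging container ships Python (busybox does not, so this assumes an image such as the official python one), roughly the same number can be read programmatically. This is only a sketch: volume_usage is a made-up helper name, and it reports filesystem-level usage the way df does, which can differ from what du sums up file by file.

```python
import shutil


def volume_usage(mount_path: str) -> dict:
    """Return capacity, used and free bytes of the filesystem backing mount_path."""
    usage = shutil.disk_usage(mount_path)  # statvfs-based, comparable to `df`
    percent = round(100 * usage.used / usage.total, 1) if usage.total else 0.0
    return {
        "total_bytes": usage.total,
        "used_bytes": usage.used,
        "free_bytes": usage.free,
        "used_percent": percent,
    }


if __name__ == "__main__":
    # Inside the Pod this would be "/data" (the mountPath from the manifest);
    # "/" is used here only so the sketch also runs outside the cluster.
    print(volume_usage("/"))
```

Inside the Pod, calling volume_usage("/data") would describe the filesystem behind the PVC, provided the storage backend exposes a separate filesystem for it.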
Monitoring
As I am sure you have noticed, this is not especially useful for monitoring. It may be useful for a one-time check from time to time, but not for monitoring or low disk space alerts.
One way to set up monitoring would be to run a sidecar container, similar to the debugging pod above, and gather metrics from it. One such example seems to be node_exporter.
Another way would be to use CSI (Container Storage Interface). I have not used CSI and do not know enough about it to really explain more. But here are a couple of related issues and related Kubernetes documentation:
Monitoring Kubernetes PersistentVolumes - prometheus-operator
Volume stats missing - csi-digitalocean
Storage Capacity | Kubernetes
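For the metrics-based route, the kubelet publishes per-PVC gauges named kubelet_volume_stats_used_bytes and kubelet_volume_stats_capacity_bytes, provided the volume plugin or CSI driver reports stats at all (the csi-digitalocean issue above is about that gap). Below is a minimal sketch of reading them from the Prometheus text format. The sample text, its values and the PVC name media are invented for illustration, and pvc_usage is a hypothetical helper, not part of any library.

```python
import re

# Hypothetical sample in the Prometheus text exposition format. Real data
# would come from the kubelet metrics endpoint or from a Prometheus server.
SAMPLE = """\
# TYPE kubelet_volume_stats_used_bytes gauge
kubelet_volume_stats_used_bytes{namespace="default",persistentvolumeclaim="media"} 2.097152e+07
# TYPE kubelet_volume_stats_capacity_bytes gauge
kubelet_volume_stats_capacity_bytes{namespace="default",persistentvolumeclaim="media"} 1.073741824e+09
"""

FIELDS = {
    "kubelet_volume_stats_used_bytes": "used",
    "kubelet_volume_stats_capacity_bytes": "capacity",
}
LINE = re.compile(r"^(?P<name>\w+)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$")
LABEL = re.compile(r'(\w+)="([^"]*)"')


def pvc_usage(metrics_text: str) -> dict:
    """Map (namespace, pvc name) -> {'used': bytes, 'capacity': bytes}."""
    result = {}
    for line in metrics_text.splitlines():
        match = LINE.match(line.strip())
        if not match or match["name"] not in FIELDS:
            continue  # comments and unrelated metrics are skipped
        labels = dict(LABEL.findall(match["labels"]))
        key = (labels.get("namespace"), labels.get("persistentvolumeclaim"))
        result.setdefault(key, {})[FIELDS[match["name"]]] = float(match["value"])
    return result


if __name__ == "__main__":
    for (namespace, pvc), stats in pvc_usage(SAMPLE).items():
        ratio = stats["used"] / stats["capacity"]
        print(f"{namespace}/{pvc}: {ratio:.1%} used")
```

In practice an alerting rule in Prometheus would do this comparison for you; the parser only shows which two series carry the used and total sizes.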

+1 to touchmarine's answer, but I'd like to expand on it a bit and add my three cents.
But it only brings me the full capacity of the disk, and as I said I
need the used one
A PVC is an abstraction that represents a request for storage, and it simply doesn't store information such as disk usage. As a higher-level abstraction, it doesn't care at all how the underlying storage is used by its consumer.
@touchmarine, instead of using a Pod whose only function is to sleep, so that you have to attach to it manually every time you need to check the disk usage, I would propose something like this:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      volumes:
      - name: media
        persistentVolumeClaim:
          claimName: media
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
        volumeMounts:
        - mountPath: "/data"
          name: media
      - name: busybox
        image: busybox
        command: ["/bin/sh"]
        args: ["-c", "while true; do du -sh /data; sleep 10;done"]
        volumeMounts:
        - mountPath: "/data"
          name: media
It can of course be a single-container busybox Pod, as in @touchmarine's example, but here I decided to also show how it can be used as a sidecar running next to the nginx container within a single Pod.
As it runs a simple shell script (an infinite while loop that prints the current disk usage to standard output), the result can be read with kubectl logs, without using kubectl exec and attaching to the Pod:
$ kubectl logs nginx-deployment-56bb5c87f6-dqs5h busybox
20.0K /data
20.0K /data
20.0K /data
I guess it can also be used more effectively to configure some sort of disk usage monitoring.
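As one possible direction, the log lines can be turned into numbers and compared with a limit. The sketch below assumes the du -sh output format shown above and binary units (K = 1024 bytes); parse_du_line and over_threshold are hypothetical helper names, and the 800 MiB limit is just an example value.

```python
UNITS = {"B": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_du_line(line: str) -> tuple:
    """Parse one `du -sh` line such as '20.0K /data' into (bytes, path)."""
    size, path = line.split(None, 1)
    if size[-1].isalpha():
        number, unit = float(size[:-1]), size[-1].upper()
    else:
        number, unit = float(size), "B"  # assumption: no suffix means bytes
    return int(number * UNITS[unit]), path.strip()


def over_threshold(log_lines, limit_bytes: int) -> bool:
    """Return True if the most recent reading is above limit_bytes."""
    readings = [parse_du_line(line) for line in log_lines if line.strip()]
    return bool(readings) and readings[-1][0] > limit_bytes


if __name__ == "__main__":
    logs = ["20.0K /data", "20.0K /data", "20.0K /data"]
    limit = 800 * 1024**2  # example limit: 800 MiB of a 1Gi claim
    print("over limit:", over_threshold(logs, limit))
```

The lines would come from kubectl logs for the busybox container; anything that needs history or alert routing is better served by a real metrics pipeline.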

Related

Why is my Host Path Persistent Volume reachable from all pods?

I'm pretty stuck with this learning step of Kubernetes named PV and PVC.
What I'm trying to do here is understand how to handle shared read-write volume on multiple pods.
What I understood here is that a PVC cannot be shared between pods unless an NFS-like storage class has been configured.
I'm still using my hostPath storage class, and I tried the following (on Docker Desktop and on a 3-node MicroK8s cluster):
This PVC with dynamic hostPath provisioning:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-desktop
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Mi
Deployment with 3 replicated pods writing on the same PVC.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox
spec:
  replicas: 3
  selector:
    matchLabels:
      app: busybox
  template:
    metadata:
      labels:
        app: busybox
    spec:
      containers:
        - name: busybox
          image: library/busybox:stable
          command: ["/bin/sh"]
          args:
            ["-c", 'while true; do echo "1: $(hostname)" >> /root/index.html; sleep 2; done;',]
          volumeMounts:
            - mountPath: /root
              name: vol-desktop
      volumes:
        - name: vol-desktop
          persistentVolumeClaim:
            claimName: pvc-desktop
Nginx server for serving volume content
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:stable
          volumeMounts:
            - mountPath: /usr/share/nginx/html
              name: vol-desktop
          ports:
            - containerPort: 80
      volumes:
        - name: vol-desktop
          persistentVolumeClaim:
            claimName: pvc-desktop
According to what I understood from the documentation, this should not be possible, but in reality everything ran smoothly and my Nginx server displayed the up-to-date index.html file.
It actually worked on both a single-node cluster and a multi-node cluster.
What am I not getting here? Why does this work?
Is every pod mounting its own host path volume on start?
How can hostPath storage work between multiple nodes?
EDIT: For the multi-node case, a network folder had been set up on the same storage path of each machine, which is why everything was replicated successfully. I didn't understand that the same host path is created on each node where that PVC is mounted.
To anyone with the same problem: each node mounting this hostPath PVC will have its own folder created at the PV path.
So without network replication between nodes, only pods on the same node will share the same folder.
This is why it's discouraged on a multi-node cluster: you cannot predict which node a pod will land on.
Thanks!
how to handle shared read-write volume on multiple pods.
Redesign your application to avoid it. It tends to be fragile and difficult to manage multiple writers safely; you depend on your application correctly performing things like file locking, on the underlying shared filesystem implementation handling things properly, and on the system being tolerant of any sort of network hiccup that might happen.
The example you give is something that frequently appears in Docker Compose setups: have an application with a mix of backend code and static files, and then try to publish the static files at runtime through a volume to a reverse proxy. Instead, you can build an image that copies the static files at build time:
FROM nginx
ARG app_version=latest
COPY --from=my/app:${app_version} /app/static /usr/share/nginx/html
Have your CI system build this and push it immediately after the backend image is built. The resulting image serves the corresponding static files, but doesn't require a shared volume or any manual management of the volume contents.
For other types of content, consider storing data in a database, or use an object-storage service that maintains its own backing store and can handle the concurrency considerations. Then most of your pods can be totally stateless, and you can manage the data separately (maybe even outside Kubernetes).
How can hostPath storage work between multiple nodes?
It doesn't. It's an instruction to Kubernetes, on whichever node the pod happens to be scheduled on, to mount that host directory into the container. There's no management of any sort of the directory content; if two pods get scheduled on the same node, they'll share the directory, and if not, they won't; and if your pod's Deployment is updated and the pod is deleted and recreated somewhere else, it might not be the same node and might not have the same data.
With some very specific exceptions you shouldn't use hostPath volumes at all. The exceptions are things like log collectors run as DaemonSets, where there is exactly one pod on every node and you're interested in picking up the host-directory content that is different on each node.
In your specific setup either you're getting lucky with where the data producers and consumers are getting colocated, or there's something about your MicroK8s setup that's causing the host directories to be shared. It is not in general reliable storage.
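To make the "same node shares, different node doesn't" behaviour concrete, here is a toy model in Python. It has nothing to do with the real kubelet code; ToyCluster, the node names and the host path are all invented for illustration, and the only point is that the directory is keyed by node as well as by path.

```python
from collections import defaultdict


class ToyCluster:
    """Toy model of hostPath: each node keeps its own copy of a host directory."""

    def __init__(self):
        # (node name, host path) -> lines appended to index.html on that node
        self._files = defaultdict(list)

    def append(self, node: str, host_path: str, line: str) -> None:
        """A pod scheduled on `node` appends a line to the file under host_path."""
        self._files[(node, host_path)].append(line)

    def read(self, node: str, host_path: str) -> list:
        """A pod scheduled on `node` reads the file; other nodes stay invisible."""
        return list(self._files.get((node, host_path), []))


if __name__ == "__main__":
    cluster = ToyCluster()
    path = "/var/hostpath/pvc-desktop"  # hypothetical PV path on every node
    cluster.append("node-a", path, "1: busybox-1")
    cluster.append("node-a", path, "1: busybox-2")
    cluster.append("node-b", path, "1: busybox-3")
    print("nginx on node-a sees:", cluster.read("node-a", path))
    print("nginx on node-b sees:", cluster.read("node-b", path))
```

An nginx pod on node-a only ever sees what the writers on node-a produced, unless something outside Kubernetes (such as the network folder mentioned in the question's edit) synchronises the directories.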

Best practice on reading and writing to a persistent Volume in a live Kubernetes Cluster

I am designing a Kubernetes system which will require storing audio files. To do this I would like to setup a persistent storage volume making use of a stateful set.
I have found a few tutorials on how to set something like this up, but I am unsure once I have created it how to read/write the files. What would be the best approach to do this. I will be using a flask app, but if I could just get a high level approach then I can find the exact libraries myself.
Setting aside how it should be implemented in code and any specific tuning for audio files, you can use your Persistent Volume just as you would read and write data in an ordinary directory (as correctly pointed out by user @zerkms in the comments).
Answering this specific part of the question:
but I am unsure once I have created it how to read/write the files.
Assuming that you've created your StatefulSet in the following way:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: ubuntu-sts
spec:
  selector:
    matchLabels:
      app: ubuntu # has to match .spec.template.metadata.labels
  serviceName: "ubuntu"
  replicas: 1
  template:
    metadata:
      labels:
        app: ubuntu
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: ubuntu
        image: ubuntu
        command:
        - sleep
        - "infinity"
        volumeMounts:
        - name: audio-volume
          mountPath: /audio
  volumeClaimTemplates:
  - metadata:
      name: audio-volume
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: standard
      resources:
        requests:
          storage: 1Gi
Take a look at the part below (it shows where your volume will be mounted):
        volumeMounts:
        - name: audio-volume
          mountPath: /audio # <-- HERE!
Disclaimer!
This example has a 1:1 Pod-to-Volume relation. If your use case is different, you will need to refer to the Kubernetes documentation about accessModes.
You can exec into this Pod to see how the volume behaves before developing your application further:
$ kubectl exec -it ubuntu-sts-0 -- /bin/bash
root@ubuntu-sts-0:/# echo "Hello from your /audio directory!" > /audio/hello.txt
root@ubuntu-sts-0:/# cat /audio/hello.txt
Hello from your /audio directory!
A side note!
If you are using a cloud-provider managed Kubernetes cluster like GKE, EKS or AKS, please refer to its documentation about storage options.
I encourage you to check the official documentation on Persistent Volumes:
Kubernetes.io: Docs: Concepts: Storage: Persistent Volumes
Also, please take a look at the documentation regarding StatefulSet:
Kubernetes.io: Docs: Concepts: Workloads: Controllers: Statefulset
Additional resources:
Guru99.com: Reading and writing files in Python

How to write to gcePersistentDisk while mounted to multiple Kubernetes pod

I'm currently mounting a gcePersistentDisk to each pod in my kubernetes deployment. Since I want multiple pods to read from the disk, I have to mount it as read only. My deployment yaml file looks like this:
apiVersion: extensions/v1beta1
kind: Deployment
spec:
  replicas: 1
  ...
  ...
  template:
    ...
    ...
    spec:
      containers:
      - image: ...
        ...
        ...
        volumeMounts:
        - mountPath: /my-volume
          name: my-volume
          readOnly: true
        ...
        ...
      volumes:
      - name: my-volume # must match the volumeMounts name above
        gcePersistentDisk:
          pdName: my-disk
          fsType: ext4
          readOnly: true
Right now, in order to write new stuff to the disk, I need to scale the deployment to 0, then start a kubernetes job that mounts the disk to a single pod that has read / write access, write to the disk and then scale the deployment up again.
Is there a way I can do this without taking down all my pods?
Is it possible/recommended to do something like "hot-swapping" persistent disks in kubernetes deployments?
Looking at the requirements:
1) With the current setup there is no other choice: the Pods need to be scaled down every time.
2) You can use a different type of PV that supports the ReadWriteMany access mode [1][2].
3) Hot-swap: do you mean changing the Deployment (kubectl apply)? I'm not sure; this needs clarification.
4) Another option is to use NFS [2], but that is obviously a whole different approach.
[1] https://cloud.google.com/kubernetes-engine/docs/concepts/persistent-volumes#access_modes
[2] Access Modes https://kubernetes.io/docs/concepts/storage/persistent-volumes/

Is NFS hard- or soft-mounted to Kubernetes Pods?

Is NFS hard- or soft-mounted when making a Pod in Kubernetes with an NFS-volume?
As I understand this might have an impact on how it handles a timeout?
Example yaml:
apiVersion: v1
kind: Pod
metadata:
  name: nfs-web
spec:
  containers:
  - name: web
    image: nginx
    ports:
    - name: web
      containerPort: 80
    volumeMounts:
    - name: nfs
      mountPath: "/usr/share/nginx/html"
  volumes:
  - name: nfs
    nfs:
      server: nfs-server.default.kube.local
      path: "/"
I believe that NFS mounts for a Pod use the defaults of the NFS client on the node, since the kubelet performs the mount on the host before the container starts. I can't be 100% certain (I'm not deeply familiar with the code), but in my experience the NFS mounts are mounted with the hard option, which is the default in most implementations of NFS (see man nfs for more details on your OS; soft is often considered dangerous).
The NFSVolumeSource struct doesn't appear to have the ability to know about mount settings (except read-only), and I don't see any hard-coded options in the NFS volume code.
You can check on your own Pods with something like this to see the NFS options in use:
$ kubectl exec nfs-web-<XXXXX> -c web -- mount|grep nfs
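If you want to check this across many Pods, the mount output can be parsed instead of read by eye. This is a rough sketch that assumes the usual Linux mount line layout ("... type nfs4 (opt1,opt2,...)"); nfs_mount_mode is a hypothetical helper and the sample line is made up, not captured from a real cluster.

```python
import re


def nfs_mount_mode(mount_line: str):
    """Return 'hard', 'soft', or None for one line of `mount` output."""
    if " type nfs" not in mount_line:
        return None  # not an NFS mount (matches both nfs and nfs4)
    match = re.search(r"\(([^)]*)\)\s*$", mount_line)
    if not match:
        return None  # no option list found at the end of the line
    options = match.group(1).split(",")
    if "soft" in options:
        return "soft"
    if "hard" in options:
        return "hard"
    return None


if __name__ == "__main__":
    # Hypothetical line of the kind `mount | grep nfs` prints.
    sample = (
        "nfs-server.default.kube.local:/ on /usr/share/nginx/html "
        "type nfs4 (rw,relatime,vers=4.1,hard,proto=tcp,timeo=600,retrans=2)"
    )
    print(nfs_mount_mode(sample))
```

Feeding it the lines from the kubectl exec command above tells you which mode each NFS mount actually ended up with.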
It's always hard-mounted by default. However, you can change it to soft by explicitly adding the following annotation to the PV definition:
volume.beta.kubernetes.io/mount-options: soft

Unable to mount Amazon Web Services (AWS) EBS Volume into Kubernetes pod

I created a volume using the following command.
aws ec2 create-volume --size 10 --region us-east-1 --availability-zone us-east-1c --volume-type gp2
Then I used the file below to create a pod that uses the volume. But when I log in to the pod, I don't see the volume. Is there something that I might be doing wrong? Did I miss a step somewhere? Thanks for any insights.
---
kind: "Pod"
apiVersion: "v1"
metadata:
  name: "nginx"
  labels:
    name: "nginx"
spec:
  containers:
    -
      name: "nginx"
      image: "nginx"
      volumeMounts:
        - mountPath: /test-ebs
          name: test-volume
  volumes:
    - name: test-volume
      # This AWS EBS volume must already exist.
      awsElasticBlockStore:
        volumeID: aws://us-east-1c/vol-8499707e
        fsType: ext4
I just stumbled across the same thing and found out after some digging, that they actually changed the volume mount syntax. Based on that knowledge I created this PR for documentation update. See https://github.com/kubernetes/kubernetes/pull/17958 for tracking that and more info, follow the link to the bug and the original change which doesn't include the doc update. (SO prevents me from posting more than two links apparently.)
If that still doesn't do the trick for you (as it does for me) it's probably because of https://stackoverflow.com/a/32960312/3212182 which will be fixed in one of the next releases I guess. At least I can't see it in the latest release notes.
Ensure that the volume is in the same availability zone as the node.
http://kubernetes.io/docs/user-guide/volumes/
If that's not the issue, are there any events in kubectl describe pod nginx?