How to solve disk pressure in Kubernetes

I have a local OpenNESS network edge cluster that uses Kubernetes for its infrastructure management.
I'm facing a disk pressure issue, due to which pods are getting evicted or ending up in a CrashLoopBackOff state.
Also, the images on the worker node went missing (they were deleted automatically).
If I check the disk usage, I see 83% being used by /dev/sda4, i.e. the overlay filesystem.
How can I solve this issue?
The attached image shows the disk usage.

Your disk usage chart shows a lot of usage on the overlay filesystem, i.e. the union filesystem used by Docker containers. This suggests that you have some large containers running. Those might have been large to start with, or they may be writing binary data to the container filesystem while running.
To get to the bottom of this, you can either have a look at your monitoring (if present), or you can SSH into the affected node and try to identify the "guilty" pod with:
du --max-depth=1 /var/lib/docker/overlay2/ | sort -n
and a subsequent du | sort -n in the biggest folder.
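If the node runs Docker as its container runtime (an assumption here), a rough sketch for mapping the largest overlay2 directory back to a running container; the layer ID is a placeholder you take from the du output above:
# Placeholder: the overlay2 directory reported as largest by du
BIG_LAYER=<layer-id-from-du-output>
# Print "<container name> <merged dir>" for every running container and match the layer
docker ps -q \
  | xargs docker inspect --format '{{.Name}} {{.GraphDriver.Data.MergedDir}}' \
  | grep "$BIG_LAYER"
With the kubelet's Docker naming scheme (k8s_<container>_<pod>_<namespace>_...), the matching container name usually already reveals the owning pod and namespace.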

Related

Kubernetes: avoid pod being evicted by DiskPressure

I am looking for best practices to avoid the eviction error Pod The node had condition: [DiskPressure].
What I'm doing is a full database export of all our views, which is massive. At some point the pod runs into the DiskPressure error and Kubernetes decides to evict and kill it.
What would be the best practice to handle this? There is 7GB of free space, which maybe is not enough. Is simply raising that the best way to go about it, or are there other mechanisms to handle this type of work?
I hope my question makes sense.
The error message Pod The node had a condition: [DiskPressure] happens when the kubelet won't admit new pods on the node, which means they won't start. Node disk pressure means that the disks attached to the node are under pressure.
You might run into node disk pressure because Kubernetes has not cleaned up unused images, or because logs are building up: if you have a long-running container with a lot of logs, they may build up enough to overload the capacity of the node disk.
Troubleshooting Node Disk Pressure:
To troubleshoot node disk pressure, you need to figure out which files are taking up the most space. You can either manually SSH into each Kubernetes node, or use a DaemonSet; you can get one from this link.
After installing it, you can start looking at the logs of the running pods by executing kubectl logs -l app=disk-checker. You will see a list of files and their sizes, which will give you greater insight into what is taking up space on your nodes.
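If the linked DaemonSet is not an option, a minimal sketch of a comparable disk-checker DaemonSet could look like the one below; this manifest is an illustrative assumption (not the one behind the link), and the image, label, and interval are arbitrary:
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: disk-checker
  labels:
    app: disk-checker
spec:
  selector:
    matchLabels:
      app: disk-checker
  template:
    metadata:
      labels:
        app: disk-checker
    spec:
      containers:
      - name: disk-checker
        image: busybox:1.36
        # Every 5 minutes, print the 20 largest top-level directories of the node's root fs
        command: ["/bin/sh", "-c",
                  "while true; do du -x -d 1 /host 2>/dev/null | sort -n | tail -n 20; sleep 300; done"]
        volumeMounts:
        - name: host-root
          mountPath: /host
          readOnly: true
      volumes:
      - name: host-root
        hostPath:
          path: /
EOF
kubectl logs -l app=disk-checker then works exactly as described above.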
Possible solutions:
If the issue is caused by necessary application data, so the files cannot simply be deleted, you will have to increase the size of the node disks to ensure that there's sufficient room for the application files.
Alternatively, you may find applications that have produced a lot of files that are no longer needed, in which case you can simply delete the unnecessary files.
Adding more for your information:
1) To avoid DiskPressure crashing the node:
DiskPressure triggers when either the node's root filesystem or its image filesystem satisfies an eviction threshold for available disk space or inodes, which in turn causes pod eviction; refer to these Node conditions.
Based on the Node conditions, you should consider adjusting your kubelet parameters --image-gc-high-threshold and --image-gc-low-threshold so that there is always enough space for normal operations (the older --low-diskspace-threshold-mb flag has been deprecated in favor of eviction thresholds), and consider provisioning more space for your nodes, depending on your requirements.
2) To reduce the DiskPressure condition
Use the kubelet command line argument:
--eviction-hard mapStringString: A set of eviction thresholds (e.g. memory.available<1Gi) that if met would trigger a pod eviction.
DEPRECATED: This parameter should be set via the config file specified by the Kubelet's --config flag. See Set Kubelet parameters via a config file for more information.
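For reference, a minimal sketch of what the config-file equivalents of these flags can look like; the field names are the actual KubeletConfiguration ones, but the values and the file path are only illustrative:
# The file is the one referenced by the kubelet's --config flag, e.g. /var/lib/kubelet/config.yaml
cat <<'EOF' > /var/lib/kubelet/config.yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
imageGCHighThresholdPercent: 80   # start image garbage collection above 80% disk usage
imageGCLowThresholdPercent: 70    # delete unused images until usage drops below 70%
evictionHard:
  nodefs.available: "10%"         # evict pods when the node filesystem has less than 10% free
  imagefs.available: "15%"        # same idea for the image filesystem
  memory.available: "1Gi"         # the memory example from the flag description above
EOF
# Restart the kubelet afterwards so the new thresholds take effect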

How to see what a k8s container is writing to ephemeral storage

One of our containers is using ephemeral storage but we don't know why. The app running in the container shouldn't be writing anything to the disk.
We set the storage limit to 20MB but the pod is still being evicted. We could increase the limit, but this seems like a band-aid fix.
We're not sure what or where this container is writing to, and I'm not sure how to check that. When a container is evicted, the only information I can see is that the container exceeded its storage limit.
Is there an efficient way to know what's being written, or is our only option to comb through the code?
Adding details to the topic.
Pods use ephemeral local storage for scratch space, caching, and logs.
Pods can be evicted due to other pods filling the local storage, after which new pods are not admitted until sufficient storage has been reclaimed.
The kubelet can provide scratch space to Pods using local ephemeral storage to mount emptyDir volumes into containers.
For container-level isolation, if a container's writable layer and log usage exceeds its storage limit, the kubelet marks the Pod for eviction.
For pod-level isolation the kubelet works out an overall Pod storage limit by summing the limits for the containers in that Pod. In this case, if the sum of the local ephemeral storage usage from all containers and also the Pod's emptyDir volumes exceeds the overall Pod storage limit, then the kubelet also marks the Pod for eviction.
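For context, the per-container and per-pod limits described above are set through the resources section using the ephemeral-storage resource; a minimal sketch, where the pod name, image, and sizes are purely illustrative:
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: ephemeral-demo              # hypothetical name, for illustration only
spec:
  containers:
  - name: app
    image: nginx:1.25
    resources:
      requests:
        ephemeral-storage: "50Mi"   # considered by the scheduler when placing the pod
      limits:
        ephemeral-storage: "200Mi"  # writable layer + logs (+ emptyDir at pod level) above this triggers eviction
EOF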
To see what files have been written since the pod started, you can run:
find / -mount -newer /proc -print
This will output a list of files modified more recently than '/proc'.
/etc/nginx/conf.d
/etc/nginx/conf.d/default.conf
/run/secrets
/run/secrets/kubernetes.io
/run/secrets/kubernetes.io/serviceaccount
/run/nginx.pid
/var/cache/nginx
/var/cache/nginx/fastcgi_temp
/var/cache/nginx/client_temp
/var/cache/nginx/uwsgi_temp
/var/cache/nginx/proxy_temp
/var/cache/nginx/scgi_temp
/dev
Also, try without the '-mount' option.
To see if any new files are being modified, you can run some variations of the following command in a Pod:
while true; do rm -f a; touch a; sleep 30; echo "monitoring..."; find / -mount -newer a -print; done
and check the file size using the du -h someDir command.
Also, as #gohm'c pointed out in his answer, you can use sidecar/ephemeral debug containers.
Read more about Local ephemeral storage here.
We're not sure what or where this container is writing to, and I'm not sure how to check that.
Try looking into the container's volumeMounts section for mounts backed by emptyDir, then add a sidecar container (e.g. busybox) to start a shell session where you can check the path. If your cluster supports ephemeral debug containers, you don't need the sidecar container.
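If ephemeral debug containers are available, a rough sketch of that approach; the pod and container names are placeholders, and the /proc/1/root trick assumes the target container's main process is PID 1 in its own PID namespace:
# Attach an ephemeral busybox container that shares the target container's process namespace
kubectl debug -it <pod-name> --image=busybox:1.36 --target=<container-name>
# From the debug shell, the target container's filesystem is reachable via its process root,
# e.g. the nginx cache directory from the listing above:
du -sh /proc/1/root/var/cache/nginx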

Prometheus + Longhorn = wrong volume size

I am not really sure, if this is a prometheus issue, or just Longhorn, or maybe a combination of the two.
Setup:
Kubernetes K3s v1.21.9+k3s1
Rancher Longhorn Storage Provider 1.2.2
Prometheus Helm Chart 32.2.1 and image: quay.io/prometheus/prometheus:v2.33.1
Problem:
Infinitely growing PV in Longhorn, even over the defined max size. Currently using 75G on a 50G volume.
Description:
I have a really small 3 node cluster with not too many deployments running. Currently only one "real" application and the rest is just kubernetes system stuff so far.
Apart from etcd, I am using all the default scraping rules.
The PV is filling up a bit more than 1 GB per day, which seems fine to me.
The problem is that, for whatever reason, the data used inside Longhorn is growing indefinitely. I have configured retention rules for the Helm chart with retention: 7d and retentionSize: 25GB, so the retentionSize should never be reached anyway.
When I log into the container's shell and do a du -sh in /prometheus, it shows ~8.7GB being used, which looks good to me as well.
The problem is that when I look at the Longhorn UI, the used space is growing all the time. The PV has existed for ~20 days now and is currently using almost 75GB of a defined maximum of 50GB. When I take a look at the Kubernetes node itself and inspect the folder that Longhorn uses to store its PV data, I see the same values of space being used as in the Longhorn UI, while inside the Prometheus container everything looks fine to me.
I hope someone has an idea what the problem could be. I have not experienced this issue with any other deployment so far; all the others actually decrease in used size when something inside the container gets deleted.

How to get iostats of a container running in a pod on Kubernetes?

Memory and CPU resources of a container can be tracked using Prometheus. But can we track the I/O of a container? Are there any metrics available?
If you are using Docker containers you can check the data with the docker stats command (as P... mentioned in the comment). Here you can find more information about this command.
If you want to check a pod's CPU/memory usage without installing any third-party tool, you can get the memory and CPU usage of the pod from its cgroup:
Open a shell in the pod: kubectl exec -it pod_name -- /bin/bash
Go to /sys/fs/cgroup/cpu; for CPU usage, run cat cpuacct.usage
Go to /sys/fs/cgroup/memory; for memory usage, run cat memory.usage_in_bytes
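For the I/O side specifically, a rough sketch along the same cgroup-based lines; this assumes cgroup v1 (as the paths above do), while cgroup v2 nodes expose the counters in io.stat instead:
# Inside the container (cgroup v1): cumulative bytes read/written per block device
cat /sys/fs/cgroup/blkio/blkio.throttle.io_service_bytes
# On cgroup v2 nodes the equivalent counters live in the unified hierarchy:
cat /sys/fs/cgroup/io.stat
# If Prometheus scrapes cAdvisor, the same data is exported as
# container_fs_reads_bytes_total / container_fs_writes_bytes_total, e.g.:
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/metrics/cadvisor" | grep container_fs_writes_bytes_total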
For more look at this similar question.
Here you can find another interesting question. You should know that containers inside pods partially share /proc with the host system, including paths about memory and CPU information.
See also this article about Memory inside Linux containers.
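A quick way to see that sharing in practice (the pod name is a placeholder): /proc/meminfo inside a container reports the host's memory, while the container's own limit lives in its cgroup:
# MemTotal here is the node's total memory, even if the container has a much smaller limit
kubectl exec <pod-name> -- head -n 3 /proc/meminfo
# The container's actual limit is in the cgroup instead (cgroup v1 path)
kubectl exec <pod-name> -- cat /sys/fs/cgroup/memory/memory.limit_in_bytes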

Is Kubernetes monitoring data garbage collected?

I recently had to disable the fluentd-elasticsearch Kubernetes addon because it ended up eating all the disk space on one of my minions which in turn prevented an important pod from starting.
I am now worried that the monitoring addon might end up eating disk space as well. Is the monitoring data (stored in influxdb) ever garbage collected or does it keep eating away at disk space? Are there other Kubernetes components that eat up disk space indefinitely?
I setup my cluster using ./cluster/kube-up.sh on AWS.
Client Version: version.Info{Major:"1", Minor:"2", GitVersion:"v1.2.4", GitCommit:"3eed1e3be6848b877ff80a93da3785d9034d0a4f", GitTreeState:"clean"}
Server Version: version.Info{Major:"1", Minor:"2", GitVersion:"v1.2.4", GitCommit:"3eed1e3be6848b877ff80a93da3785d9034d0a4f", GitTreeState:"clean"}
To answer your specific question: you should look for pods using emptyDir (kubectl get po --all-namespaces -o yaml | grep emptyDir) or hostPath.
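If jq is available on your machine, a slightly more precise variant of that check, listing only the pods that actually define emptyDir volumes (swap emptyDir for hostPath to find hostPath users):
kubectl get pods --all-namespaces -o json \
  | jq -r '.items[] | select(any(.spec.volumes[]?; has("emptyDir"))) | "\(.metadata.namespace)/\(.metadata.name)"'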
As a general policy:
If you use a PV, it should be limited by the space available on the PV. Such a PV is usually backed by a cloud provider or NFS and is mounted over the network.
If you use "emptyDir", your storage is taken out of kubelet's --root-dir. Depending on the distribution/setup, this might be an isolated partition making it impossible for a rogue app to take down the node.
If you use hostPath, you are explicitly choosing a path on the node. If you're running without enough privileges to claim sensitive portions of the filesystem and fill them with data, the node goes down.
There's work in the logging front to make this better: https://github.com/kubernetes/kubernetes/issues/17183
There is also image/container GC, which kicks in if your disk usage goes above a threshold. You should check whether the version of Kubernetes you're using has had GC issues (they will be mentioned in the release notes).