I'm constantly running into NodeUnderDiskPressure for pods running in Minikube. Checking df -h via minikube ssh, I'm using at most 50% on all of my mounts. In fact, one is at 50% and the other five are under 10%.
$ df -h
Filesystem Size Used Avail Use% Mounted on
rootfs 7.3G 503M 6.8G 7% /
devtmpfs 7.3G 0 7.3G 0% /dev
tmpfs 7.4G 0 7.4G 0% /dev/shm
tmpfs 7.4G 9.2M 7.4G 1% /run
tmpfs 7.4G 0 7.4G 0% /sys/fs/cgroup
/dev/sda1 17G 7.5G 7.8G 50% /mnt/sda1
$ df -ih
Filesystem Inodes IUsed IFree IUse% Mounted on
rootfs 1.9M 4.1K 1.9M 1% /
devtmpfs 1.9M 324 1.9M 1% /dev
tmpfs 1.9M 1 1.9M 1% /dev/shm
tmpfs 1.9M 657 1.9M 1% /run
tmpfs 1.9M 14 1.9M 1% /sys/fs/cgroup
/dev/sda1 9.3M 757K 8.6M 8% /mnt/sda1
The problem usually just goes away after 1-5 minutes. Strangely, restarting Minikube doesn't seem to speed up this process. I've tried removing all evicted pods but, again, disk usage doesn't actually look very high.
The docker images I'm using are just under 2GB and I'm trying to spin up just a few of them, so that should still leave me with plenty of headroom.
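For what it's worth, here's roughly how I'm checking what Docker itself reports using inside the VM (assuming the bundled Docker is new enough to have docker system df):
$ minikube ssh
# Space used by images, containers and local volumes, as Docker sees it
$ docker system df
# Verbose per-image / per-container breakdown
$ docker system df -v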
Here's some kubectl describe output:
$ kubectl describe po/consumer-lag-reporter-3832025036-wlfnt
Name: consumer-lag-reporter-3832025036-wlfnt
Namespace: default
Node: <none>
Labels: app=consumer-lag-reporter
pod-template-hash=3832025036
tier=monitor
type=monitor
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"default","name":"consumer-lag-reporter-3832025036","uid":"342b0f72-9d12-11e8-a735...
Status: Pending
IP:
Created By: ReplicaSet/consumer-lag-reporter-3832025036
Controlled By: ReplicaSet/consumer-lag-reporter-3832025036
Containers:
consumer-lag-reporter:
Image: avery-image:latest
Port: <none>
Command:
/bin/bash
-c
Args:
newrelic-admin run-program python manage.py lag_reporter_runner --settings-module project.settings
Environment Variables from:
local-config ConfigMap Optional: false
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-sjprm (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
default-token-sjprm:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-sjprm
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 15s (x7 over 46s) default-scheduler No nodes are available that match all of the following predicates:: NodeUnderDiskPressure (1).
Is this a bug? Anything else I can do to debug this?
I tried:
Cleaning up evicted pods (with kubectl get pods -a)
Cleaning up unused images (with minikube ssh + docker images)
Cleaning up all non-running containers (with minikube ssh + docker ps -a)
The disk usage remained low, as shown in my question. In the end I simply recreated the minikube cluster with the --disk-size flag, and that solved my problem. The key thing to note is that even though df showed I was barely using any disk, making the disk even bigger still helped.
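For the record, the recreate looked roughly like this (the 50g size is just an example; --disk-size is a standard minikube start flag):
# Tear down the old VM and recreate it with a larger virtual disk
$ minikube delete
$ minikube start --disk-size=50g

# Sanity check: the minikube node should no longer report DiskPressure
$ kubectl describe node minikube | grep -A 8 Conditions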
Related
I am trying to understand the master/node deployment concept on labs.play-with-k8s.com (https://labs.play-with-k8s.com/).
I have two nodes and one master.
The setup has the following configuration:
node1 ~]$ kubectl describe pod myapp-7f4dffc449-qh7pk
Name: myapp-7f4dffc449-qh7pk
Namespace: default
Priority: 0
Node: node3/192.168.0.16
Start Time: Tue, 07 Feb 2023 12:31:23 +0000
Labels: app=myapp
pod-template-hash=7f4dffc449
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/myapp-7f4dffc449
Containers:
myapp:
Container ID:
Image: changan1111/newdocker:latest
Image ID:
Port: 3000/TCP
Host Port: 0/TCP
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Limits:
cpu: 500m
ephemeral-storage: 1Gi
memory: 1Gi
Requests:
cpu: 500m
ephemeral-storage: 1Gi
memory: 1Gi
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-t4nf7 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
default-token-t4nf7:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-t4nf7
Optional: false
QoS Class: Guaranteed
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 34s default-scheduler Successfully assigned default/myapp-7f4dffc449-qh7pk to node3
Normal Pulling 31s kubelet Pulling image "changan1111/newdocker:latest"
Warning Evicted 25s kubelet The node was low on resource: ephemeral-storage.
Warning ExceededGracePeriod 15s kubelet Container runtime did not kill the pod within specified grace period.
My yaml file is here: https://raw.githubusercontent.com/changan1111/UserManagement/main/kube/kube.yaml
I can't see anything wrong with it, but I'm still seeing "The node was low on resource: ephemeral-storage".
How to resolve this?
Disk Usage:
overlay 10G 130M 9.9G 2% /
tmpfs 64M 0 64M 0% /dev
tmpfs 16G 0 16G 0% /sys/fs/cgroup
/dev/sdb 64G 29G 36G 44% /etc/hosts
shm 64M 0 64M 0% /dev/shm
shm 64M 0 64M 0% /var/lib/docker/containers/403c120b0dd0909bd34e66d86c58fba18cd71468269e1aaa66e3244d331c3a1e/mounts/shm
shm 64M 0 64M 0% /var/lib/docker/containers/56dd63dad42dd26baba8610f70f1a0bd22fdaea36742c32deca3c196ce181851/mounts/shm
shm 64M 0 64M 0% /var/lib/docker/containers/50c4585ae8cc63de9077c1a58da67cc348c86a6643ca21a06b8998f94a2a2daf/mounts/shm
shm 64M 0 64M 0% /var/lib/docker/containers/6e9529ad6e6a836e77b17c713679abddf861fdc0e86946484dc2ec68a00ca2ff/mounts/shm
tmpfs 16G 12K 16G 1% /var/lib/kubelet/pods/8e56095e-b0ec-4f13-a022-d29d04897410/volumes/kubernetes.io~secret/kube-proxy-token-j7sl8
shm 64M 0 64M 0% /var/lib/docker/containers/2b84d6dfebd4ea0c379588985cd43b623004632e71d63d07a39d521ddf694e8e/mounts/shm
tmpfs 16G 12K 16G 1% /var/lib/kubelet/pods/1271ca18-97d0-48d2-9280-68eb8c57795f/volumes/kubernetes.io~secret/kube-router-token-rmpqv
shm 64M 0 64M 0% /var/lib/docker/containers/c4506095bf36356790795353862fc13b759d72af8edc0e4233341f2d3234fa02/mounts/shm
tmpfs 16G 12K 16G 1% /var/lib/kubelet/pods/39885a73-d724-4be8-a9cf-3de8756c5b0c/volumes/kubernetes.io~secret/coredns-token-ckxbw
tmpfs 16G 12K 16G 1% /var/lib/kubelet/pods/8f137411-3af6-4e44-8be4-3e4f79570531/volumes/kubernetes.io~secret/coredns-token-ckxbw
shm 64M 0 64M 0% /var/lib/docker/containers/c32431f8e77652686f58e91aff01d211a5e0fb798f664ba675715005ee2cd5b0/mounts/shm
shm 64M 0 64M 0% /var/lib/docker/containers/3e284dd5f9b321301647eeb42f9dd82e81eb78aadcf9db7b5a6a3419504aa0e9/mount
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 3m16s default-scheduler Successfully assigned default/myapp-b5856bb-4znkj to node4
Normal Pulling 3m15s kubelet Pulling image "changan1111/newdocker:latest"
Normal Pulled 83s kubelet Successfully pulled image "changan1111/newdocker:latest" in 1m51.97169753s
Normal Created 28s kubelet Created container myapp
Normal Started 27s kubelet Started container myapp
Warning Evicted 1s kubelet Pod ephemeral local storage usage exceeds the total limit of containers 500Mi.
Normal Killing 1s kubelet Stopping container myapp
YAML:
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp
spec:
replicas: 2
selector:
matchLabels:
app: myapp
template:
metadata:
labels:
app: myapp
spec:
imagePullSecrets:
- name: dockercreds
containers:
- name: myapp
image: changan1111/newdocker:latest
resources:
limits:
memory: "2Gi"
cpu: "500m"
ephemeral-storage: "2Gi"
requests:
ephemeral-storage: "1Gi"
cpu: "500m"
memory: "1Gi"
ports:
- containerPort: 3000
---
apiVersion: v1
kind: Service
metadata:
name: myapp
spec:
selector:
app: myapp
ports:
- protocol: TCP
port: 80
targetPort: 3000
nodePort: 31110
type: LoadBalancer
Worker nodes may be running out of disk space, in which case you would see something like "no space left on device" or "The node was low on resource: ephemeral-storage".
The mitigation is to provision a larger disk for the node VMs when the cluster is created.
Pod eviction and scheduling problems are side effects of Kubernetes limits and requests, usually caused by a lack of planning. See Understanding Kubernetes pod evicted and scheduling problems for more information.
Refer to the similar SO question on how to set a quota (limits.ephemeral-storage, requests.ephemeral-storage) to cap this; otherwise any container can write any amount of data to its node's filesystem.
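As a rough illustration (the quota name and sizes here are made up; adjust them to your namespace), such a quota looks something like:
# Namespace-wide cap on ephemeral-storage requests/limits (sketch)
kubectl apply -f - <<EOF
apiVersion: v1
kind: ResourceQuota
metadata:
  name: ephemeral-storage-quota
  namespace: default
spec:
  hard:
    requests.ephemeral-storage: 2Gi
    limits.ephemeral-storage: 4Gi
EOF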
Warning: Pod ephemeral local storage usage exceeds the total limit of containers 500Mi.
It may be because you're putting an upper limit on ephemeral-storage usage by setting resources.limits.ephemeral-storage to 500Mi. Try removing limits.ephemeral-storage if that is safe, or change the value to suit your requirements.
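If raising the limit is the way you want to go, here is a quick sketch (the 4Gi value is a placeholder; size it to what the app actually writes):
# Bump the first container's ephemeral-storage limit in place and watch the rollout
kubectl patch deployment myapp --type=json -p='[
  {"op": "replace",
   "path": "/spec/template/spec/containers/0/resources/limits/ephemeral-storage",
   "value": "4Gi"}
]'
kubectl rollout status deployment/myapp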
Also see How to determine kubernetes pod ephemeral storage request and limit and how to Avoid running out of ephemeral storage space on your Kubernetes worker Nodes for more information.
I am trying to install MinIO storage on my local Kubernetes cluster.
I am following the link below, but I am facing a "no memory" error with every type of install.
I am not sure how to set up the PersistentVolume in my case.
https://github.com/minio/operator/blob/master/README.md
I am trying to create a persistent volume so that enough memory will be available on the path I am selecting:
cat pv.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
kubectl create -f pv.yaml
kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
hostpath (default) docker.io/hostpath Delete Immediate false 131m
local-storage kubernetes.io/no-provisioner Delete WaitForFirstConsumer false 56m
apiVersion: v1
kind: PersistentVolume
metadata:
name: pv-node
spec:
capacity:
storage: 10Gi
volumeMode: Filesystem
accessModes:
- ReadWriteOnce
persistentVolumeReclaimPolicy: Retain
storage-class: local-storage
local:
path: /mnt/d/minio
nodeAffinity:
required:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- docker-desktop
kubectl create -f pvc.yaml
error: error parsing pvc.yaml: error converting YAML to JSON: yaml: line 8: mapping values are not allowed in this context
:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
docker-desktop Ready control-plane,master 126m v1.21.2
:~$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-558bd4d5db-j72z4 1/1 Running 1 128m
kube-system coredns-558bd4d5db-vw98z 1/1 Running 1 128m
kube-system etcd-docker-desktop 1/1 Running 1 128m
kube-system kube-apiserver-docker-desktop 1/1 Running 1 128m
kube-system kube-controller-manager-docker-desktop 1/1 Running 1 128m
kube-system kube-proxy-tqfnr 1/1 Running 1 128m
kube-system kube-scheduler-docker-desktop 1/1 Running 1 128m
kube-system storage-provisioner 1/1 Running 2 127m
kube-system vpnkit-controller 1/1 Running 12 127m
minio-operator console-6b6cf8946c-vxcqh 1/1 Running 0 76m
minio-operator minio-operator-69fd675557-s62nl 1/1 Running 0 76m
:/$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sdb 251G 1.9G 237G 1% /
tmpfs 6.2G 401M 5.8G 7% /mnt/wsl
tools 477G 69G 409G 15% /init
none 6.1G 0 6.1G 0% /dev
none 6.2G 12K 6.2G 1% /run
none 6.2G 0 6.2G 0% /run/lock
none 6.2G 0 6.2G 0% /run/shm
none 6.2G 0 6.2G 0% /run/user
tmpfs 6.2G 0 6.2G 0% /sys/fs/cgroup
C:\ 477G 69G 409G 15% /mnt/c
D:\ 932G 132M 932G 1% /mnt/d
/dev/sdd 251G 2.7G 236G 2% /mnt/wsl/docker-desktop-data/isocache
none 6.2G 12K 6.2G 1% /mnt/wsl/docker-desktop/shared-sockets/host-services
/dev/sdc 251G 132M 239G 1% /mnt/wsl/docker-desktop/docker-desktop-proxy
/dev/loop0 396M 396M 0 100% /mnt/wsl/docker-desktop/cli-tools
I believe creating a persistent volume, using it in a namespace, and using that namespace while creating a tenant should solve this issue. But I am stuck with the error of no memory available.
As per the code:
if (memReqSize < minMemReq) {
return {
error: "The requested memory size must be greater than 2Gi",
request: 0,
limit: 0,
};
}
You need 2GB of RAM per node. Since you have 4 nodes, you need 8 GB of RAM for MinIO alone. It's likely that you don't have enough RAM to run this.
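To check how much memory the nodes can actually offer, compare each node's allocatable memory with what is already requested (plain kubectl, nothing MinIO-specific):
# Allocatable capacity per node (cpu, memory, ephemeral-storage, ...)
kubectl describe nodes | grep -A 6 "Allocatable:"
# What is already requested/limited by running pods on each node
kubectl describe nodes | grep -A 8 "Allocated resources:"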
I am trying to debug the storage usage in my Kubernetes pod. The pod is being evicted because of disk pressure. When I log in to the running pod, I see the following:
Filesystem Size Used Avail Use% Mounted on
overlay 30G 21G 8.8G 70% /
tmpfs 64M 0 64M 0% /dev
tmpfs 14G 0 14G 0% /sys/fs/cgroup
/dev/sda1 30G 21G 8.8G 70% /etc/hosts
shm 64M 0 64M 0% /dev/shm
tmpfs 14G 12K 14G 1% /run/secrets/kubernetes.io/serviceaccount
tmpfs 14G 0 14G 0% /proc/acpi
tmpfs 14G 0 14G 0% /proc/scsi
tmpfs 14G 0 14G 0% /sys/firmware
root@deploy-9f45856c7-wx9hj:/# du -sh /
du: cannot access '/proc/1142/task/1142/fd/3': No such file or directory
du: cannot access '/proc/1142/task/1142/fdinfo/3': No such file or directory
du: cannot access '/proc/1142/fd/4': No such file or directory
du: cannot access '/proc/1142/fdinfo/4': No such file or directory
227M /
root@deploy-9f45856c7-wx9hj:/# du -sh /tmp
11M /tmp
root@deploy-9f45856c7-wx9hj:/# du -sh /dev
0 /dev
root@deploy-9f45856c7-wx9hj:/# du -sh /sys
0 /sys
root@deploy-9f45856c7-wx9hj:/# du -sh /etc
1.5M /etc
root@deploy-9f45856c7-wx9hj:/#
As we can see, 21G is consumed, but when I run du -sh it only reports 227M. I would like to find out which directory is consuming the space.
According to the docs Node Conditions, DiskPressure has to do with conditions on the node causing kubelet to evict the pod. It doesn't necessarily mean it's the pod that caused the conditions.
DiskPressure
Available disk space and inodes on either the node’s root filesystem
or image filesystem has satisfied an eviction threshold
You may want to investigate what's happening on the node instead.
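For example (a sketch; exact paths depend on your container runtime and distro), you could look at the node's own view of disk and inodes and at the runtime's data directory:
# Node conditions as the kubelet reports them
$ kubectl describe node <node-name> | grep -A 8 Conditions

# On the node itself: overall space and inode usage
$ df -h && df -ih

# Largest directories under the runtime's data root (commonly /var/lib/docker
# for Docker or /var/lib/containerd for containerd)
$ sudo du -xh --max-depth=1 /var/lib/docker | sort -h | tail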
It looks like process 1142 is still running and holding file descriptors, and perhaps some disk space (you may have other processes with file descriptors that are not being released, too). Is it the kubelet? To alleviate the problem you can verify that it's running and then kill it:
$ ps -Af | grep 1142
$ kill -9 1142
P.S. You would need to provide more information about the processes and what's running on that node.
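A common reason for df and du disagreeing like this is files that have been deleted but are still held open by a process; assuming lsof is available on the node, a quick check is:
# Open file descriptors pointing at deleted files (largest sizes last)
$ sudo lsof +L1 | sort -k 7 -n | tail

# Or inspect the specific PID mentioned above
$ sudo ls -l /proc/1142/fd | grep deleted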
I have created additional volume on my server.
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 19G 3.4G 15G 19% /
devtmpfs 874M 0 874M 0% /dev
tmpfs 896M 0 896M 0% /dev/shm
tmpfs 896M 17M 879M 2% /run
tmpfs 896M 0 896M 0% /sys/fs/cgroup
tmpfs 180M 0 180M 0% /run/user/0
/dev/sdb 25G 44M 24G 1% /mnt/HC_Volume_1788024
How can I attach /dev/sdb either to the whole server (I mean merge it with /dev/sda1) or assign it to a specific directory on the server, such as /var/lib, without overwriting the current contents of /var/lib?
You will not be able to "merge" them, as you are using plain sdX devices rather than something like LVM for your filesystems.
As root, you can manually run:
mount /dev/sdb /var/lib/
The original content of /var/lib will still be there (hidden by the new mount, and still taking up space on your / filesystem).
To make permanent, (carefully) edit your /etc/fstab and add a line like:
/dev/sdb /var/lib FILESYSTEM_OF_YOUR_SDB_DISK defaults 0 0
You will need to replace "FILESYSTEM_OF_YOUR_SDB_DISK" with the correct filesystem type ("df -T", "blkid" or "lsblk -f" will show the type)
You should test the correctness of your /etc/fstab entry. First run:
umount /var/lib (if you previously mounted it)
Then run:
mount -a
df -T
You should see the mount point again, and mount -a should not have produced any errors.
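Putting the steps above together (a sketch; check the real filesystem type with lsblk -f first, ext4 below is just an example):
# 1. Confirm the filesystem on the new volume
lsblk -f /dev/sdb

# 2. Mount it over /var/lib (files already on / stay there, hidden by the mount)
mount /dev/sdb /var/lib

# 3. Persist it with one /etc/fstab line, e.g. for ext4:
#    /dev/sdb  /var/lib  ext4  defaults  0 0

# 4. Re-test the fstab entry
umount /var/lib
mount -a
df -T /var/lib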
I am trying to throttle the disk I/O of a Docker container using the blkio controller (without destroying the container), but I am unsure how to find out which device to run the throttling on.
The Docker container is running Mongo. Running df -h inside a bash shell in the container gives the following:
root@82e7bdc56db0:/# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/docker-202:1-524400-916a3c171357c4f0349e0145e34e7faf60720c66f9a68badcc09d05397190c64 10G 379M 9.7G 4% /
tmpfs 1.9G 0 1.9G 0% /dev
tmpfs 1.9G 0 1.9G 0% /sys/fs/cgroup
/dev/xvda1 32G 3.2G 27G 11% /data/db
shm 64M 0 64M 0% /dev/shm
Is there a way to find out which device to limit on the host machine? Thanks!
$ docker info
Containers: 9
Running: 9
Paused: 0
Stopped: 0
Images: 6
Server Version: 1.12.1
Storage Driver: devicemapper
Pool Name: docker-202:1-524400-pool
Pool Blocksize: 65.54 kB
Base Device Size: 10.74 GB
Backing Filesystem: xfs
Data file: /dev/loop0
Metadata file: /dev/loop1
Data Space Used: 1.694 GB
Data Space Total: 107.4 GB
Data Space Available: 30.31 GB
Metadata Space Used: 3.994 MB
Metadata Space Total: 2.147 GB
Metadata Space Available: 2.143 GB
Thin Pool Minimum Free Space: 10.74 GB
Udev Sync Supported: true
Deferred Removal Enabled: false
Deferred Deletion Enabled: false
Deferred Deleted Device Count: 0
Data loop file: /var/lib/docker/devicemapper/devicemapper/data
WARNING: Usage of loopback devices is strongly discouraged for production use. Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.
Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
Library Version: 1.02.110 (2015-10-30)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: null host overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: apparmor seccomp
Kernel Version: 4.4.0-38-generic
Operating System: Ubuntu 16.04.1 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 3.675 GiB
Name: ip-172-31-6-72
ID: 4RCS:IMKM:A5ZT:H5IA:6B4B:M3IG:XGWK:2223:UAZX:GHNA:FUST:E5XC
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Insecure Registries:
127.0.0.0/8
df -h on the host machine:
Filesystem Size Used Avail Use% Mounted on
udev 1.9G 0 1.9G 0% /dev
tmpfs 377M 6.1M 371M 2% /run
/dev/xvda1 32G 3.2G 27G 11% /
tmpfs 1.9G 0 1.9G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 1.9G 0 1.9G 0% /sys/fs/cgroup
tmpfs 377M 0 377M 0% /run/user/1001
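A sketch of how one might map that devicemapper root back to a host block device with standard tools (device names come from the output above; adjust for your host):
# Block-device tree with major:minor numbers; the "202:1" embedded in the pool
# name docker-202:1-524400-pool should line up with one of these (here /dev/xvda1)
$ lsblk -o NAME,MAJ:MIN,TYPE,MOUNTPOINT

# Devicemapper view: the container's thin device sits on the docker pool, whose
# data/metadata are loopback files under /var/lib/docker on the root filesystem
$ sudo dmsetup ls --tree

# /data/db is bind-mounted from the host, so Mongo's writes also land on the
# device backing / (again /dev/xvda1 here)
$ docker inspect --format '{{json .Mounts}}' <container-id>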