Unable to launch new pods despite resources seemingly available - kubernetes

I'm unable to launch new pods despite resources seemingly being available.
Judging from the screenshot below, there should be room for about 40 new pods.
And judging from the following screenshot, the nodes seem fairly underutilized.
However, I'm currently facing the error message below:
0/3 nodes are available: 1 Insufficient cpu, 2 node(s) had volume node affinity conflict.
And last night it was the following:
0/3 nodes are available: 1 Too many pods, 2 node(s) had volume node affinity conflict.
Most of my services require very little memory and CPU, so their resources are configured as shown below:
resources:
  limits:
    cpu: 100m
    memory: 64Mi
  requests:
    cpu: 100m
    memory: 32Mi
Why can't I deploy more pods, and how can I fix this?

Your problem is "volume node affinity conflict".
From Kubernetes Pod Warning: 1 node(s) had volume node affinity conflict:
The error "volume node affinity conflict" happens when the persistent volume claims that the pod is using are scheduled on different zones, rather than on one zone, and so the actual pod was not able to be scheduled because it cannot connect to the volume from another zone.
First, try to investigate exactly where the problem is. You can find a detailed guide here. You will need commands like:
kubectl get pv
kubectl describe pv
kubectl get pvc
kubectl describe pvc
Then you can delete the PV and PVC and recreate them so that the pod, PV and PVC all end up in the same zone.
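For illustration, what you are comparing is the zone pinned in the PV's node affinity against the zone labels on your nodes. The commands below are a rough sketch; the PV name is a placeholder, and older clusters use the failure-domain.beta.kubernetes.io/zone label instead of topology.kubernetes.io/zone:
kubectl describe pv <pv-name>                         # look at the Node Affinity / Required Terms section
kubectl get nodes -L topology.kubernetes.io/zone      # compare with each node's zone label
If the zone listed under the PV's required terms does not match the zone label of any schedulable node, you get exactly this conflict.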

volume node affinity conflict - the volume you are trying to mount is not available on any of the nodes. You can investigate this yourself, or paste your volumes section into the question for further examination.
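By "volumes section" I mean the part of the pod spec that references the claim, something like this (all names here are placeholders):
apiVersion: v1
kind: Pod
metadata:
  name: my-app                  # placeholder name
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: my-app-pvc   # placeholder claim name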

Related

Guaranteed Application pods is in Pending state when dummy guaranteed pods are Running

In my project, I need to test that Guaranteed application pods evict any dummy application pods that are running. How do I ensure that application pods always have the highest priority?
The answer provided by P.... is very good and useful. With Pod Priority and Preemption you can achieve what you are after.
However, apart from that, you can use dedicated cloud solutions. Look at the Google Cloud example:
Before priority and preemption, Kubernetes pods were scheduled purely on a first-come-first-served basis, and ran to completion (or forever, in the case of pods created by something like a Deployment or StatefulSet). This meant less important workloads could block more important, later-arriving, workloads from running—not the desired effect. Priority and preemption solves this problem.
Priority and preemption is valuable in a number of scenarios. For example, imagine you want to cap autoscaling to a maximum cluster size to control costs, or you have clusters that you can’t grow in real-time (e.g., because they are on-premises and you need to buy and install additional hardware). Or you have high-priority cloud workloads that need to scale up faster than the cluster autoscaler can add nodes. In short, priority and preemption lead to better resource utilization, lower costs and better service levels for critical applications.
Additional guides for other clouds:
IBM cloud
AWS cloud
Azure cloud
RedHat Openshift
See also this useful tutorial.
You can make use of kind: PriorityClass to give your application a higher priority than normal pods.
Eviction is based on the pod's priority, its QoS class, and its actual usage. If a pod belongs to a higher priority class, its creation will preempt BestEffort pods first, then Burstable, then Guaranteed pods.
For example: In my cluster, I have the following priority classes:
kubectl get priorityclasses.scheduling.k8s.io
NAME                      VALUE        GLOBAL-DEFAULT   AGE
k8s-cluster-critical      1000000000   false            11d
system-cluster-critical   2000000000   false            11d
system-node-critical      2000001000   false            11d
For the sake of example, I used the system-cluster-critical class. Do not do this in practice; create your own priority class. The following Pod would lead to the eviction of other pods.
---
apiVersion: v1
kind: Pod
metadata:
  name: guaranteed
spec:
  nodeName: kube-worker-1
  priorityClassName: system-cluster-critical
  containers:
    - name: app
      image: nginx
      resources:
        requests:
          cpu: 1000m
          memory: 300Mi
        limits:
          cpu: 1000m
          memory: 300Mi
In the description of the other pods, you would see the following:
Events:
  Type     Reason      Age    From     Message
  ----     ------      ----   ----     -------
  Normal   Pulling     3m19s  kubelet  Pulling image "nginx"
  Normal   Pulled      3m19s  kubelet  Successfully pulled image "nginx" in 495.693296ms
  Normal   Created     3m19s  kubelet  Created container app
  Normal   Started     3m19s  kubelet  Started container app
  Warning  Preempting  18s    kubelet  Preempted in order to admit critical Pod
  Normal   Killing     18s    kubelet  Stopping container app
Note that if there is no priority class with globalDefault set in your cluster, then pods without any priority class get a priority of zero (the minimum). So, depending on the types of applications you run, you should create and use multiple priority classes.
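A minimal sketch of a custom priority class and a pod that uses it could look like this (the class name and value are arbitrary examples, not something predefined by Kubernetes):
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: app-high-priority          # example name
value: 1000000                     # higher value = scheduled and preempts first
globalDefault: false
description: "Priority for critical application pods."
---
apiVersion: v1
kind: Pod
metadata:
  name: critical-app
spec:
  priorityClassName: app-high-priority
  containers:
    - name: app
      image: nginx
      resources:
        requests:
          cpu: 100m
          memory: 64Mi
        limits:
          cpu: 100m
          memory: 64Mi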

Pods getting scheduled irrespective of insufficient resources

Upon submitting a few jobs (say, 50) targeted at a single node, I am getting the pod status "OutOfpods" for a few of them. I have reduced the maximum number of pods on this worker node to 10, but still observe the issue above.
The kubelet configuration is otherwise the default, with no changes.
Kubernetes version: v1.22.1
Worker node:
OS: CentOS 7.9
Memory: 528 GB
CPU: 40 cores
kubectl describe pod:
Warning OutOfpods 72s kubelet Node didn't have enough resource: pods, requested: 1, used: 10, capacity: 10
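For reference, lowering the per-node pod limit is typically done through the kubelet configuration; a minimal sketch, assuming the kubelet reads a KubeletConfiguration file (the path is illustrative):
# /var/lib/kubelet/config.yaml (illustrative path)
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
maxPods: 10   # default is 110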
I have realized this is a known issue for kubelet v1.22, as confirmed here. The fix will be reflected in the next release.
A simple resolution here is to downgrade Kubernetes to v1.21.
I'm seeing this problem as well with K8s v1.22. I'm scheduling around 100 containers on one node with an extended resource called "executors" and a capacity of 300 per node; each container requests 10. The pods stay Pending for a long time, but as soon as they are assigned by the scheduler, the kubelet on the node says it is out of that resource. It's just a warning, I suppose, but it actually puts the pod into a "Failed" status, at least briefly. I have to check whether it is re-created as Pending or not.
Normal   Scheduled                40m   default-scheduler   Successfully assigned ci-smoke/userbench-4a306d7-l1all-8zv7n-3803535768 to sb-bld-srv-39
Warning  OutOfwdc.com/executors   40m   kubelet             Node didn't have enough resource: wdc.com/executors, requested: 10, used: 300, capacity: 300
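For reference, this is roughly how the extended resource is requested in each container spec; wdc.com/executors is the resource name from the setup above, everything else is a placeholder:
apiVersion: v1
kind: Pod
metadata:
  name: userbench-job   # placeholder name
spec:
  containers:
    - name: runner
      image: busybox
      resources:
        requests:
          wdc.com/executors: 10
        limits:
          wdc.com/executors: 10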

Kubernetes Pod won't start - 1 node(s) had a volume affinity conflict

I have a pod that won't start with a volume affinity conflict. This is a bare-metal cluster so it's unrelated to regions. The pod has 4 persistent volume claims which are all reporting bound so I'm assuming it's not one of those. There are 4 nodes, one of them is tainted so that the pod will not start on it, one of them is tainted specifically so that the pod WILL start on it. That's the only affinity I have set up to my knowledge. The message looks like this:
0/4 nodes are available: 1 node(s) had taint {XXXXXXX},
that the pod didn't tolerate, 1 node(s) had volume node
affinity conflict, 2 Insufficient cpu, 2 Insufficient memory.
This is what I would have expected, apart from the volume affinity conflict. There are no other affinities set other than to point it at this node. I'm really not sure why it's doing this or where to even begin; the message isn't super helpful, and it does NOT say which node or which volume the problem is with. The one thing I don't really understand is how binding works. One of the PVCs is mapped to a PV on another node; however, it is reporting as Bound, so I'm not completely certain whether that's the problem. I am using local-storage as the storage class. I'm wondering if that's the problem, but I'm fairly new to Kubernetes and I'm not sure where to look.
You have 4 nodes, but none of them is available for scheduling due to a combination of conditions. Note that each node can be affected by multiple issues, so the numbers can add up to more than your total node count. Let's address these issues one by one:
Insufficient memory: Execute kubectl describe node <node-name> to check how much allocatable memory remains. Check the requests and limits of your pods. Note that Kubernetes reserves the full amount of memory a pod requests for that pod, regardless of how much it actually uses.
Insufficient cpu: Analogous to the above.
node(s) had volume node affinity conflict: Check whether the nodeAffinity of your PersistentVolume (kubectl describe pv) matches the labels of your nodes (kubectl get nodes --show-labels), and whether the nodeSelector in your pod matches as well. Make sure you set up the Affinity and/or AntiAffinity rules correctly; more details can be found here, and see the sketch after this list.
node(s) had taint {XXXXXXX}, that the pod didn't tolerate: Use kubectl describe node to inspect taints and kubectl taint nodes <node-name> <taint-name>- to remove them. See Taints and Tolerations for more details.
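For the volume affinity point in particular: with the local-storage class every PV is pinned to a single node via node affinity, so the check boils down to comparing that node name with where the pod is allowed to run. A rough sketch (the PV and node names are placeholders):
kubectl describe pv <pv-name>        # look for Node Affinity / Required Terms, typically kubernetes.io/hostname in [<node-name>]
kubectl get nodes --show-labels      # confirm which node carries that hostname label
kubectl describe node <node-name>    # check its taints and remaining CPU and memory
If your taints and tolerations force the pod onto a node other than the one named in the PV's node affinity, you get exactly this conflict even though the PVC shows as Bound, because PVC-to-PV binding happens independently of pod scheduling.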

Node has no available volume zone in AWS EKS

Trying to create a pod, but getting the following error:
0/3 nodes are available: 1 node(s) had no available volume zone.
I tried attaching more volumes, but the error is still the same.
Warning FailedScheduling 2s (x14 over 42s) default-scheduler 0/3 nodes are available: 1 node(s) had no available volume zone, 2 node(s) didn't have free ports for the requested pod ports.
My problem was that the AWS EC2 Volume and the Kubernetes PersistentVolume (PV) state had somehow got out of sync / corrupted. Kubernetes believed there was a bound PV, while the EC2 Volume showed as "available", i.e. not mounted to a worker node.
Update: The volume was in a different availability zone than either of the 2 EC2 nodes and thus could not be attached to them.
The solution was to delete all relevant resources - StatefulSet, PVC (crucial!), PV. Then I was able to apply them again and Kubernetes succeeded in creating a new EC2 Volume and attaching it to the instance.
As you can see in my configuration, I have a StatefulSet with a volumeClaimTemplate (=> PersistentVolumeClaim, PVC) and a matching StorageClass definition, so Kubernetes should dynamically provision an EC2 Volume, attach it to a worker, and expose it as a PersistentVolume.
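A minimal sketch of that shape, with placeholder names, sizes and the legacy in-tree EBS provisioner (not my exact manifests):
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-gp2                  # illustrative name
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
# volumeBindingMode: WaitForFirstConsumer   # optional: delays provisioning until the pod is
#                                            # scheduled, which helps avoid cross-zone mismatches
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-app
spec:
  serviceName: my-app
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: app
          image: nginx
          volumeMounts:
            - name: data
              mountPath: /data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: ebs-gp2
        resources:
          requests:
            storage: 10Gi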
See kubectl get pvc, kubectl get pv and in the AWS Console - EC2 - Volumes.
NOTE: "Bound" = the PV is bound to the PVC.
Here is a description of a laborious way to restore a StatefulSet on AWS if you have a snapshot of the EBS volume (5/2018): https://medium.com/@joatmon08/kubernetes-statefulset-recovery-from-aws-snapshots-8a6159cda6f1

How to find out the minimum and maximum usable CPU and memory space left on a kubernetes node

I'm trying to deploy Magento on a GCE n1-standard-1 machine, but I keep getting the following error message.
pod (magento-magento-1486272877-zd34d) failed to fit in any node fit failure summary on nodes : Insufficient cpu (1)
I'm using the official Magento helm chart, and I've configured the values.yml file to contain very low CPU requests: cpu: 25m
When I look at the node details on the Kubernetes dashboard, I see that the CPU is already at 0.728 (72.80%) while the node isn't doing anything besides running the system containers. Also see the image below:
Does this mean I have 1 - 0.728 = 0.272 cores (272m) left for container requests? Then why is Kubernetes still telling me it has insufficient CPU when I'm only requesting 25m?
Thanks for your help.
I didn't see that the CPU limits were already at 0.248 according to the picture in my post, so I set cpu: 20m and it worked.
There is a nifty kubectl command to get information about your nodes' resources:
kubectl top nodes
And for pods:
kubectl top pods
And for per-container usage within pods:
kubectl top pods --containers=true
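Note that kubectl top reports actual usage, while the scheduler compares requests against the node's allocatable capacity. To see how much is still requestable, check the Allocated resources section of kubectl describe node; the figures below are purely illustrative, not from the cluster in the question:
kubectl describe node <node-name>
...
Allocatable:
  cpu:     940m
  memory:  2701Mi
...
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource  Requests      Limits
  --------  --------      ------
  cpu       728m (77%)    248m (26%)
  memory    1100Mi (40%)  1600Mi (59%)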