Pods don't run, insufficient resources - Kubernetes

I get this error when I try to deploy a deployment with ten replicas:
0/2 nodes are available: 1 Insufficient memory, 1 node(s) had taints that the pod didn't tolerate.
I don't understand why it says two nodes. Isn't it the same node with just the same problem?
I have a lot of RAM (1GB) free.
How can I fix this error without adding another node?
I have this in the deployment YAML file for resources:
limits:
  cpu: 1000m
  memory: 1000Mi
requests:
  cpu: 100m
  memory: 200Mi
Server:
  Master:
    CPU: 2
    RAM: 2 GB (1 GB free)
  Slave:
    CPU: 2
    RAM: 2 GB (1 GB free)

I think you have multiple issues here.
First, the format of the error message you get:
0/2 nodes are available: 1 Insufficient memory, 1 node(s) had taints that the pod didn't tolerate.
The first part is clear: you have 2 nodes in total and could not schedule to any of them. Then comes a list of conditions which prevented scheduling on each node. One node can be affected by multiple issues, for example low memory and insufficient CPU, so the numbers can add up to more than your total number of nodes.
The second issue is that the requests you write into your YAML file apply per replica. If you instantiate the same pod with a 100Mi memory request 5 times, the pods need 500Mi in total. You want to run 10 pods which each request 200Mi of memory, so you need 2000Mi of free memory.
Your error message already implies that there is not enough memory on one node. I would recommend inspecting both nodes via kubectl describe node <node-name> to find out how much free memory Kubernetes "sees" there. Kubernetes always blocks the full amount of memory a pod requests, regardless of how much the pod actually uses.
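A quick way to see that on each node (a sketch; the grep just trims the describe output and assumes the usual section heading) is:
kubectl describe node <node-name> | grep -A 8 "Allocated resources"
The Requests column shows how much memory is already blocked by existing pods, independent of what they actually use.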
The taint in your error message tells you that the other node, possibly the master, has a taint which is not tolerated by the deployment. For more about taints and tolerations see the documentation. In short, find out which taint on the node prevents the scheduling and remove it via kubectl taint nodes <node-name> <taint-name>-.
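For example, a master set up with kubeadm typically carries the node-role.kubernetes.io/master taint (an assumption about your cluster, so check first):
kubectl describe node <node-name> | grep Taints
kubectl taint nodes <node-name> node-role.kubernetes.io/master-
Keep in mind that removing the master taint means regular workloads can then be scheduled onto the master.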

Kubernetes ephemeral-storage of containers

Kubernetes has the concept of ephemeral-storage, which can be applied to a container in a deployment like this:
limits:
  cpu: 500m
  memory: 512Mi
  ephemeral-storage: 100Mi
requests:
  cpu: 50m
  memory: 256Mi
  ephemeral-storage: 50Mi
Now, when applying this to a k8s 1.18 cluster (IBM Cloud managed k8s), I cannot see any changes when I look at a running container:
kubectl exec -it <pod> -n <namespace> -c nginx -- /bin/df
I would expect to see changes there. Am I wrong?
You can see the allocated resources by using kubectl describe node <insert-node-name-here> on the node that is running the pod of the deployment.
You should see something like this:
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                       Requests      Limits
  --------                       --------      ------
  cpu                            1130m (59%)   3750m (197%)
  memory                         4836Mi (90%)  7988Mi (148%)
  ephemeral-storage              0 (0%)        0 (0%)
  hugepages-1Gi                  0 (0%)        0 (0%)
  hugepages-2Mi                  0 (0%)        0 (0%)
  attachable-volumes-azure-disk  0             0
When you requested 50Mi of ephemeral-storage it should show up under Requests.
When your pod tries to use more than the limit (100Mi) the pod will be evicted and restarted.
On the node side, any pod that uses more than its requested resources is subject to eviction when the node runs out of resources. In other words, Kubernetes never provides any guarantees of availability of resources beyond a Pod's requests.
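To see the eviction behaviour in action, a throwaway pod along these lines (the name, image and file path are placeholders I chose, not anything from the question) writes more than its 100Mi limit into the container's writable layer and should be evicted by the kubelet shortly afterwards:
apiVersion: v1
kind: Pod
metadata:
  name: ephemeral-storage-demo
spec:
  containers:
  - name: writer
    image: busybox
    # write ~200Mi, well above the 100Mi ephemeral-storage limit below
    command: ["sh", "-c", "dd if=/dev/zero of=/fill.img bs=1M count=200 && sleep 3600"]
    resources:
      requests:
        ephemeral-storage: 50Mi
      limits:
        ephemeral-storage: 100Mi
kubectl describe pod ephemeral-storage-demo should then show the pod Evicted with a message about ephemeral-storage usage.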
In the Kubernetes documentation you can find more details on how ephemeral storage consumption management works here.
Note that using kubectl exec with the df command might not show the actual use of storage.
According to kubernetes documentation:
The kubelet can measure how much local storage it is using. It does this provided that:
- the LocalStorageCapacityIsolation feature gate is enabled (the feature is on by default), and
- you have set up the node using one of the supported configurations for local ephemeral storage.
If you have a different configuration, then the kubelet does not apply resource limits for ephemeral local storage.
Note: The kubelet tracks tmpfs emptyDir volumes as container memory use, rather than as local ephemeral storage.
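If you want to see what the kubelet itself measures, one option (assuming you have permission to reach the node proxy endpoint; this is my suggestion, not part of the answer above) is the kubelet's stats summary exposed through the API server:
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/stats/summary"
The returned JSON should contain a per-pod ephemeral-storage block with usedBytes, which is the number the kubelet compares against your limit.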

Pod in status Pending but autoscaling is enabled, why doesn't it work?

Has anyone run into this behavior before?
I have a GKE cluster with 5 nodes and autoscaling enabled, as you can see below:
autoscaling:
  enabled: true
  maxNodeCount: 9
  minNodeCount: 1
config:
  diskSizeGb: 100
  diskType: pd-standard
  imageType: COS
  machineType: n1-standard-1
  oauthScopes:
  - https://www.googleapis.com/auth/devstorage.read_only
  - https://www.googleapis.com/auth/logging.write
  - https://www.googleapis.com/auth/monitoring
  - https://www.googleapis.com/auth/servicecontrol
  - https://www.googleapis.com/auth/service.management.readonly
  - https://www.googleapis.com/auth/trace.append
  serviceAccount: default
initialNodeCount: 1
instanceGroupUrls:
- xxx
management:
  autoRepair: true
  autoUpgrade: true
name: default-pool
podIpv4CidrSize: 24
selfLink: xxxx
status: RUNNING
version: 1.13.7-gke.8
However, when I try to deploy a service I receive this error:
Warning FailedScheduling 106s default-scheduler 0/5 nodes are available: 3 Insufficient cpu, 4 node(s) didn't match node selector.
Warning FailedScheduling 30s (x3 over 106s) default-scheduler 0/5 nodes are available: 4 node(s) didn't match node selector, 5 Insufficient cpu.
Normal NotTriggerScaleUp 0s (x11 over 104s) cluster-autoscaler pod didn't trigger scale-up (it wouldn't fit if a new node is added): 1 node(s) didn't match node selector
And if I look at the stats of my resources, I don't see a problem with CPU, right?
kubectl top node
NAME                                           CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
gke-pre-cluster-1-default-pool-17d2178b-4g9f   106m         11%    1871Mi          70%
gke-pre-cluster-1-default-pool-17d2178b-g8l1   209m         22%    3042Mi          115%
gke-pre-cluster-1-default-pool-17d2178b-grvg   167m         17%    2661Mi          100%
gke-pre-cluster-1-default-pool-17d2178b-l9gt   122m         12%    2564Mi          97%
gke-pre-cluster-1-default-pool-17d2178b-ppfw   159m         16%    2830Mi          107%
So... given this message, it seems the problem is not CPU, right?
And the other thing is... if there is a problem with resources, why doesn't the cluster scale up automatically?
Has anyone run into this before who can explain it to me? I don't understand.
Thank you so much
GKE's autoscaling functionality is based on Compute Engine instance groups. As such, it only pays attention to actual, dynamic resource usage (CPU, memory, etc), and not to the requests section in a Kubernetes pod template.
A GKE node that has 100% of its resources allocated (and therefore cannot schedule any more pods) is considered idle by the autoscaler if the software running in those pods isn't actually using the resources. If the software running in those pods is waiting for a "Pending" pod to start, then your workload is deadlocked.
Unfortunately, I know of no solution to this problem. If you control the pod templates that are being used to start the pods, you can try asking for less memory/CPU than your jobs actually need. But that might result in pods getting evicted.
GKE's autoscaler isn't particularly smart.
Could you check if you have the entry "ZONE_RESOURCE_POOL_EXHAUSTED" in Stackdriver logging?
It is probable that the zone you are using for your Kubernetes cluster is having problems.
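One way to check (a sketch; the filter is just a plain-text match and assumes the Cloud SDK is pointed at the cluster's project):
gcloud logging read '"ZONE_RESOURCE_POOL_EXHAUSTED"' --limit 10
If entries show up, the zone has temporarily run out of the requested machine type and the autoscaler cannot add nodes there.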
Regards.

Cannot create a deployment that requests more than 2Gi memory

My deployment pod was evicted due to memory consumption:
Type     Reason   Age  From                               Message
----     ------   ---- ----                               -------
Warning  Evicted  1h   kubelet, gke-XXX-default-pool-XXX  The node was low on resource: memory. Container my-container was using 1700040Ki, which exceeds its request of 0.
Normal   Killing  1h   kubelet, gke-XXX-default-pool-XXX  Killing container with id docker://my-container:Need to kill Pod
I tried to grant it more memory by adding the following to my deployment yaml:
apiVersion: apps/v1
kind: Deployment
...
spec:
  ...
  template:
    ...
    spec:
      ...
      containers:
      - name: my-container
        image: my-container:latest
        ...
        resources:
          requests:
            memory: "3Gi"
However, it failed to deploy:
Type     Reason             Age               From                Message
----     ------             ----              ----                -------
Warning  FailedScheduling   4s (x5 over 13s)  default-scheduler   0/3 nodes are available: 3 Insufficient memory.
Normal   NotTriggerScaleUp  0s                cluster-autoscaler  pod didn't trigger scale-up (it wouldn't fit if a new node is added)
The deployment has only one container.
I'm using GKE with autoscaling, the nodes in the default (and only) pool have 3.75 GB memory.
From trial and error, I found that the maximum memory I can request is "2Gi". Why can't I utilize the full 3.75 of a node with a single pod? Do I need nodes with bigger memory capacity?
Even though the node has 3.75 GB of total memory, it is very likely that the allocatable capacity is not the full 3.75 GB.
Kubernetes reserves some capacity for system services, to avoid containers consuming too many resources on the node and affecting the operation of those system services.
From the docs:
Kubernetes nodes can be scheduled to Capacity. Pods can consume all the available capacity on a node by default. This is an issue because nodes typically run quite a few system daemons that power the OS and Kubernetes itself. Unless resources are set aside for these system daemons, pods and system daemons compete for resources and lead to resource starvation issues on the node.
Because you are using GKE, which doesn't use the defaults, running the following command will show how much allocatable resource you have on the node:
kubectl describe node [NODE_NAME] | grep Allocatable -B 4 -A 3
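If you want the same numbers for every node at once, a compact alternative (a sketch; custom-columns just pulls the two fields from the node objects) is:
kubectl get nodes -o custom-columns=NAME:.metadata.name,CAPACITY:.status.capacity.memory,ALLOCATABLE:.status.allocatable.memory
The gap between the two columns is the memory GKE holds back, as broken down below.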
From the GKE docs:
Allocatable resources are calculated in the following way:
Allocatable = Capacity - Reserved - Eviction Threshold
For memory resources, GKE reserves the following:
- 25% of the first 4GB of memory
- 20% of the next 4GB of memory (up to 8GB)
- 10% of the next 8GB of memory (up to 16GB)
- 6% of the next 112GB of memory (up to 128GB)
- 2% of any memory above 128GB
GKE reserves an additional 100 MiB memory on each node for kubelet eviction.
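As a rough back-of-the-envelope check for a 3.75 GB node (my arithmetic, not a figure from the GKE docs):
reserved    ≈ 25% of 3.75 GB ≈ 0.94 GB
eviction    = 100 MiB        ≈ 0.10 GB
allocatable ≈ 3.75 GB - 0.94 GB - 0.10 GB ≈ 2.7 GB
System pods (kube-dns, logging agents, any add-ons) request a further slice of that, which is why a 3Gi request never fits on these nodes while a 2Gi request does.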
As the error message suggests, scaling the cluster will not solve the problem, because each node's capacity is limited to the same amount of memory and the Pod needs more than that.
Each node will reserve some memory for Kubernetes system workloads (such as kube-dns, and also for any add-ons you select). That means you will not be able to access all the node's 3.75 Gi memory.
So to request that a pod has a 3Gi memory reserved for it, you will indeed need nodes with bigger memory capacity.
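If you stay on GKE, one way to get such nodes (a sketch; the pool name and machine type are placeholders to adapt) is to add a node pool with larger machines and let the scheduler place the pod there:
gcloud container node-pools create big-mem-pool \
    --cluster <cluster-name> \
    --machine-type n1-standard-2 \
    --num-nodes 1
An n1-standard-2 node has 7.5 GB of memory, so even after GKE's reservations a 3Gi request fits.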

Pod's status is always ContainerCreating. Events show 'Failed create pod sandbox'

I am trying to create a deployment on a K8s cluster with one master and two worker nodes. The cluster is running on 3 AWS EC2 instances. I have been using this environment for quite some time to play with Kubernetes. Three days ago, all the pods' statuses started to change from Running to ContainerCreating. Only the pods scheduled on the master show as Running; the pods on the worker nodes show as ContainerCreating. When I run kubectl describe pod <podname>, it shows the following in the events:
Events:
  Type     Reason                  Age  From                      Message
  ----     ------                  ---- ----                      -------
  Normal   Scheduled               34s  default-scheduler         Successfully assigned nginx-8586cf59-5h2dp to ip-172-31-20-57
  Normal   SuccessfulMountVolume   34s  kubelet, ip-172-31-20-57  MountVolume.SetUp succeeded for volume "default-token-wz7rs"
  Warning  FailedCreatePodSandBox  4s   kubelet, ip-172-31-20-57  Failed create pod sandbox.
  Normal   SandboxChanged          3s   kubelet, ip-172-31-20-57  Pod sandbox changed, it will be killed and re-created.
This error has been bugging me. I searched around online for related errors but couldn't find anything specific. I did kubeadm reset on the cluster, including the master and worker nodes, and brought the cluster up again. The node status shows Ready, but I run into the same problem again whenever I try to create a deployment, for example with the command below:
kubectl run nginx --image=nginx --replicas=2
This can occur if you specify a limit or request on memory and use the wrong unit.
The following triggered the message:
resources:
  limits:
    cpu: "300m"
    memory: "256m"
  requests:
    cpu: "50m"
    memory: "64m"
The correct line would be:
resources:
  limits:
    cpu: "300m"
    memory: "256Mi"
  requests:
    cpu: "50m"
    memory: "64Mi"
It might help someone else, but I spent a weekend on this until I noticed I had requested 1000m of memory instead of 1000Mi...
I run k8s on a few DO droplets and was stuck on this very issue. No other info was given - just the FailedCreatePodSandBox complaining about a file I had never seen before.
Spent a lotta time trying to figure it out - the only thing that fixed the issue for me was restarting my master and each node in their entirety. That got things going instantly.
sudo shutdown -r now

When a pod can't be scheduled, what does the 3 in "Insufficient cpu (3)" refer to?

When I create a Pod that cannot be scheduled because there are no nodes with sufficient CPU to meet the Pod's CPU request, the events output from kubectl describe pod/... contains a message like No nodes are available that match all of the following predicates:: Insufficient cpu (3).
What does the (3) in Insufficient cpu (3) mean?
For example, if I try to create a pod that requests 24 CPU when all of my nodes only have 4 CPUs:
$ kubectl describe pod/large-cpu-request
Name:         large-cpu-request
Namespace:    default
Node:         /
Labels:       <none>
Annotations:  <none>
Status:       Pending
IP:
Controllers:  <none>
Containers:
  cpuhog:
    ...
    Requests:
      cpu: 24
    ...
Events:
  FirstSeen  LastSeen  Count  From               SubObjectPath  Type     Reason            Message
  ---------  --------  -----  ----               -------------  -------  ------            -------
  23m        30s       84     default-scheduler                 Warning  FailedScheduling  No nodes are available that match all of the following predicates:: Insufficient cpu (3).
At other times I have seen event messages like No nodes are available that match all of the following predicates:: Insufficient cpu (2), PodToleratesNodeTaints (1) when a pod's resource requests were too high, so the 3 does not seem like a constant number - nor does it seem related to my 24 CPU request either.
It means that your Pod doesn't fit on 3 nodes because of Insufficient CPU and 1 node because of taints (likely the master).
A pod can't be scheduled when it requests more CPU than is available in your cluster. For example, if you have 8 Kubernetes CPUs in total (see this page to calculate how many Kubernetes CPUs you have) and your existing pods have already consumed that much CPU, then you can't schedule more pods unless some of the existing pods are killed by the time you ask to schedule a new pod. Here is a simple equation that can be followed when using the Horizontal Pod Autoscaler (HPA):
RESOURCE REQUEST CPU * HPA MAX PODS <= Total Kubernetes CPU
You can always tune these numbers. In my case, I adjusted the RESOURCE REQUEST CPU in my manifest file. It can be 200m, or 1000m (= 1 Kubernetes CPU).
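To see how much CPU each node actually offers to the scheduler, one option (a sketch using jsonpath; format it however you prefer) is:
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.allocatable.cpu}{"\n"}{end}'
Summing that column gives the total Kubernetes CPU available for pod requests, which is the right-hand side of the equation above.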