How many cores do Kubernetes pods use when their CPU usage is limited by policy?

Kubernetes lets you limit a pod's resource usage:
requests:
  cpu: 100m
  memory: 128Mi
limits:
  cpu: 200m # which is 20% of 1 core
  memory: 256Mi
Let's say my Kubernetes node has 2 cores, and I run this pod with a CPU limit of 200m on that node. In this case, will my pod use 200m of one of the underlying node's cores, or 100m + 100m across both cores?
I need this calculation for my gunicorn worker count formula, nginx worker count, etc.
The gunicorn documentation says:
Generally we recommend (2 x $num_cores) + 1 as the number of workers to start off with.
So should I use 5 workers (my node has 2 cores)? Or does it not even matter, since my pod is only allocated 200m of CPU and I should treat it as having 1 core?
TL;DR: How many cores do pods use when their CPU usage is limited by Kubernetes? If I run top inside the pod, I see 2 cores available, but I'm not sure whether my application is using 10% + 10% of the 2 cores or 20% of 1 core.

The pod will be limited to 200m, i.e. 20% of one core. Also, a limit is the maximum amount of CPU a pod may use and no more, so the pod's actual CPU utilization will not always reach the limit.
The total CPU capacity of a cluster is the sum of the cores of all nodes in the cluster.
If you have a 2-node cluster where the first node has 2 cores and the second node has 1 core, the cluster's CPU capacity is 3 cores (2 cores + 1 core). If you have a pod that requests 1.5 cores, it will not be scheduled on the second node, as that node only has a capacity of 1 core; it will instead be scheduled on the first node, since it has 2 cores.
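For illustration, a minimal pod spec that requests 1.5 cores might look like the sketch below (the pod name and image are made up); with the 2-core/1-core cluster above, the scheduler could only place it on the first node:
apiVersion: v1
kind: Pod
metadata:
  name: cpu-hungry                     # hypothetical name, for illustration only
spec:
  containers:
  - name: app
    image: registry.k8s.io/pause:3.9   # placeholder workload
    resources:
      requests:
        cpu: "1.5"                     # 1500m: more than the 1-core node can offer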

CPU is measured in units called millicores. Each node in the cluster introspects the operating system to determine the number of CPU cores on the node and then multiplies that value by 1000 to express its total capacity. For example, if a node has 2 cores, the node's CPU capacity is represented as 2000m. If you wanted to use 1/10 of a single core, you would represent that as 100m.
So if you give your pod 200m (200 millicores), it stays within one core and takes up 20 percent of that core. Only if you give a pod more than 1000m, for example 1.5 CPUs (1500m), will it take up more than one core.
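To connect this back to the gunicorn worker formula from the question: the relevant number is the pod's CPU limit, not the node's core count that top reports. One way to read that limit from inside the container is the Downward API; here is a rough sketch (the env var name and image are assumptions, not a convention):
containers:
- name: web
  image: my-gunicorn-app:latest        # hypothetical image
  resources:
    requests:
      cpu: 100m
    limits:
      cpu: 200m
  env:
  - name: CPU_LIMIT_MILLICORES         # made-up name; read by the app's startup script
    valueFrom:
      resourceFieldRef:
        resource: limits.cpu
        divisor: "1m"                  # expose the limit in millicores, e.g. "200"
With a 200m limit the variable comes through as "200"; plugging 0.2 "cores" into (2 x cores) + 1 suggests starting with 1 or 2 workers rather than 5.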

Related

Kubernetes limit nvidia GPU

I have a node with 2 GPUs on it, and I deploy two containers, each with a limit of 1 GPU:
resources:
  limits:
    nvidia.com/gpu: 1
It works well with the official NVIDIA container. However, with my own container I can see 2 GPUs inside each container.
So the resource limit only seems to work during scheduling, in that I can only deploy two pods with a GPU limit of 1.
I was expecting that inside the container it could only use 1 GPU, matching the resource limit.
Is it something to do with the container? I thought it should be controlled at the Kubernetes level?
Any suggestions?
Regards
David

kubernetes pod resource cpu on nodes with different cpu core counts

This is a bit crazy, but we run a kubernetes cluster with 4 nodes (w/ Docker as container engine):
node01/node02: 8 cores
node03/node04: 4 cores
I am confused about exactly what a pod's CPU resource request provides as real CPU to a containerized application.
In my understanding, pods from a deployment that request 1 CPU will all have the same CPU shares, so does this mean a container will run faster on node01/node02 than on node03/node04?
Not necessarily:
If the application is single-threaded, it will run at the same speed no matter how many cores the system it's on has.
If the application is disk- or database-bound, adding more cores won't make it go faster.
If other pods (or non-Kubernetes processes) are running on either of the nodes, those share the CPU resource, and a busy 8-core system could in practice be slower than an idle 4-core system.
If the pod spec has resource requests, it could be prevented from running on the smaller system:
resources:
  requests:
    cpu: 6 # can't run on the 4-core system
If the pod spec has resource limits, that can prevent it from using all of the cores, even if it's scheduled on the larger system:
resources:
  limits:
    cpu: 3 # even if it's scheduled on the 8-core system
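As a sketch only (the numbers are illustrative, not a recommendation), the two constraints can be combined in one container spec; note that Kubernetes requires the limit to be at least the request:
resources:
  requests:
    cpu: "5"     # more than the 4-core nodes can offer, so only node01/node02 qualify
  limits:
    cpu: "6"     # even on an 8-core node, the container is throttled at 6 cores' worth of CPU time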

Do Kubernetes pods give memory back after acquiring more than the requested amount?

I am trying to understand the behavior of K8s pod memory allocation, and so far I've had no luck with the materials I've read on the internet.
My question is: if I have a pod template defined with the below values for memory
Limits:
  cpu: 2
  memory: 8Gi
Requests:
  cpu: 500m
  memory: 2Gi
and say my application suddenly requires more memory, and the pod allocates 4Gi (up from its initial 2Gi) to get the task done. Would the pod give the extra 2Gi it acquired back to the underlying OS and become a 2Gi pod again after the task is complete, or would it keep functioning as a pod with 4Gi of memory afterward?
My application is a Java application running on Apache Tomcat with the max heap set to 6Gi.
The Kubernetes resource requests come into effect at basically three times:
When new pods are being initially scheduled, the resource requests (only) are used to find a node with enough space. The sum of requests must be less than the physical size of the node. Limits and actual utilization aren't considered.
If the process allocates memory, and this would bring its total utilization above the pod's limit, the allocation will fail.
If the node runs out of memory, Kubernetes will look through the pods on that node and evict the pods whose actual usage most exceeds their requests.
Say you have a node with 16 GiB of memory. You run this specific pod in a Deployment with replicas: 8; they would all fit on the node, and for the sake of argument let's say Kubernetes puts them all there. Regardless of what the pods are doing, a 9th pod wouldn't fit on the node because the memory requests would exceed the physical memory.
If your pod goes ahead and allocates a total of 4 GiB of memory, that's fine so long as the physical system has the memory for it. If the node runs out of memory, though, Kubernetes will see this pod has used 2 GiB more than its request; that could result in the pod getting evicted (destroyed and recreated, probably on a different node).
If the process did return the memory back to the OS, that would show up in the "actual utilization" part of the metric; since its usage would now be less than its requests, it would be in less danger of getting evicted if the node did run out of memory. (Many garbage-collected systems will hold on to OS memory as long as they can and reuse it, though; see e.g. Does GC release back memory to OS?.)
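A rough sketch of the numbers in that scenario (the Deployment name is made up; the image just stands in for the Tomcat app in the question):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: memory-hungry-app              # hypothetical name
spec:
  replicas: 8                          # 8 x 2Gi requested = 16Gi, exactly the node's capacity
  selector:
    matchLabels:
      app: memory-hungry-app
  template:
    metadata:
      labels:
        app: memory-hungry-app
    spec:
      containers:
      - name: app
        image: tomcat:9                # placeholder for the Java/Tomcat app in the question
        resources:
          requests:
            cpu: 500m
            memory: 2Gi                # only this figure is summed at scheduling time
          limits:
            cpu: 2
            memory: 8Gi                # hard ceiling; allocations beyond this fail
A 9th replica would not fit on that node, because 9 x 2Gi of requests exceeds the 16 GiB of physical memory, regardless of how much memory the existing pods actually use.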

Kubernetes: cpu request and total resources doubts

To better explain my doubts, I will give an example
Example:
We have one worker node with 3 allocatable cpus and kubernetes has scheduled three pods on it:
pod_1 with 500m cpu request
pod_2 with 700m cpu request
pod_3 with 300m cpu request
On this worker node I can't schedule any other pods.
But if I check the real usage:
pod_1 cpu usage is 300m
pod_2: cpu usage is 600m
My question is:
Can pod_3 have a real usage of 500m, or will the requests of the other pods limit its CPU usage?
Thanks
Pietro
It doesn't matter what the real usage is - the "request" is the amount of resources guaranteed to be available to the pod. Your workload might be using only a fraction of the requested resources - but what really counts is the "request" itself.
Example - Let's say you have a node with 1CPU core.
Pod A - 100m Request
Pod B - 200m Request
Pod C - 700m Request
Now no further pod can be allocated on the node, because the whole 1 CPU of resources is already requested by the 3 pods. It doesn't really matter which fraction of the requested resources each pod is actually using at any given time.
Another point worth noting is the "Limit". A workload can exceed its requested resource usage, but it cannot exceed the "Limit". This is a very important mechanism to understand.
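A minimal sketch of that 1-core-node example (the pod names and the pause image are placeholders); once these three pods are admitted, their requests already add up to the full core, so any further pod that requests CPU stays Pending:
apiVersion: v1
kind: Pod
metadata:
  name: pod-a                          # hypothetical names throughout
spec:
  containers:
  - name: app
    image: registry.k8s.io/pause:3.9   # placeholder workload
    resources:
      requests:
        cpu: 100m
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-b
spec:
  containers:
  - name: app
    image: registry.k8s.io/pause:3.9
    resources:
      requests:
        cpu: 200m
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-c
spec:
  containers:
  - name: app
    image: registry.k8s.io/pause:3.9
    resources:
      requests:
        cpu: 700m                      # 100m + 200m + 700m = 1000m: the whole core is requested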
Kubernetes will schedule the pods based on the requests that you configure for the container(s) of the pod (via the spec of the respective Deployment or other kind).
Here's an example:
For simplicity, let's assume only one container for the pod.
containers:
- name: "foobar"
  resources:
    requests:
      cpu: "300m"
      memory: "256Mi"
    limits:
      cpu: "500m"
      memory: "512Mi"
If you ask for 300 millicpus as your request, Kubernetes will place the pod on a node that has at least 300 millicpus allocatable for it. If a node has less allocatable CPU available, the pod will not be placed on that node. You can set the value for the memory request in the same way.
The limit caps the resources the container may use. In the example above, if the container ends up using more than 512MiB of memory it will be terminated (OOM-killed); and if the pod ever has to be rescheduled, it again needs a node with at least 300 millicpus allocatable (and if no such node exists, the pod remains in Pending state with FailedScheduling as the reason, until a node with sufficient capacity is available).
Do note that the resource request only comes into play at pod scheduling time, not at runtime (meaning the actual consumption of resources will not trigger a rescheduling of the pod, even if the container uses more than it requested, as long as it stays below the limit, if one is specified).
https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#how-pods-with-resource-requests-are-scheduled
So, in summary,
The total of all your requests is what counts as allocated, regardless of the actual runtime utilization of your pod (as long as the limit, if set, is not crossed)
You can request 300 millicpus but use only 100 millicpus, or 400 millicpus; Kubernetes will still show the "allocated" value as 300
If your container crosses its memory limit, it will be OOM-killed by Kubernetes

Kubernetes Node Memory Limits

I'm a beginner on Kubernetes. When I described my node, I saw this at the very bottom:
kubectl describe node ip-x-x-x-x.ap-southeast-2.compute.internal
...
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ------------  ----------  ---------------  -------------
  225m (11%)    200m (10%)  125Mi (1%)       300Mi (3%)
Events:         <none>
How do I un-limit the memory + cpu?
Every node has limits according to its resources: number of processors or cores, amount of memory. Kubernetes uses this information for distributing Pods across Nodes.
Referring to the official documentation:
- Meaning of CPU
One CPU, in Kubernetes, is equivalent to:
- 1 AWS vCPU
- 1 GCP Core
- 1 Azure vCore
- 1 Hyperthread on a bare-metal Intel processor with Hyperthreading
1 CPU is 1000m (1 thousand millicores)
- Meaning of memory. It is simply the amount of memory on the server.
In the output from the kubectl describe node <node-name> command, you see statistics of resource usage. The total resources of your server can actually be worked out from the example in the question: 225m is about 11% of CPU capacity, so the node has roughly 2 CPUs/cores, and 125Mi is roughly 1% of memory, putting the total at around 10000 MB of memory.
Please note that some resources are already allocated by system Pods like kube-dns, kube-proxy or kubernetes-dashboard.