Kubernetes node shows limit, but no limit set

I have not configured any LimitRange or pod limits,
but my nodes show Requests and Limits. Is that a limit, or the maximum value seen so far?
I have around 20 active nodes, all of them the same hardware size, but each node shows different limits with kubectl describe node nodeXX.
Does that mean I cannot use more than the limit?

If you check the result of kubectl describe node nodeXX again more carefully, you can see that each pod has the columns CPU Requests, CPU Limits, Memory Requests and Memory Limits. The total Requests and Limits shown in your screenshot should be the sum of your pods' requests and limits.
If you haven't configured limits for your pods, they will show 0%. However, I can see in your screenshot that you have a node-exporter pod on your node. You probably also have pods in the kube-system namespace that you haven't scheduled yourself but that are essential for Kubernetes to work.
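For illustration, the relevant sections of kubectl describe node nodeXX look roughly like this (the pod name and all numbers below are made up):

Non-terminated Pods:          (3 in total)
  Namespace    Name                  CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ---------    ----                  ------------  ----------  ---------------  -------------
  monitoring   node-exporter-abc12   100m (5%)     200m (10%)  50Mi (1%)        100Mi (2%)
  ...
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource  Requests    Limits
  --------  --------    ------
  cpu       550m (27%)  1100m (55%)
  memory    476Mi (6%)  1126Mi (14%)

The "Allocated resources" totals are simply the per-pod columns summed up.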
About your question:
does that mean i cannot use more than the limit
This article does a great job of explaining requests and limits:
Requests are what the container is guaranteed to get. If a container
requests a resource, Kubernetes will only schedule it on a node that
can give it that resource.
Limits, on the other hand, make sure a container never goes above a
certain value. The container is only allowed to go up to the limit,
and then it is restricted.
For example: if your pod requests 1000Mi of memory and your node only has 500Mi of allocatable memory left, the pod will never be scheduled there. If your pod requests 300Mi and has a limit of 1000Mi, it will be scheduled, and Kubernetes will not let it use more than 1000Mi of memory.
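As a sketch, the 300Mi/1000Mi case from that example would be declared like this (the pod name and image are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: example-pod        # illustrative name
spec:
  containers:
  - name: app
    image: nginx           # illustrative image
    resources:
      requests:
        memory: 300Mi      # scheduler needs a node with >= 300Mi unallocated
      limits:
        memory: 1000Mi     # the container is OOM-killed if it goes above this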
It may be OK for the sum of the limits to surpass 100% of a node's capacity, especially in development environments, where we trade performance for capacity.

Related

Kubernetes: cpu request and total resources doubts

To better explain my doubts, I will give an example.
Example:
We have one worker node with 3 allocatable cpus and kubernetes has scheduled three pods on it:
pod_1 with 500m cpu request
pod_2 with 700m cpu request
pod_3 with 300m cpu request
On this worker node I can't schedule any other pods.
But if I check the real usage:
pod_1 cpu usage is 300m
pod_2: cpu usage is 600m
My question is:
Can pod_3 have a real usage of 500m, or will the other pods' requests limit its CPU usage?
Thanks
Pietro
It doesn't matter what the real usage is - the "request" means how much of the resources is guaranteed to be available for the pod. Your workload might be using only a fraction of the requested resources - but what really counts is the "request" itself.
Example - Let's say you have a node with 1CPU core.
Pod A - 100m Request
Pod B - 200m Request
Pod C - 700m Request
Now, no further pod can be scheduled on the node - because the whole 1 CPU is already requested by the 3 pods. It doesn't really matter which fraction of the requested resources each pod is actually using at any given time.
Another point worth noting is the "Limit". A workload can use more than it requested - but it can never go above its "Limit". This is a very important mechanism to understand. So in your example, pod_3 can indeed use 500m as long as the node has idle CPU: requests only drive scheduling (and CPU shares under contention), while limits are what cap actual usage.
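As a sketch (the name and image are illustrative), pod_3 could be declared with only a CPU request and no limit:

apiVersion: v1
kind: Pod
metadata:
  name: pod-3              # illustrative name
spec:
  containers:
  - name: app
    image: nginx           # illustrative image
    resources:
      requests:
        cpu: 300m          # what the scheduler counts against the node
      # no limit: actual usage may rise above 300m (e.g. to 500m)
      # whenever the node has idle CPU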
Kubernetes will schedule the pods based on the requests that you configure for the container(s) of the pod (via the spec of the respective Deployment or other kind).
Here's an example:
For simplicity, let's assume only one container for the pod.
containers:
- name: "foobar"
  resources:
    requests:
      cpu: "300m"
      memory: "256Mi"
    limits:
      cpu: "500m"
      memory: "512Mi"
If you ask for 300 millicpus as your request, Kubernetes will place the pod on a node that has at least 300 millicpus allocatable to that pod. If a node has less allocatable CPU available, the pod will not be placed on that node. Similarly, you can also set the value for memory request as well.
The limit caps the resource use of the container. In the example above, if the container ends up using more than 512MiB of memory, it will be OOM-killed and restarted according to the pod's restartPolicy, on the same node; exceeding a limit does not by itself move the pod elsewhere. (Eviction is a separate mechanism that kicks in when the node itself runs low on resources; an evicted pod that is managed by a controller gets recreated and scheduled onto a node with at least 300 millicpus allocatable, and if no such node exists, it stays Pending with FailedScheduling as the reason until capacity is available.)
Do note that the resource request matters only at pod scheduling time, not at runtime (meaning the actual consumption of resources will not trigger a re-scheduling of the pod, even if the container uses more resources than it requested, as long as it stays below the limit, if one is specified).
https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#how-pods-with-resource-requests-are-scheduled
So, in summary,
The total of all your requests is what counts against a node's allocatable capacity, regardless of the actual runtime utilization of your pod (as long as the limit is not crossed)
You can request for 300 millicpus, but only use 100 millicpus, or 400 millicpus; Kubernetes will still show the "allocated" value as 300
If your container crosses its memory limit, it will be OOM-killed; CPU usage above the limit is throttled instead
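If a container does get OOM-killed, kubectl describe pod typically shows something like this (abbreviated, illustrative output):

$ kubectl describe pod foobar
...
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
...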

AutoScaling work loads without running out of memory

I have a number of pods running and horizontal pod auto scaler assigned to target them, the cluster I am using can also add nodes and remove nodes automatically based on current load.
BUT we recently had the cluster go offline with OOM errors and this caused a disruption in service.
Is there a way to monitor the load on each node so that, if usage reaches say 80% of the memory on a node, Kubernetes does not schedule more pods on that node but waits for another node to come online?
Pending pods are what you should monitor, and resource requests are what you should define, since they affect scheduling.
The Scheduler uses Resource requests Information when scheduling the pod
to a node. Each node has a certain amount of CPU and memory it can allocate to
pods. When scheduling a pod, the Scheduler will only consider nodes with enough
unallocated resources to meet the pod’s resource requirements. If the amount of
unallocated CPU or memory is less than what the pod requests, Kubernetes will not
schedule the pod to that node, because the node can’t provide the minimum amount
required by the pod. The new Pods will remain in Pending state until new nodes come into the cluster.
Example:
apiVersion: v1
kind: Pod
metadata:
  name: requests-pod
spec:
  containers:
  - image: busybox
    command: ["dd", "if=/dev/zero", "of=/dev/null"]
    name: main
    resources:
      requests:
        cpu: 200m
        memory: 10Mi
When you don’t specify a request for CPU, you’re saying you don’t care how much
CPU time the process running in your container is allotted. In the worst case, it may
not get any CPU time at all (this happens when a heavy demand by other processes
exists on the CPU). Although this may be fine for low-priority batch jobs, which aren’t
time-critical, it obviously isn’t appropriate for containers handling user requests.
Short answer: add resource requests but don't add limits. Otherwise, you will face the throttling issue.
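A requests-only resources stanza, as suggested above, might look like this (the values are illustrative):

resources:
  requests:
    cpu: 200m        # guaranteed share, used for scheduling
    memory: 256Mi
  # no limits block: no artificial CPU throttling, but the container can
  # still be OOM-killed if the node itself runs out of memory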

K8s memory request handling for 2 and more pods

I am trying to understand memory requests in k8s. I have observed
that when I set the memory request for a pod, e.g. nginx, to 1Gi, it actually consumes only 1Mi (I checked this with kubectl top pods). My question: I have 2Gi of RAM on a node and set the
memory request for pod1 and pod2 to 1.5Gi each, but they actually consume only 1Mi of memory each. I start pod1 and it should start, because the node has 2Gi of memory and pod1 requests only 1.5Gi. But what happens if I try to start pod2 after that? Would it start? I am not sure, because pod1 consumes only 1Mi of memory but has a request for 1.5Gi. Does the memory request of pod1 influence the execution of pod2? How will k8s rule on this situation?
The memory request is the amount of memory that Kubernetes reserves for the pod. If a pod requests some amount of memory, there is a strong guarantee that it will get it. This is why you can't create pod1 with a 1.5Gi request and pod2 with a 1.5Gi request on a 2Gi node: if Kubernetes allowed it and those pods started using that memory, Kubernetes would not be able to satisfy the requirements, and that is unacceptable.
This is why the sum of all pod requests running on a specific node cannot exceed that node's allocatable memory.
"But what happens If I try to start pod2 after that? [...] How k8s
will rule this situation?"
If you have only one node with 2Gi of memory, then pod2 won't start. You would see that this pod is in Pending state, waiting for resources. If you have spare resources on a different node, then Kubernetes would schedule pod2 to that node.
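If that happens, kubectl describe pod pod2 would show an event roughly like this (abbreviated, illustrative output):

$ kubectl describe pod pod2
...
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  10s   default-scheduler  0/1 nodes are available: 1 Insufficient memory.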
Let me know if something is not clear and needs more explanation.
A request is the resource amount reserved for a container; a limit is the maximum the container is allowed to use. If you try to start two pods with 1.5Gi requests on a machine with 2Gi, the second one will not start because the resources it needs to reserve aren't available. You should set requests lower - around the average expected consumption of the pod - along with some reasonable limit (the maximum allowed memory). It's worth getting familiar with these concepts.
In Kubernetes you control Pod/Container memory using two parameters:
spec.containers[].resources.requests.memory: the Kubernetes scheduler will not schedule your Pod if there is not enough memory; this memory is also reserved for your container
spec.containers[].resources.limits.memory: the container cannot exceed this memory
If you want to be precise about the memory for your container, then you'd better set the same value for both parameters.
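For example, setting both to the same value (values illustrative) means the scheduler reserves exactly as much memory as the kubelet will enforce:

resources:
  requests:
    memory: 512Mi
  limits:
    memory: 512Mi    # equal to the request, so usage can never exceed what was reserved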
This is a very good article explaining by example. And here's the official doc.

What is the default memory allocated for a pod

I am setting up a pod say test-pod on my google kubernetes engine. When I deploy the pod and see in workloads using google console, I am able to see 100m CPU getting allocated to my pod by default, but I am not able to see how much memory my pod has consumed. The memory requested section always shows 0 there. I know we can restrict memory limits and initial allocation in the deployment YAML. But I want to know how much default memory a pod gets allocated when no values are specified through YAML and what is the maximum limit it can avail?
If you have no resource requests on your pod, it can be scheduled anywhere at all, even the busiest node in your cluster, as though you requested 0 memory and 0 CPU. If you have no resource limits, it can consume all available memory and CPU on its node.
(If it’s not obvious, realistic resource requests and limits are a best practice!)
You can set limits on individual pods.
If not, you can set limits on the overall namespace.
By default, there are no limits.
But there are some tricks:
Here is a very nice view of this:
https://blog.balthazar-rouberol.com/allocating-unbounded-resources-to-a-kubernetes-pod
When deploying a pod in a Kubernetes cluster, you normally have 2
choices when it comes to resources allotment:
defining CPU/memory resource requests and limits at the pod level
defining default CPU/memory requests and limits at the namespace level
using a LimitRange
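A LimitRange that sets namespace-wide defaults might look roughly like this (the name, namespace and values are illustrative):

apiVersion: v1
kind: LimitRange
metadata:
  name: default-mem-per-container   # illustrative name
  namespace: my-namespace           # illustrative namespace
spec:
  limits:
  - type: Container
    defaultRequest:
      memory: 256Mi    # request applied to containers that don't set one
    default:
      memory: 512Mi    # limit applied to containers that don't set one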
From the Docker documentation (assuming you are using the Docker runtime):
By default, a container has no resource constraints and can use as
much of a given resource as the host’s kernel scheduler will allow
https://docs.docker.com/v17.09/engine/admin/resource_constraints/
Kubernetes pods' CPU and memory usage can be seen using the metrics-server service and the kubectl top pod command:
$ kubectl top --help
...
Available Commands:
...
pod Display Resource (CPU/Memory/Storage) usage of pods
...
Example in Minikube below:
$ minikube addons enable metrics-server
# wait 5 minutes for metrics-server to be up and running
$ kubectl top pod -n=kube-system
NAME                               CPU(cores)   MEMORY(bytes)
coredns-fb8b8dccf-6t5k8            6m           10Mi
coredns-fb8b8dccf-sjkvc            5m           10Mi
etcd-minikube                      37m          60Mi
kube-addon-manager-minikube        17m          20Mi
kube-apiserver-minikube            55m          201Mi
kube-controller-manager-minikube   30m          46Mi
kube-proxy-bsddk                   1m           11Mi
kube-scheduler-minikube            2m           12Mi
metrics-server-77fddcc57b-x2jx6    1m           12Mi
storage-provisioner                0m           15Mi
tiller-deploy-66b7dd976-d8hbk      0m           13Mi
This link has more information.
Kubernetes doesn’t provide default resource limits out-of-the-box. This means that unless you explicitly define limits, your containers can consume unlimited CPU and memory.
More details here: https://medium.com/@reuvenharrison/kubernetes-resource-limits-defaults-and-limitranges-f1eed8655474
The real problem in many of these cases is not that the nodes are too small, but that we have not accurately specified resource limits for the pods.
Resource limits are set on a per-container basis using the resources property of a containerSpec, which is a v1 api object of type ResourceRequirements. Each object specifies both “limits” and “requests” for the types of resources.
If you do not specify a memory limit for a container, one of the following situations applies:
The container has no upper bound on the amount of memory it uses. The container could use all of the memory available on the Node where it is running which in turn could invoke the OOM Killer. Further, in case of an OOM Kill, a container with no resource limits will have a greater chance of being killed.
The container is running in a namespace that has a default memory limit, and the container is automatically assigned the default limit. Cluster administrators can use a LimitRange to specify a default value for the memory limit.
When you set a limit, but not a request, kubernetes defaults the request to the limit. If you think about it from the scheduler’s perspective it makes sense.
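For example, a container spec like the following (values illustrative) ends up with its memory request defaulted to 512Mi as well:

resources:
  limits:
    memory: 512Mi
  # no requests block: Kubernetes treats the request as 512Mi too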
It is important to set resource requests correctly: setting them too low means nodes can become overloaded; setting them too high means nodes will sit idle.
Useful article: memory-limits.

Ensuring availability in Kubernetes with high-variance memory / CPU load?

Problem: the code we're running in Kubernetes pods has very high variance across its runtime; specifically, it has occasional CPU & memory spikes when certain conditions are triggered. These triggers involve user queries with hard realtime requirements (the system has to respond within <5 seconds).
Under conditions where the node serving the spiking pod doesn't have enough CPU/RAM, Kubernetes responds to these excessive requests by killing the pod altogether, which results in no output at all.
How can we ensure that these spikes are taken into account when pods are scheduled, and more critically, that no pod shutdown happens for these reasons?
Thanks!
High availability of pods with load can be achieved in two ways:
Configuring More CPU/Memory
As the application requires more CPU/memory during peak times, configure the pod in such a way that its allocated resources will take care of the extra load. Configure the pod something like this:
resources:
  requests:
    memory: "64Mi"
    cpu: "250m"
  limits:
    memory: "128Mi"
    cpu: "500m"
You can increase the limits based on the usage. But this approach can cause two issues:
1) Underutilized resources
Since a large amount of resources is reserved, it may go to waste unless there is a spike in traffic.
2) Deployment failure
Pod deployment may fail because no Kubernetes node has enough resources to satisfy the request.
For more info : https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/
Autoscaling
The ideal way of doing it is to autoscale the pods based on the traffic.
kubectl autoscale deployment <DEPLOY-APP-NAME> --cpu-percent=50 --min=1 --max=10
Configure --cpu-percent based on your requirement (80% is used by default if not set). --min and --max are the minimum and maximum number of pods, which can be configured accordingly.
So each time the pods hit 50% CPU usage, a new pod will be launched, continuing until a maximum of 10 pods is reached, and the same applies in reverse when the load drops.
For more info: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/
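The equivalent declarative HorizontalPodAutoscaler manifest would look roughly like this (the deployment name is a placeholder; the autoscaling/v2 API is assumed):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: deploy-app-name              # placeholder
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: deploy-app-name            # placeholder
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50       # scale out when average CPU exceeds 50%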
A limit is a limit; it's expected to do that, period.
What you can do is either run without a limit - the pod will then behave like any other process on the node, and OOM happens when the node, not the pod, runs out of memory. But this sounds like asking for trouble. And mind that even if you set a high limit, it's the request that actually guarantees some resources to the pod, so even with a limit of 2Gi on a pod it can be OOM-killed at 512Mi if the request was 128Mi.
You should design your app in a way that does not generate such spikes or that tolerates OOMs on pods. It's hard to tell what your software does exactly, but some things that come to mind that could help crack this are request throttling, the Horizontal Pod Autoscaler, or running work asynchronously with some kind of message queue.