How to determine Kubernetes pod ephemeral storage requests and limits? - kubernetes

My service running in a pod outputs too many logs and causes low ephemeral storage. As a result, the pod is evicted and other services can't be deployed to k8s.
So how can I determine the pod's ephemeral-storage requests and limits to avoid this situation? I can't find any best practices about ephemeral storage.

Note that by default, if you have not set any limit on ephemeral-storage, the pod has access to the entire disk of the node it is running on. So if you are certain the pod is being evicted because of this, then the pod really did consume it all. You can check this in the kubelet logs, as the kubelet is the component in charge of detecting this behavior and evicting the pod.
From here you have two options: either set an ephemeral-storage limit and make the eviction a controlled one, or mount an external volume into the container and ship the logs off the node.
You can also monitor the disk usage, as shubham_asati suggested, but if the pod is eating it all, it is eating it all. You would just be watching it fill up.
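The first option can be sketched as a minimal pod spec (names and values are illustrative, not from the question) that caps the container's writable layer, logs and emptyDir usage; the kubelet evicts the pod once it exceeds the limit:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: log-heavy-app          # hypothetical name
spec:
  containers:
  - name: app
    image: my-registry/my-app:latest   # placeholder image
    resources:
      requests:
        ephemeral-storage: "1Gi"   # used by the scheduler for placement
      limits:
        ephemeral-storage: "2Gi"   # pod is evicted if local usage exceeds this
```

With the limit set, the eviction happens per pod instead of the kubelet evicting pods node-wide under disk pressure.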

Ephemeral storage for a pod can be defined the same way as a CPU request/limit.
See https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#local-ephemeral-storage — note that this feature is still in beta as of Kubernetes 1.16.
To check namespace-level resource consumption, see https://kubernetes.io/docs/concepts/policy/resource-quotas/#storage-resource-quota.
You can set ephemeral-storage requests/limits for each pod.
Regarding your issue:
check namespace quotas for ephemeral storage using kubectl describe namespace;
try du -sh / inside a container;
then compare the outputs of both.
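A namespace quota for ephemeral storage, as referenced in the docs above, might look like the following sketch (quota name, namespace and values are illustrative):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: ephemeral-storage-quota   # hypothetical name
  namespace: my-namespace         # placeholder namespace
spec:
  hard:
    requests.ephemeral-storage: "10Gi"  # sum of requests across all pods in the namespace
    limits.ephemeral-storage: "20Gi"    # sum of limits across all pods in the namespace
```

Once applied, it shows up in the output of kubectl describe namespace for comparison against actual usage.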

You need to deploy Prometheus and Grafana to find out how much memory and CPU are being consumed by the pod, and then set the requests and limits on that pod accordingly.
Requests and limits for ephemeral storage are a newer feature and still in beta, so you might have to wait a few more months to rely on them.
However, if you are on Kubernetes 1.18 you can test setting requests and limits for ephemeral storage.

Related

Kubernetes: Scheduling Pod without resource limits

Kubernetes: what happens when a pod has no resources limits / requests defined?
How much resources can a pod use in Kubernetes (GKE) when it has no (or only partial) resource limits/requests defined?
For example, I have a pod with only memory limits and memory requests, but it has no cpu specs.
Will the cpu available to this pod be:
0
as much as is left on the node/namespace (total minus all other pods' claims)
as much as possible, depending on actual use by other pods on the node/namespace
If you do not specify a CPU limit for a container, then one of these situations applies:
The container has no upper bound on the CPU resources it can use and can use all of the CPU available on the node where the pod is running. So in your case it will be the second option from your question: as much as is left on the node/namespace.
Alternatively, a cluster administrator normally defines limits for each namespace in the cluster; if the container runs in a namespace that has a default CPU limit, the container is automatically assigned that default.
A ResourceQuota should be defined for each namespace; it comes in handy to keep out pods that have no resource requests or limits and would eat up all the resources. It means a pod cannot be scheduled in that namespace until you specify its resource requirements, which is the recommended best practice.
For more information you could refer to this section : https://kubernetes.io/docs/tasks/configure-pod-container/assign-cpu-resource/#if-you-do-not-specify-a-cpu-limit
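The namespace default mentioned above can be sketched with a LimitRange (name, namespace and values are illustrative); containers created without a CPU spec then pick up these defaults:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: cpu-defaults        # hypothetical name
  namespace: my-namespace   # placeholder namespace
spec:
  limits:
  - type: Container
    defaultRequest:
      cpu: "250m"   # applied when a container sets no CPU request
    default:
      cpu: "500m"   # applied when a container sets no CPU limit
```

With this in place, a pod like the one in the question (memory specs only) would land in the second situation: it gets the namespace's default CPU limit rather than unbounded CPU.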

Kubernetes release requested cpu

We have a Java application distributed over multiple pods on Google Cloud Platform. We also set memory requests to give each pod a certain share of the memory available on the node for heap and non-heap space.
The application is very CPU-intensive while the pod is starting but barely uses the CPU after the pod is ready (only about 0.5% is used). If we use container resource requests, the pod does not release these resources after startup has finished.
Does Kubernetes allow specifying that a pod may use (nearly) all the CPU power available during startup and release those resources afterwards? Thanks to rolling updates we can ensure that two pods are not starting at the same time.
Thanks for your help.
If you specify requests without a limit, the value is used to schedule the pod onto a node that can satisfy the requested CPU bandwidth. The kernel scheduler assumes the request matches actual consumption but does not prevent usage beyond it; the excess is 'stolen' from other containers.
If you also specify a limit, your container gets throttled when it tries to exceed that value. You can combine both to allow bursting CPU usage above the usual request without allocating everything on the node and slowing down other processes.
"Does Kubernetes allow to specify that a pod is allowed to use (nearly) all the cpu power available during start and release those resources after that?"
A key word here is "available". The answer is "yes", and it can be achieved with the Burstable QoS (Quality of Service) class. Configure the CPU request to the value you expect the container will need after starting up, and either:
configure a CPU limit higher than the CPU request, or
don't configure a CPU limit, in which case either the namespace's default CPU limit applies if defined, or the container "...could use all of the CPU resources available on the Node where it is running".
If there is no CPU available on the node for bursting, the container won't get anything beyond the requested value, and as a result the application may start more slowly.
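A minimal sketch of such a Burstable pod (names and values are illustrative): the low request is what the scheduler reserves long-term, while the high limit permits the startup burst whenever the node has spare CPU:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: java-app   # hypothetical name
spec:
  containers:
  - name: app
    image: my-registry/java-app:latest   # placeholder image
    resources:
      requests:
        cpu: "100m"   # steady-state need after startup
      limits:
        cpu: "2"      # allows bursting during startup if the node has spare CPU
```

Because requests and limits differ, the pod falls into the Burstable QoS class automatically.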
It is worth mentioning what the docs explain for Pods with multiple Containers:
"The CPU request for a Pod is the sum of the CPU requests for all the Containers in the Pod. Likewise, the CPU limit for a Pod is the sum of the CPU limits for all the Containers in the Pod."
If running Kubernetes v1.12+ and have access to configure kubelet, the Node CPU Management Policies could be of interest.
One factor in scheduling pods onto nodes is resource availability, and the Kubernetes scheduler calculates used resources from the request value of each pod. If you do not assign any value to the request parameter, the request for this deployment will be zero. The request parameter doesn't guarantee that the pod will actually use that much CPU or RAM; you can get current resource usage from "kubectl top pods / nodes".
The request parameter reserves resources for a pod, whereas the limit puts a cap on a pod's resource usage.
You can get more information here: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/.
This should give you a rough idea of requests and limits.

Is Kubernetes ephemeral storage elastic?

In Kubernetes, pods have requests and limits for ephemeral storage. But it isn't clear whether this is an elastic resource, i.e. if you save a file inside a K8s pod and then delete it, does the ephemeral-storage usage go up and then back down again? Or once you have consumed any ephemeral storage, does that usage count towards the overall usage for the lifetime of the pod?
As you mentioned, Kubernetes ephemeral storage can be managed by setting requests and limits on the containers inside a Pod, which is described quite well in the official k8s documentation.
Although at the Pod level local storage consumption is the sum over all of its containers, the real capacity lives at the Node the Pod resides on. The Node allocates its compute resources among the Pods the K8s scheduler places on it; how resources are distributed between the Pods on a particular Node is nicely described in this example.
Generally, when a Pod is successfully assigned to a Node it gets an emptyDir as part of its local ephemeral storage, alongside container logs, image layers and the containers' writable layers; a container crash does not affect the parent Pod's emptyDir data.
Basically, ephemeral storage persists for the life of the Pod and counts resources shared by all containers in the Pod; the most common usage is storing temporary or cached data.
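For that temporary-data use case, a per-volume cap can also be sketched with an emptyDir sizeLimit (names and values are illustrative); the pod is evicted if the volume grows past the limit:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cache-pod   # hypothetical name
spec:
  containers:
  - name: app
    image: my-registry/my-app:latest   # placeholder image
    volumeMounts:
    - name: cache
      mountPath: /cache
  volumes:
  - name: cache
    emptyDir:
      sizeLimit: "500Mi"   # pod is evicted if the volume exceeds this
```

The emptyDir survives container restarts but is deleted together with the Pod, matching the lifetime described above.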

Kubernetes cluster seems to be unstable

Recently we've experienced issues with both Non-Production and Production clusters where the nodes encountered 'System OOM encountered' issue.
The pods don't seem to be spread across the nodes of the Non-Production cluster; it looks like a single node is running all the pods and taking the whole load.
Also, some Pods are stuck in the status 'Waiting: ContainerCreating'.
Any help/guidance with the above issues would be greatly appreciated. We are building more and more services in this cluster and want to make sure there's no instability and/or environment issues and place proper checks/configuration in place before we go live.
I would recommend managing container compute resources properly within your Kubernetes cluster. When creating a Pod, you can optionally specify how much CPU and memory (RAM) each container needs, to avoid OOM situations.
When containers have resource requests specified, the scheduler can make better decisions about which nodes to place Pods on, and when containers have limits specified, contention for resources on a node can be handled in a defined manner. CPU is specified in units of cores, and memory in units of bytes.
An event is produced each time scheduling fails; use the command below to see the events for a pod:
$ kubectl describe pod <pod-name>| grep Events
Also, read the official Kubernetes guide on “Configure Out Of Resource Handling”. Always make sure to:
reserve 10-20% of memory capacity for system daemons like kubelet and OS kernel
identify pods which can be evicted at 90-95% memory utilization to reduce thrashing and incidence of system OOM.
To facilitate this kind of scenario, the kubelet would be launched with options like below:
--eviction-hard=memory.available<xMi
--system-reserved=memory=yGi
Replacing x and y with actual memory values.
Having Heapster container monitoring in place should be helpful for visualization.
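The recommendation above amounts to something like this container spec (names and values are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: service-pod   # hypothetical name
spec:
  containers:
  - name: app
    image: my-registry/service:latest   # placeholder image
    resources:
      requests:
        memory: "256Mi"   # scheduler only places the pod on a node with this much free
        cpu: "250m"
      limits:
        memory: "512Mi"   # container is OOM-killed if it exceeds this
        cpu: "500m"
```

With requests set on every workload, the scheduler spreads pods by available capacity instead of piling them onto one node until it hits a system OOM.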
Unable to mount volumes for pod
"xxx-3615518044-6l1cf_xxx-qa(8a5d9893-230b-11e8-a943-000d3a35d8f4)": timeout expired waiting for volumes to attach/mount for pod "xxx-service-3615518044-6l1cf"/"xxx-qa"
That indicates your pod is having trouble mounting the volume specified in your configuration, which is often a permissions issue. If you post your config files (e.g. in a gist) with private info removed, we can probably be more helpful.

Why does a single node cluster only have a small percentage of the cpu quota available?

pod will not start due to "No nodes are available that match all of the following predicates:: Insufficient cpu"
In the above question, I had an issue starting a deployment with 3 containers.
Upon further investigation, it appears there is only 27% of the CPU quota available - which seems very low. The rest of the CPU seems to be assigned to some default bundled containers.
How is this normally mitigated? Is a larger node required? Do limits need to be set manually? Are all those additional containers necessary?
1 CPU for a single-node cluster is probably too small.
From the containers in the original answer, both the dashboard and fluentd can be removed:
the dashboard is just a web UI, which can go away if you use kubectl (which you should, IMO);
fluentd should be reading the log files on disk to ship them somewhere (GCP's log aggregation, I think).
The unnecessary containers should be tied to a Deployment or ReplicaSet, which you can list with kubectl get deployment and kubectl get rs respectively, and then kubectl delete them.
Increasing the resources on the node should not change the requirements of those basic pods, so all of the extra capacity should be free for scheduling your own workloads.