How is CPU usage calculated in Grafana?

Here's an image taken from Grafana which shows the CPU usage, requests and limits, as well as throttling, for a pod running in a K8s cluster.
As you can see, the CPU usage of the pod is very low compared to the requests and limits. However, there is around 25% of CPU throttling.
Here are my questions:
How is the CPU usage (yellow) calculated?
How is the ~25% CPU throttling (green) calculated?
How is it possible that the CPU throttles when the allocated resources for the pod are so much higher than the usage?
Extra info:
The yellow is showing container_cpu_usage_seconds_total.
The green is container_cpu_cfs_throttled_periods_total / container_cpu_cfs_periods_total
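Both metrics are cumulative counters from cAdvisor, so the panels are presumably built from rate/increase expressions rather than the raw series. A typical pair of queries (the 5m window and the pod selector here are illustrative assumptions, not the exact queries behind the dashboard above) would be:
rate(container_cpu_usage_seconds_total{pod="my-pod"}[5m])
increase(container_cpu_cfs_throttled_periods_total{pod="my-pod"}[5m]) / increase(container_cpu_cfs_periods_total{pod="my-pod"}[5m])
The first reads as the average number of cores in use over the window; the second is the fraction of CFS scheduling periods in which the container was throttled, which is how a figure like ~25% is produced.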

Related

Kubernetes - Sum of Pods CPU Utilization Limit

We have pods running with the same label in multiple namespaces. The combined CPU utilization of all these pods should not cross the licensed vCPU count, say X vCPUs.
We do not want to enforce this through SUM(pod CPU limit) < X vCPU, because we want to give the apps the flexibility to burst; the license constraint is on the sum of all pod utilization, not on limits. Limiting through the CPU limit reduces the number of pods that can be deployed, and all pods will not burst at the same time anyway.
How can we achieve this? Any insights would be helpful, thanks in advance.
Note: there are other applications running with different labels, which are not considered for core usage.
Regarding limiting through the CPU limit: when nodes start up, these apps burst at the same time and consume more vCPU, and we would like to keep the combined burst utilization of the pods below a specified limit.

Kubernetes pod resource limiting/quotas as a percentage of host resources (relative) rather than using absolute values?

Resource limiting of containers in pods is typically achieved using something like the following:
resources:
  limits:
    cpu: "600m"
  requests:
    cpu: "400m"
As you see, absolute values are used above.
Now,
If the server/host has, say, 1 core, then the total CPU computing power of the server is 1,000m, and the container is limited to 600m of computing power, which makes sense.
But if the server/host has, say, 16 cores, then the total CPU computing power of the server is 16,000m. The container is still restricted to 600m of computing power, which might not make complete sense in every case.
What I instead want is to define limits/requests as a percentage of host resources, something like the following:
resources:
  limits:
    cpu: "60%"
  requests:
    cpu: "40%"
Is this possible in k8s, either out of the box or using any CRDs?

Meaning of ADX Cache utilization more than 100%

We see the Cache utilization dashboard for an ADX cluster on the Azure portal, but at times I have noticed that this utilization shows up as more than 100%. I am trying to understand how to interpret it. Say, for example, cache utilization shows up as 250%: does it mean that 100% of the memory cache is utilized and then, beyond that, 150% of the disk cache is being utilized?
As explained in the documentation for the Cache Utilization metric:
[this is the] Percentage of allocated cache resources currently in use by the cluster.
Cache is the size of SSD allocated for user activity according to the defined cache policy.
An average cache utilization of 80% or less is a sustainable state for a cluster.
If the average cache utilization is above 80%, the cluster should be scaled up to a storage optimized pricing tier or scaled out to more instances. Alternatively, adapt the cache policy (fewer days in cache).
If cache utilization is over 100%, the size of the data to be cached, according to the caching policy, is larger than the total size of the cache on the cluster.
Utilization > 100% therefore means that there's not enough room in the (SSD) cache to hold all the data that the policy says should be cached; a reading of 250%, for example, means the policy designates roughly 2.5 times more data than the cache can actually hold. If auto-scale is enabled, the cluster will be scaled out as a result.
The cache applies an LRU eviction policy, so even when utilization exceeds 100%, query performance will be as good as possible (though, of course, if queries constantly reference more data than the cache can hold, some performance degradation will be observed).

What Kubernetes uses to calculate the CPU ratio, request or limit?

When you specify a Horizontal Pod Autoscaler in Kubernetes, for example with a targetCPUUtilizationPercentage of 50, what does Kubernetes use to calculate the CPU ratio, the request or the limit of the container?
So for example, with request=250 and limit=500, if you want to scale up when the pod is at half its limit:
If it used the request, I would set the target to at least 100%, since usage can rise to 200% of the request.
If it used the limit, I would use target = 50%, as 100% would mean the limit is reached.
A targetCPUUtilizationPercentage of 50 means that if the average CPU utilization across all Pods goes above 50%, the HPA scales the deployment up, and if it goes below 50%, the HPA scales the deployment down (provided the number of replicas is more than 1).
I just checked the code and found that the target utilization percentage calculation uses the resource request.
Refer to the code below:
currentUtilization = int32((metricsTotal * 100) / requestsTotal)
Here is the link:
https://github.com/kubernetes/kubernetes/blob/v1.9.0/pkg/controller/podautoscaler/metrics/utilization.go#L49
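To make that concrete, here is a minimal runnable sketch in Go. The numbers are hypothetical (a request of 250m with the pod measured at 125m of usage, matching the example in the question); it just mirrors the formula from the linked controller code:
package main

import "fmt"

func main() {
	// Hypothetical values: the pod's containers request 250m CPU in total
	// and are currently measured at 125m of usage (both in millicores).
	requestsTotal := int64(250)
	metricsTotal := int64(125)

	// Same formula as the linked HPA controller code.
	currentUtilization := int32((metricsTotal * 100) / requestsTotal)

	fmt.Printf("current utilization: %d%%\n", currentUtilization) // prints 50%
}
So with a targetCPUUtilizationPercentage of 50, scaling triggers when the pod sits at half of its request (125m), regardless of the 500m limit.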

Query on kubernetes metrics-server metrics values

I am using metrics-server (https://github.com/kubernetes-incubator/metrics-server/) to collect the core metrics from containers in a Kubernetes cluster.
I could fetch 2 resource usage metrics per container.
cpu usage
memory usage
However, it's not clear to me whether:
these metrics are accumulated over time, or whether they are already sampled over a particular time window (1 minute / 30 seconds...)
What are the units for the above metric values? For CPU usage, is it the number of cores or milliseconds? And for memory usage, I assume it's bytes.
While computing the CPU usage metric value, does metrics-server already take care of dividing the container usage by the host system usage?
Also, if I have to compare these metrics with the docker-api metrics, how do I compute the CPU usage % for a given container?
Thanks!
Metrics are scraped periodically from kubelets. The default resolution duration is 60s, which can be overridden with the --metric-resolution=<duration> flag.
The value and unit (cpu - cores in decimal SI, memory - bytes in binary SI) are arrived at by using the Quantity serializer in the k8s apimachinery package. You can read about it in the comments in the source code.
No, the CPU metric is not relative to the host system usage; as you can see, it is not a percentage value. It represents the rate of change of the total amount of CPU seconds consumed by the container, across all cores. If this value increases by 1 within one second, the pod is consuming 1 CPU core (or 1000 millicores) in that second.
To arrive at a relative value, depending on your use case, you can divide the CPU metric for a pod by that of the node, since metrics-server exposes both /pods and /nodes endpoints.
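As a rough illustration of those last two points, here is a small Go sketch (the sample values, the 60-second window, and the 4-core node are invented for the example) that turns two readings of a cumulative CPU-seconds counter into a core count and into a percentage of the node's capacity:
package main

import (
	"fmt"
	"time"
)

func main() {
	// Two hypothetical samples of a cumulative CPU-seconds counter, 60s apart.
	window := 60 * time.Second
	usage0 := 1234.0 // total CPU seconds consumed by the container at t0
	usage1 := 1252.0 // total CPU seconds consumed by the container at t0 + 60s

	// Rate of change = average number of cores in use over the window.
	cores := (usage1 - usage0) / window.Seconds() // 0.3 cores

	// Relative value against a node with, say, 4 allocatable cores.
	nodeCores := 4.0
	fmt.Printf("%.0f millicores (%.1f%% of the node)\n", cores*1000, cores/nodeCores*100)
}
The same arithmetic should work for the docker-api counters as well, as long as both readings are converted to the same cumulative CPU-time unit first.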