Kubernetes CPU in nanoseconds

I am trying to interpret the metrics fetched by Telegraf with the Kubernetes plugins on a k3s cluster. I see that the results report CPU in nanoseconds, and memory and disk in bytes. More importantly, I would like to understand how CPU usage reported in nanoseconds can be converted into a percentage.
Below is one such example capture:
kubernetes_pod_container,container_name=telegraf-ds,host=ah-ifc2,namespace=default,node_name=ah-ifc2,pod_name=telegraf-ds-dxdhz rootfs_available_bytes=73470144512i,logsfs_available_bytes=0i,logsfs_capacity_bytes=0i,cpu_usage_nanocores=243143i,memory_usage_bytes=0i,memory_working_set_bytes=25997312i,memory_major_page_faults=0i,rootfs_used_bytes=95850790i,logsfs_used_bytes=4096i,cpu_usage_core_nanoseconds=4301919390i,memory_rss_bytes=0i,memory_page_faults=0i,rootfs_capacity_bytes=196569534464i 1616950920000000000
Also, how does a visualization tool such as Chronograf or Grafana convert this raw data into a more actionable format, such as CPU % or memory/disk utilization %?
Thanks and any advice will help.

If you have a running total of the number of (nano)seconds, you can look at the derivative to figure out percentages.
Example:
At time 00:00:00 the cpu usage counter was at 1,000,000,000ns
At time 00:00:10 the cpu usage counter was at 3,000,000,000ns
From this information we can conclude that during the 10 seconds between 00:00:00 and 00:00:10 the process used the CPU for 3,000,000,000 - 1,000,000,000 = 2,000,000,000 nanoseconds.
In other words, it used the cpu for 2 seconds out of 10, giving us a cpu usage of 20%.
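A minimal sketch of that calculation in Python (cpu_percent is a made-up helper; the inputs are the sample values from the example above):

    def cpu_percent(prev_ns: int, curr_ns: int, interval_s: float) -> float:
        """CPU % over an interval, from a cumulative CPU-time counter in nanoseconds."""
        used_seconds = (curr_ns - prev_ns) / 1e9  # ns of CPU time -> seconds
        return 100.0 * used_seconds / interval_s

    # The example above: the counter goes from 1e9 ns to 3e9 ns over 10 s.
    print(cpu_percent(1_000_000_000, 3_000_000_000, 10))  # 20.0

Visualization tools do the equivalent automatically; in InfluxQL, for instance, non_negative_derivative() applied to cpu_usage_core_nanoseconds gives the per-interval rate, which you can then scale to a percentage.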

Related

Understand CPU utilisation with image preprocessing applications

I'm trying to understand how to compute the CPU utilisation for audio and video use cases.
In real time audio applications, this is what I typically do:
If an application takes 4 ms to process 28 ms of audio data, I say that the CPU utilisation is 14.28% (4/28).
How should this be done for applications like resize/crop? Let's say I'm resizing an image from 162x122 to 128x128 at 1 FPS, and it takes 11 ms. What would be the CPU utilisation?
CPU utilization is quite complicated, and strongly depends on stuff like:
The CPU itself
The algorithms utilized for the task
Other tasks running on the CPU at the same time
CPU utilization is also strongly related to the process scheduling of your PC, and hence to the operating system used, so most operating systems will expose some kind of API for CPU utilization diagnostics, but such APIs are highly platform-dependent.
But how does CPU utilization calculations work anyway?
The simplest way to calculate CPU utilization is to take a (for example) 1-second period, observe how long the CPU was busy (executing processes, as opposed to idling) during it, and divide that by the length of the interval. For example, if the CPU did useful calculations for 10 milliseconds and you were observing for 500 ms, that would mean the CPU utilization was 2%.
Answering your question / TL; DR
You can apply this principle in your program. For the case you provided (processing video), this could be done in more or less the same way: you measure how long it takes to process one frame, and divide that by the length of a frame (1 / FPS). Of course, this could be done over a longer period of time to get a more accurate reading: track how much time it takes to process, for example, 2 seconds of video, and divide that by 2. Then you'll have your CPU utilization.
NOTE: if you aren't able to process a frame in time, for example if your video is 10 FPS (100 ms per frame) and processing one frame takes 500 ms, then your CPU utilization will seemingly be 500%, but obviously you can't utilize more than 100% of your CPU, so you should just cap the CPU utilization at 100%.
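A small sketch of this arithmetic in Python (frame_cpu_percent is a made-up helper; the inputs are the numbers from this thread):

    def frame_cpu_percent(processing_ms: float, fps: float) -> float:
        """Utilization = processing time / frame period, capped at 100%."""
        frame_period_ms = 1000.0 / fps
        return min(100.0, 100.0 * processing_ms / frame_period_ms)

    print(frame_cpu_percent(11, 1))         # 1.1   -> 11 ms of work per 1000 ms frame
    print(frame_cpu_percent(4, 1000 / 28))  # ~14.3 -> the audio case: 4 ms per 28 ms buffer
    print(frame_cpu_percent(500, 10))       # 100.0 -> 500 ms of work per 100 ms frame, capped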

How are CPU resource units (millicore/millicpu) calculated under the hood?

Let's take this processor as an example: a CPU with 2 cores and 4 threads (2 threads per core).
From what I've read, such a CPU has 2 physical cores but can process 4 threads simultaneously through hyper threading. But, in reality, one physical core can only truly run one thread at a time, but using hyper threading, the CPU exploits the idle stages in the pipeline to process another thread.
Now, here are Kubernetes with Prometheus and Grafana and their CPU resource unit of measurement: the millicore/millicpu. So, they virtually slice a core into 1000 millicores.
Taking into account the hyper threading, I can't understand how they calculate those millicores under the hood.
How can a process, for example, use 100millicore (10th part of the core)? How is this technically possible?
PS: I accidentally found a really descriptive explanation here: Multi threading with Millicores in Kubernetes
This gets very complicated. Kubernetes doesn't actually manage this itself; it just provides a layer on top of the underlying container runtime (Docker, containerd, etc.). When you configure a container to use 100 millicores, k8s hands that down to the underlying container runtime, and the runtime deals with it. At that point you have to start looking at the Linux kernel and how it does CPU scheduling and rate-limiting with cgroups, which becomes incredibly interesting and complicated. In a nutshell, though: the Linux CFS Bandwidth Control is the thing that manages how much CPU a process (container) can use. By setting the quota and period parameters of the scheduler, you control how much CPU is used, by controlling how long a process can run before being paused and how often it runs. As you correctly identify, you can't use only a tenth of a core; but you can use only a tenth of the time, and by doing that you use only a tenth of the core over time.
For example:
If I set quota to 250ms and period to 250ms, that tells the kernel that this cgroup can use 250ms of CPU cycle time every 250ms, which means it can use 100% of the CPU.
If I set quota to 500ms and keep the period at 250ms, that tells the kernel that this cgroup can use 500ms of CPU cycle time every 250ms, which means it can use 200% of the CPU (2 cores).
If I set quota to 125ms and keep the period at 250ms, that tells the kernel that this cgroup can use 125ms of CPU cycle time every 250ms, which means it can use 50% of the CPU.
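A minimal sketch of the quota arithmetic in Python (cfs_quota_us is a made-up helper; the 100 ms default period is the one Kubernetes configures for the CFS):

    def cfs_quota_us(millicores: int, period_us: int = 100_000) -> int:
        """CFS runtime budget per period for a given millicore limit.

        Kubernetes uses a 100 ms CFS period by default, so 100 millicores
        becomes 10 ms of runtime per 100 ms period: a tenth of the time,
        not a tenth of the core at any instant.
        """
        return period_us * millicores // 1000

    print(cfs_quota_us(100))   # 10000  -> 10 ms every 100 ms: 10% of one core over time
    print(cfs_quota_us(2000))  # 200000 -> 200 ms every 100 ms: two full cores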
This is a very brief explanation. Here is some further reading:
https://blog.krybot.com/a?ID=00750-cfae57ed-c7dd-45a2-9dfa-09d42b7bd2d7
https://www.kernel.org/doc/html/latest/scheduler/sched-bwc.html

Query on kubernetes metrics-server metrics values

I am using metrics-server (https://github.com/kubernetes-incubator/metrics-server/) to collect the core metrics from containers in a Kubernetes cluster.
I could fetch 2 resource usage metrics per container.
cpu usage
memory usage
However, it's not clear to me whether:
these metrics are accumulated over time, or are already sampled over a particular time window (1 minute, 30 seconds, ...)
What are the units for the above metric values? For CPU usage, is it the number of cores or milliseconds? And for memory usage I assume it's bytes.
While computing the CPU usage metric, does metrics-server already take care of dividing the container usage by the host system usage?
Also, if I have to compare these metrics with the docker-api metrics, how do I compute the CPU usage % for a given container?
Thanks!
Metrics are scraped periodically from kubelets. The default resolution duration is 60s, which can be overridden with the --metric-resolution=<duration> flag.
The value and unit (cpu: cores in decimal SI, memory: bytes in binary SI) are arrived at using the Quantity serializer in the k8s apimachinery package. You can read about it in the comments in the source code.
No, the CPU metric is not relative to the host system's usage, as you can see from the fact that it's not a percentage value. It represents the rate of change of the total CPU seconds consumed by the container. If the underlying counter increases by 1 CPU-second within one second of wall time, the pod is consuming 1 CPU core (or 1000 millicores) during that second.
To arrive at a relative value, depending on your use case, you can divide the CPU metric for a pod by that for the node, since metrics-server exposes both /pods and /nodes endpoints.
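A hedged sketch of that division in Python (it assumes kubectl proxy is serving the API on localhost:8001, uses the v1beta1 Metrics API paths, and for simplicity compares the first pod returned against the first node; parse_cpu is a made-up helper):

    import requests

    BASE = "http://localhost:8001/apis/metrics.k8s.io/v1beta1"

    def parse_cpu(quantity: str) -> float:
        """Convert a CPU Quantity like '1500000n', '250m', or '2' to cores."""
        suffixes = {"n": 1e-9, "u": 1e-6, "m": 1e-3}
        if quantity[-1] in suffixes:
            return int(quantity[:-1]) * suffixes[quantity[-1]]
        return float(quantity)

    # First node's current CPU usage, in cores.
    node = requests.get(f"{BASE}/nodes").json()["items"][0]
    node_cores = parse_cpu(node["usage"]["cpu"])

    # First pod in the default namespace: sum usage over its containers.
    pod = requests.get(f"{BASE}/namespaces/default/pods").json()["items"][0]
    pod_cores = sum(parse_cpu(c["usage"]["cpu"]) for c in pod["containers"])

    print(f"{pod['metadata']['name']} uses {100 * pod_cores / node_cores:.1f}% "
          f"of its node's measured CPU usage")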

Amazon RDS Strange metrics CPU Credit Balance

We have an RDS PostgreSQL instance of type db.t2.small, and we are seeing something strange with the CPU credit balance metrics.
CPU credit usage is not growing, but the balance has dropped to zero. Does anybody know what the problem could be? (The RDS instance is working fine, without any problems.)
I am seeing the same behavior with my t2.micro free-tier RDS instance. My hypothesis right now is that the service window is when the instance gets rebooted or hot-swapped, resulting in a new instance with the default baseline number of credits. This makes Saturday night more appealing than Sunday night for that window, to be sure the credits re-accumulate by the next business day.
From the documentation, it looks like CPU credits expire 24 hours after being earned.
CPUCreditUsage
[T2 instances] The number of CPU credits consumed by the instance. One CPU credit equals one vCPU running at 100% utilization for one minute or an equivalent combination of vCPUs, utilization, and time (for example, one vCPU running at 50% utilization for two minutes or two vCPUs running at 25% utilization for two minutes).
CPU credit metrics are available only at a 5 minute frequency. If you specify a period greater than five minutes, use the Sum statistic instead of the Average statistic.
Units: Count
CPUCreditBalance
[T2 instances] The number of CPU credits available for the instance to burst beyond its base CPU utilization. Credits are stored in the credit balance after they are earned and removed from the credit balance after they expire. Credits expire 24 hours after they are earned.
CPU credit metrics are available only at a 5 minute frequency.
Units: Count
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/rds-metricscollected.html
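A back-of-the-envelope sketch of that credit arithmetic in Python (the helper is mine, and the 12 credits/hour earn rate is the documented rate for a t2.small; treat it as an example value):

    def net_credits_per_hour(avg_util_pct: float, vcpus: int = 1,
                             earn_rate: float = 12.0) -> float:
        """Net credit change per hour: credits earned minus credits burned.

        One credit = one vCPU-minute at 100% utilization, so an hour at
        avg_util_pct burns avg_util_pct / 100 * vcpus * 60 credits.
        The default earn_rate of 12/hour is the t2.small rate (example value).
        """
        burned = avg_util_pct / 100.0 * vcpus * 60.0
        return earn_rate - burned

    print(net_credits_per_hour(20))  #  0.0  -> break-even: the instance's baseline
    print(net_credits_per_hour(50))  # -18.0 -> the balance drains steadily even
                                     #          while CPUCreditUsage looks flat

This is consistent with the behavior in the question: a steady, modest CPUCreditUsage per 5-minute period can still drain CPUCreditBalance to zero whenever the usage rate sits above the earn rate, and expiring credits only accelerate the drop.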

Xcode Instruments CPU time

If I run an application with the performance test, the "CPU Monitor" shows me some information like process ID/name or CPU time. But in which unit of time does it measure?
An example: if I get 05.04, what does that mean?
Best Regards
Plagiarized from http://en.wikipedia.org/wiki/CPU_time -
CPU time (or CPU usage, process time) is the amount of time for which a central processing unit (CPU) was used for processing instructions of a computer program, as opposed to, for example, waiting for input/output (I/O) operations.
The CPU time is often measured in clock ticks or seconds. CPU time is also mentioned as percentage of the CPU's capacity at any given time on multi-tasking environment. That helps in figuring out how a CPU’s computational power is being shared among multiple computer programs.
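To make that distinction concrete, here is a small Python sketch contrasting CPU time with wall-clock time (sleeping consumes wall time but almost no CPU time):

    import time

    wall_start = time.perf_counter()   # wall-clock timer
    cpu_start = time.process_time()    # CPU-time timer for this process

    time.sleep(1.0)                    # waiting: wall clock advances, CPU time barely does
    sum(i * i for i in range(10**6))   # computing: both timers advance

    print(f"wall time: {time.perf_counter() - wall_start:.2f} s")
    print(f"cpu time:  {time.process_time() - cpu_start:.2f} s")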