Calculate value in Group By statement - Grafana

Use case:
I have 10 Kubernetes nodes (consider them VMs), each with between 7 and 14 allocatable CPU cores that can be requested by Kubernetes pods. I'd therefore like to show a table with:
Allocatable CPU cores
The requested CPU cores
The ratio of requested / allocatable CPU cores
grouped by node.
The problem
Creating the table for the first two requirements was easy. I simply created a table in Grafana and added these two metrics:
sum(kube_pod_container_resource_requests_cpu_cores) by (node)
sum(kube_node_status_allocatable_cpu_cores) by (node)
However, I was struggling with the third one. I tried this query, but apparently it didn't return any data:
sum(kube_pod_container_resource_requests_cpu_cores / kube_node_status_allocatable_cpu_cores) by (node)
Question
How can I achieve a calculation of two different metrics in a group by statement in my given example?

The issue here is that the two metrics have different label sets, so you need to aggregate away the extra labels before dividing:
sum by (node)(kube_pod_container_resource_requests_cpu_cores)
/
sum by (node)(kube_node_status_allocatable_cpu_cores)
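To see why the original query returns nothing, here is a small Python sketch (not PromQL itself) of how PromQL binary operators match samples: they pair only series whose full label sets are identical. All sample values and label names below are hypothetical.

```python
# Hypothetical samples of kube_pod_container_resource_requests_cpu_cores,
# keyed by their label set (node, pod):
requests = {
    ("node-1", "pod-a"): 0.5,
    ("node-1", "pod-b"): 1.0,
}
# Hypothetical samples of kube_node_status_allocatable_cpu_cores,
# which only carry a (node,) label set:
allocatable = {
    ("node-1",): 8.0,
}

# Direct division: label sets (node, pod) vs. (node,) never match,
# so the result is empty -- like the failing query.
matched = {k: v / allocatable[k] for k, v in requests.items() if k in allocatable}
print(matched)  # -> {}

# Aggregating both sides down to (node,) first makes them match:
req_by_node = {}
for (node, _pod), v in requests.items():
    req_by_node[(node,)] = req_by_node.get((node,), 0.0) + v
ratio = {k: req_by_node[k] / allocatable[k] for k in req_by_node if k in allocatable}
print(ratio)  # -> {('node-1',): 0.1875}
```

This is exactly what the sum by (node)(...) on each side of the division accomplishes.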

Related

Average number of cores and memory used per namespace in K8s?

I need to calculate the average number of cores and the memory used over a period of time (say, a month) for a particular namespace in K8s. How can we go about doing this?
We want to calculate the cost for each namespace. We tried the Kubecost tool in AKS, but it didn't match the cost shown on the Azure Cost dashboard; in fact, it was way more than the actual cost.

Erasure Coded Pool suggested PG count

I'm messing around with a PG calculator to figure out the best PG count for my cluster. I have an erasure coded FS pool which will most likely use half the space of the cluster in the foreseeable future. But the PG calculator only has options for replicated pools. Should I just enter the erasure-code ratio for the replica #, or is there another way around this?
From Ceph Nautilus onwards there's a pg-autoscaler that does the scaling for you; you just need to create the pool with an initial (possibly low) value. As for the calculation itself, your assumption is correct: you take the number of chunks into account when planning the PG count.
From the Red Hat docs:
3.3.4. Calculating PG Count
If you have more than 50 OSDs, we recommend approximately 50-100 placement groups per OSD to balance out resource usage, data durability and distribution. If you have less than 50 OSDs, choosing among the PG Count for Small Clusters is ideal. For a single pool of objects, you can use the following formula to get a baseline:
Total PGs = (OSDs * 100) / pool size
Where pool size is either the number of replicas for replicated pools or the K+M sum for erasure coded pools (as returned by ceph osd erasure-code-profile get).
You should then check if the result makes sense with the way you designed your Ceph cluster to maximize data durability, data distribution and minimize resource usage.
The result should be rounded up to the nearest power of two. Rounding up is optional, but recommended for CRUSH to evenly balance the number of objects among placement groups.
For a cluster with 200 OSDs and a pool size of 3 replicas, you would estimate your number of PGs as follows:
(200 * 100) / 3 = 6667. Nearest power of 2: 8192
With 8192 placement groups distributed across 200 OSDs, that evaluates to approximately 41 placement groups per OSD. You also need to consider the number of pools you are likely to use in your cluster, since each pool will create placement groups too. Ensure that you have a reasonable maximum PG count.
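The formula and rounding rule above can be sketched in a few lines of Python. The replicated case reproduces the worked example from the docs; the erasure coded pool (k=4, m=2) is a hypothetical illustration of using K+M as the pool size.

```python
import math

def suggested_pg_count(osds: int, pool_size: int) -> int:
    """Baseline PG count per the Red Hat formula, rounded up to the
    nearest power of two. pool_size is the replica count for replicated
    pools, or the K+M sum for erasure coded pools."""
    baseline = osds * 100 / pool_size
    return 2 ** math.ceil(math.log2(baseline))

# The worked example from the docs: 200 OSDs, pool size of 3 replicas.
print(suggested_pg_count(200, 3))      # -> 8192

# Hypothetical erasure coded pool with k=4, m=2, so pool_size = 6:
print(suggested_pg_count(200, 4 + 2))  # -> 4096
```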

How to monitor a Windows machine in Grafana using Prometheus?

I am monitoring a Windows machine and I installed the WMI exporter on it. I am using Prometheus and Grafana as monitoring tools. Which query should I use to monitor the CPU status of my Windows machine?
This gets you the percentage of CPU use:
100 - (avg by (instance) (irate(wmi_cpu_time_total{mode="idle", instance=~"$server.*"}[1m])) * 100)
I don't have a WMI exporter running, but according to its documentation something like this should work with a stacked graph:
sum by(mode) (rate(wmi_cpu_time_total[5m]))
You can add labels to the metric to filter by instance / job / whatever and you can tweak the range that you compute the rate over (e.g. 1m for less smoothing; 1h over longer ranges of time; or Grafana's $__interval for dashboard range + screen resolution dependent graphing).
Edit: the query above would give you CPU usage in absolute terms, i.e. if your machine had 4 cores, the stacked graph would add up to (approximately) 4 or 400%. If you want it to instead add up to exactly 100% you should use something like this (not tested):
sum by(mode) (rate(wmi_cpu_time_total[5m]))
/
scalar(sum(rate(wmi_cpu_time_total[5m])))
All it does is divide each per-mode value by the sum across all modes, so the results will always add up to 1. All you need to do in Grafana is set the unit of measurement to "percentage (0-1)".
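A quick Python sketch of what that normalization does. The per-mode rates below are hypothetical CPU-seconds-per-second values for a 4-core machine; dividing each by the total yields fractions that sum to 1, which is what makes the stacked graph top out at 100%.

```python
# Hypothetical per-mode rates (CPU seconds consumed per second),
# as sum by(mode) (rate(wmi_cpu_time_total[5m])) might return them:
rates = {"idle": 3.2, "user": 0.5, "system": 0.2, "interrupt": 0.1}

total = sum(rates.values())  # on a 4-core machine this is ~4.0
normalized = {mode: r / total for mode, r in rates.items()}

print(round(total, 6))                     # -> 4.0
print(round(sum(normalized.values()), 6))  # -> 1.0, i.e. "percentage (0-1)"
```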

Query on kubernetes metrics-server metrics values

I am using metrics-server (https://github.com/kubernetes-incubator/metrics-server/) to collect the core metrics from containers in a Kubernetes cluster.
I can fetch two resource usage metrics per container:
cpu usage
memory usage
However, it's not clear to me whether:
these metrics are accumulated over time, or already sampled over a particular time window (1 minute / 30 seconds...)
What are the units for the above metric values? For CPU usage, is it the number of cores or milliseconds? For memory usage I assume it's bytes.
While computing CPU usage metric value, does metrics-server already take care of dividing the container usage by the host system usage?
Also, if I have to compare these metrics with the docker-api metrics, how do I compute the CPU usage % for a given container?
Thanks!
Metrics are scraped periodically from kubelets. The default resolution duration is 60s, which can be overridden with the --metric-resolution=<duration> flag.
The value and unit (CPU: cores in decimal SI; memory: bytes in binary SI) are arrived at by using the Quantity serializer in the k8s apimachinery package. You can read about it in the comments in the source code.
No, the CPU metric is not relative to the host system usage, as you can see from the fact that it's not a percentage value. It represents the rate of change of the total amount of CPU seconds consumed by the container across all cores. If this value increases by 1 within one second, the pod consumes 1 CPU core (or 1000 millicores) during that second.
To arrive at a relative value, depending on your use case, you can divide the CPU metric for a pod by that for the node, since metrics-server exposes both /pods and /nodes endpoints.
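The rate calculation described above can be sketched as follows. The timestamps, cumulative CPU-seconds values, and node-level figure are all hypothetical; the shape of the computation is what matters.

```python
# Two hypothetical samples of a container's cumulative CPU seconds,
# taken 60 seconds apart (the default metrics-server resolution):
t0, cpu0 = 100.0, 250.0  # (timestamp in s, cumulative CPU seconds)
t1, cpu1 = 160.0, 280.0

# Rate of change of CPU seconds = cores in use over the window.
cores_used = (cpu1 - cpu0) / (t1 - t0)
print(cores_used)  # -> 0.5, i.e. 500 millicores

# For a relative value, divide the pod's rate by the node's rate
# (metrics-server exposes both /pods and /nodes endpoints).
node_cores_used = 4.0  # hypothetical node-level value
print(cores_used / node_cores_used)  # -> 0.125
```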

Resource Allocation in Kubernetes: How are pods scheduled?

In Kubernetes, the role of the scheduler is to find a suitable node for each pod. After a pod is assigned to a node, it competes with the other pods on that node for resources. How does Kubernetes allocate resources in this competitive situation? Is there any source code in Kubernetes for computing resource allocation?
I suppose you can take a look at the below articles to see if that answers your query
https://github.com/kubernetes/community/blob/master/contributors/devel/sig-scheduling/scheduler_algorithm.md#ranking-the-nodes
https://jvns.ca/blog/2017/07/27/how-does-the-kubernetes-scheduler-work/
The filtered nodes are considered suitable to host the Pod, and often more than one node remains. Kubernetes prioritizes the remaining nodes to find the "best" one for the Pod. The prioritization is performed by a set of priority functions. For each remaining node, a priority function gives a score on a scale of 0-10, with 10 representing "most preferred" and 0 "least preferred". Each priority function is weighted by a positive number, and the final score of each node is calculated by adding up all the weighted scores. For example, suppose there are two priority functions, priorityFunc1 and priorityFunc2, with weighting factors weight1 and weight2 respectively; the final score of some NodeA is:
finalScoreNodeA = (weight1 * priorityFunc1) + (weight2 * priorityFunc2)
After the scores of all nodes are calculated, the node with the highest score is chosen as the host of the Pod. If more than one node has the equal highest score, a random one among them is chosen.
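The weighted-sum formula above can be sketched in a few lines. The per-function scores and weights for NodeA below are made up for illustration:

```python
def final_score(scores_and_weights):
    """Weighted sum of per-priority-function scores (each in 0-10)."""
    return sum(weight * score for score, weight in scores_and_weights)

# Two priority functions for NodeA, as in the example:
node_a = [(7, 1.0),   # priorityFunc1 scored 7, weight1 = 1.0
          (4, 2.0)]   # priorityFunc2 scored 4, weight2 = 2.0
print(final_score(node_a))  # -> 15.0
```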
Currently, Kubernetes scheduler provides some practical priority functions, including:
LeastRequestedPriority: The node is prioritized based on the fraction of the node that would be free if the new Pod were scheduled onto the node. (In other words, (capacity - sum of requests of all Pods already on the node - request of Pod that is being scheduled) / capacity). CPU and memory are equally weighted. The node with the highest free fraction is the most preferred. Note that this priority function has the effect of spreading Pods across the nodes with respect to resource consumption.
CalculateNodeLabelPriority: Prefer nodes that have the specified label.
BalancedResourceAllocation: This priority function tries to put the Pod on a node such that the CPU and Memory utilization rate is balanced after the Pod is deployed.
CalculateSpreadPriority: Spread Pods by minimizing the number of Pods belonging to the same service on the same node. If zone information is present on the nodes, the priority will be adjusted so that pods are spread across zones and nodes.
CalculateAntiAffinityPriority: Spread Pods by minimizing the number of Pods belonging to the same service on nodes with the same value for a particular label.
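As a concrete illustration of the first of these, the LeastRequestedPriority fraction can be sketched as below. The capacities and requests are hypothetical; CPU and memory scores are averaged since they are equally weighted.

```python
def least_requested_score(capacity, requested_on_node, pod_request):
    """Fraction of the node that would remain free after scheduling,
    scaled to the 0-10 score range used by priority functions."""
    free = capacity - requested_on_node - pod_request
    return 10 * free / capacity

# Hypothetical node: 8 cores (3 already requested, new pod wants 1),
# 32 GiB memory (8 already requested, new pod wants 4).
cpu_score = least_requested_score(8.0, 3.0, 1.0)
mem_score = least_requested_score(32.0, 8.0, 4.0)
print((cpu_score + mem_score) / 2)  # -> 5.625
```

The node with the highest combined free fraction scores best, which is what spreads Pods across nodes by resource consumption.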