No CPU metrics from running pods on Stackdriver - Kubernetes

Hi, I'm trying to set up Stackdriver to monitor my containers, but the CPU metrics don't seem to work. I'm working with the following versions:
Master Version 1.2.5
Node Version 1.2.4
heapster-v1.0.2-594732231-sil32
This is a group I created for the databases (it also happens for the WildFly and modcluster pods). I also have a couple of other questions:
Is it possible to monitor Postgres, or do I have to install the agent on the Docker image?
Can I monitor the images on Kubernetes, or the disks on Google Cloud?

Do your containers have CPU limits specified on them? The CPU Usage graph on that page is supposed to show utilization, which is defined as cores used / cores reserved. If a container hasn't specified a maximum number of cores, then it won't have a utilization either, as mentioned in the description of the CPU utilization metric.
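For reference, here is a minimal sketch of a pod spec with a CPU limit declared (the name, image, and values are placeholders, not taken from the question); with a limit set, the graph has a "reserved" figure to divide usage by:

    apiVersion: v1
    kind: Pod
    metadata:
      name: postgres-example      # placeholder name
    spec:
      containers:
      - name: postgres
        image: postgres:9.5       # placeholder image
        resources:
          requests:
            cpu: "250m"           # a quarter of a core requested for scheduling
          limits:
            cpu: "500m"           # half a core reserved; utilization is usage / this value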

Related

Why is the CPU usage of a GKE Workload not equal to the sum of the CPU usage of its pods?

I'm trying to figure out why a GKE "Workload" CPU usage is not equivalent to the sum of the CPU usage of its pods.
The following image shows the Workload's CPU usage.
Service Workload CPU Usage
The following images show the CPU usage of the pods belonging to the above Workload.
Pod #1 CPU Usage
Pod #2 CPU Usage
For example, at 9:45 the Workload's CPU usage was around 3.7 cores, but at the same time Pod #1's CPU usage was around 0.9 cores and Pod #2's CPU usage was also around 0.9 cores. That means the Workload's CPU usage should have been around 1.8 cores, but it wasn't.
Does anyone have an idea of this behavior?
Thanks.
On your VM (the node managed by Kubernetes), you have the pods that you deploy and manage, but also several services that run on it for supervision, management, log ingestion, and so on. There is a basic description here.
You can see all these system services by running the command kubectl get all --namespace kube-system.
If you have installed additional components, like Istio or Knative, you have additional services and namespaces. All of these consume a share of the node's resources.
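To see this for yourself (generic kubectl commands; the node name is a placeholder):

    # System pods that also run on your nodes and consume their resources
    kubectl get pods --namespace kube-system

    # Per-node summary; the "Allocated resources" section counts requests
    # from kube-system pods as well as your own
    kubectl describe node <node-name>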
Danny,
The CPU chart on the Workloads page is an aggregate of CPU usage for managed pods. The values are taken from the Stackdriver Monitoring metric container/cpu/usage_time, check this link. That metric represents "Cumulative CPU usage on all cores in seconds. This number divided by the elapsed time represents usage as a number of cores, regardless of any core limit that might be set."
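For illustration only (reusing the 3.7-core figure from the question): if a workload's cumulative usage_time grows by 222 seconds over a 60-second window, that works out to 222 / 60 = 3.7 cores for that interval, regardless of what the individual containers have requested or are limited to.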
Please let me know if you have further questions in regard to this.
I suspect this is a bug in the UI. There is no actual metric for deployment CPU usage. Stackdriver Monitoring only collects container-, pod-, and node-level metrics, so the only really reliable metrics in this case are the ones for pod CPU usage.
The graph for the total deployment CPU usage is likely meant to be a sum of all the pods metrics calculated and then presented to you. It is not as reliable as the pod or container metrics since it is not a direct metric.
If you are seeing this discrepancy consistently, I recommend opening a UI bug report through the Google Public Issue Tracker to report this to the GCP Engineers.

Monitoring Rancher containers by hosts through Prometheus cAdvisor NodeExporter

I have a setup where I manage to monitor every container of my Rancher 1.6 environment with a Prometheus (2.4.3)/Grafana stack (with cAdvisor v0.27.4 and NodeExporter v0.16.0).
Here is my issue: I can monitor every container's consumption, but I can't relate a container's consumption to its host.
For example, if I want to show information about CPU usage, I use container_cpu_user_seconds_total from cAdvisor, which gives the container's CPU usage as a percentage relative to its host, but I can't tell which host is concerned (I have 4 hosts in this environment), so the cumulative CPU consumption tends to go over 100%.
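Ideally I'd like to end up with queries along these lines (a rough sketch, assuming the scrape targets keep their default instance label and cAdvisor's name label):

    # CPU used per container, in cores, keeping the host (scrape target) label
    sum by (instance, name) (rate(container_cpu_user_seconds_total[5m]))

    # Total CPU used per host, to compare against that host's core count
    sum by (instance) (rate(container_cpu_user_seconds_total[5m]))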
I would like to show charts by host (I saw I could create dynamic charts in Grafana, but it seems a bit hard, so manually creating them doesn't bother me).
Should I try to create my own metrics in the prom-conf file? That seems a bit overkill for something like this.
I find it very strange that this information only seems to interest me, which is why I'm asking here.
I'm new to all of these tools.
Thank you in advance.

Kubernetes Cluster with different CPU configuration

I have created a K8s cluster of 10 machines with different CPU and memory configurations (4 cores / 32 GB, 4 cores / 8 GB). Now when I deploy any application on the cluster, it creates the pods in a random manner; it does not place them on the basis of memory or load.
How does the Kubernetes master distribute the pods in the cluster? I am not getting any significant answers. How can I configure the cluster for the best use of resources?
Kubernetes uses a scheduler for deciding which pod is started on which node. One improvement is to tell the scheduler what your pods need as minimum and maximum resources.
Resources are memory (measured in bytes), CPU (measured in CPU units), and ephemeral storage for things like emptyDir (since 1.11). When you provide this information for your deployments, Kubernetes can make better decisions about where to run them.
Without this information, an nginx pod will be scheduled the same way as any heavy Java application.
The limits and requests config is described here. Setting both requests and limits is a good idea to make scheduling easier and to avoid pods running amok and using all of a node's resources.
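A minimal sketch of such a deployment (the name, image, and values are placeholders):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-app                # placeholder
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: my-app
      template:
        metadata:
          labels:
            app: my-app
        spec:
          containers:
          - name: my-app
            image: my-app:1.0     # placeholder image
            resources:
              requests:           # what the scheduler uses to pick a node
                cpu: "250m"
                memory: "256Mi"
              limits:             # hard cap enforced on the node
                cpu: "500m"
                memory: "512Mi"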
If this is not enough, there is also the possibility to add a custom scheduler, which is explained in this documentation.

High CPU cores usage in Kubernetes cluster (84%)

I've set up Prometheus and Grafana for tracking and monitoring of my Kubernetes cluster.
I've set up 3 Nodes for my cluster.
I have 26 pods running (mostly in the monitoring namespace).
I have one major Node app (deployment) running and right now there isn't any load.
I'm trying to understand these graph metrics. However, I can't understand why there's such high CPU core usage despite there being no load on the app.
Here's a Grafana screenshot
The 24% memory usage I can understand, as there are Kubernetes processes (kube-system etc.) running as well.
It's also telling me my cluster can support 330 pods (currently 26). I'm only worried about the high CPU core usage. Can anybody explain it?
82% is not the CPU usage of the processes but the ratio of requested to allocatable resources (2.31 / 2.82 = 0.819 --> ~82%).
This means that out of your 2.82 available (allocatable CPUs), you requested (allocated) about 82% for pods in the monitoring namespace but that does not mean they actually use that much CPU.
To see actual CPU usage, look at metrics like container_cpu_usage_seconds_total (per container CPU usage) or maybe even process_cpu_seconds_total (per process CPU usage).
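For example, roughly like this (standard kubelet/cAdvisor metric names and labels; adjust the selectors to your setup):

    # Actual CPU consumed across the cluster, in cores (not the request ratio)
    sum(rate(container_cpu_usage_seconds_total{image!=""}[5m]))

    # Broken down per namespace, e.g. to see what the monitoring stack really uses
    sum by (namespace) (rate(container_cpu_usage_seconds_total{image!=""}[5m]))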

Kubernetes Heapster not working correctly

I set up kubernetes-dashboard and Heapster according to the docs on GitHub. My Kubernetes client and server versions are both 1.5.4, and the whole cluster is deployed on 10 physical servers running Ubuntu 14.04. I used Heapster 1.3.
I can access the dashboard and see some figures about CPU and memory usage, but I do not know why some pods do not have such figures, i.e., the CPU and memory usage of some pods is not displayed as charts. The two pictures below are examples.
dashboard display 1
dashboard display 2
Also, I find that sometimes a pod only displays memory usage, without CPU usage.
I checked cAdvisor on each node by accessing its web UI, and it works quite well on every node. This problem has troubled me for several days. Can someone help me figure out this issue or give me some hints? I'd be very grateful.