How can I retrieve the memory utilization of a pod in kubernetes via kubectl?

Inside a namespace, I have created a pod whose spec includes memory limit and memory request parameters. Once it is up and running, I would like to know how I can get the memory utilization of the pod in order to figure out whether it is within the specified limit or not. The "kubectl top" command returns a services-related error.

kubectl top pod <pod-name> -n <fed-name> --containers
FYI, this is on v1.16.2

You need to install the metrics server to get these metrics. Follow the thread below:
Error from server (NotFound): podmetrics.metrics.k8s.io "mem-example/memory-demo" not found
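If the metrics server is missing, a minimal sketch of installing it and re-running the query (the manifest URL is the one documented in the metrics-server README; the pod and namespace names are taken from the error above):
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
kubectl top pod memory-demo -n mem-example --containers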

kubectl top pod POD_NAME --containers
shows metrics for a given pod and its containers.
If you want to see graphs of memory and CPU utilization, you can view them through the Kubernetes dashboard.
A better solution would be to install a metrics server along with Prometheus and Grafana in your cluster. Prometheus will scrape the metrics, which Grafana can then display as graphs.
This might be useful.
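A common way to get that stack is the kube-prometheus-stack Helm chart, which bundles Prometheus and Grafana; a sketch (the release name "monitoring" is arbitrary):
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install monitoring prometheus-community/kube-prometheus-stack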

Instead of building ad-hoc metric snapshots, a much better way is to install and work with third-party data collectors, which, if managed well, give you a solid monitoring system and a neat Grafana UI (or similar) to work with. Prometheus is one of them and comes highly recommended.
Using such plug-and-play systems, you can not only build a robust monitoring pipeline but also consume and react to problems in a far more managed way than by relying on top alone.

Related

Analyse kubernetes pod network traffic

All,
Of late my cloud provider has been charging more for data transfer. I finally noticed that one of the K8s pods has higher data transfer. Is there a way I can find pod-level network traffic, such as how many bytes were transmitted and received, with a native Kubernetes command?
Thanks
Bala
The kubectl top command shows usage, not allocation. Allocation is what causes the "insufficient CPU" problem. There's a ton of confusion about the difference.
AFAICT, there's no easy way to get a report of node CPU allocation by pod, since requests are defined per container in the spec. And even then, it's difficult, since .spec.containers[*].resources may or may not have the requests/limits fields.
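One coarse built-in view of allocation is per node rather than per pod: kubectl describe node lists each pod's requests/limits in its "Non-terminated Pods" section and sums them under "Allocated resources" (the node name below is illustrative):
kubectl describe node <node-name>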
But there is something like kubectl-view-allocations. With it you can explore your kube resource usage and allocation. It can provide results grouped by namespace, node, or pod and filtered by resource name.
kubectl-view-allocations [FLAGS] [OPTIONS]

FLAGS:
    -h, --help         Prints help information
    -z, --show-zero    Show lines with zero requested and zero limit and zero allocatable
    -V, --version      Prints version information

OPTIONS:
    -g, --group-by <group-by>...              Group informations (hierarchicaly) (default: -g resource -g node -g pod)
                                              [possible values: resource, node, pod]
    -n, --namespace <namespace>               Show only pods from this namespace
    -r, --resource-name <resource-name>...    Filter resources shown by name(s), by default all resources are listed
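For example, to look at CPU allocation grouped by node and pod within a single namespace (the namespace name here is just an illustration):
kubectl-view-allocations -r cpu -g node -g pod -n my-namespace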
In your case I think the safest option is to install Heapster or metrics-server, cAdvisor, and Grafana.
Heapster enables container cluster monitoring and performance analysis for Kubernetes (versions v1.0.6 and higher) and platforms which include it (note that Heapster has since been deprecated in favor of metrics-server).
Heapster collects and interprets various signals such as compute resource usage, lifecycle events, etc. - with these metrics you will find which specific pods are transferring more data. Heapster supports multiple sources of data.
Container metrics are available mostly through cAdvisor. cAdvisor (Container Advisor) provides container users an understanding of the resource usage and performance characteristics of their running containers. It is a running daemon that collects, aggregates, processes, and exports information about running containers. Specifically, for each container it keeps resource isolation parameters, historical resource usage, histograms of complete historical resource usage and network statistics. This data is exported by container and machine-wide.
Grafana, on the other hand, allows you to query, visualize, alert on and understand the gathered metrics no matter where they are stored. Create, explore, and share dashboards with your team and foster a data-driven culture.
Take a look: kubernetes-metrics, metrics-server-installation.

Kubernetes - Monitoring pod IO

I would like to monitor the IO which my pod is doing. Using commands like 'kubectl top pods/nodes', I can monitor CPU & memory. But I am not sure how to monitor the IO which my pod is doing, especially disk IO.
Any suggestions?
Since you already used the kubectl top command I assume you have the metrics server. In order to have a more advanced monitoring solution I would suggest using cAdvisor, Prometheus or Elasticsearch.
For getting started with Prometheus you can check this article.
Elasticsearch has system diskio and Docker diskio metricsets. You can easily deploy it using a Helm chart.
Part 3 of the series about Kubernetes monitoring is especially focused on monitoring container metrics using cAdvisor, although the whole series is worth checking.
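If you only need a quick look at raw disk IO counters, the kubelet's cAdvisor endpoint can usually be read through the API server proxy; the node name below is a placeholder, and container_fs_reads_bytes_total / container_fs_writes_bytes_total are the relevant metrics:
kubectl get --raw /api/v1/nodes/<node-name>/proxy/metrics/cadvisor | grep container_fs_writes_bytes_total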
Let me know if this helps.

Live monitoring of containers, nodes and cluster

We are using a k8s cluster for one of our applications. The cluster is owned by another team and we don't have full control over it. We are trying to find metrics around resource utilization (CPU and memory), details about running containers/pods/nodes, etc. We need to find out how many parallel containers are running. The problem is that they have exposed monitoring of the cluster via Prometheus, but with Prometheus we are not getting live data; it does not have info about running containers.
My query is: what is the API which is available by default in a k8s cluster and can give us all we need? We don't want to read data from another client like Prometheus or anything else; we want to read metrics directly from the cluster so that the data is not stale. Any suggestions?
As you mentioned, you will need metrics-server (or heapster) to get that information.
You can confirm whether your metrics server is running by executing kubectl top nodes/pods, or just by checking if there is a heapster or metrics-server pod present in the kube-system namespace.
That command would also show you the information you are looking for. I won't go into details, as here you can find a lot of clues and ways of looking at cluster resource usage. You should probably take a look at cAdvisor too, which should already be present in the cluster. It exposes a web UI which exports live information about all the containers on the machine.
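If you want to read the metrics straight from the cluster, the API in question (once metrics-server or heapster is installed) is the resource metrics API, which you can query directly, for example:
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/<namespace>/pods"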
Other than that, there are probably commercial ways of achieving what you are looking for, for example SignalFx and other similar projects - but this will probably require the cluster administrator's involvement.

How to get resources consumed by kubernetes job?

How can I get the resources (CPU and memory) consumed by a Kubernetes job at the end of the job's lifecycle? Is this out of scope for the Kubernetes Job implementation?
Notes:
kubectl describe job provides only the limit/request specified.
I am aware of external tools to capture the resource consumption. I'm looking for something that could be stored along with job metadata without using any external monitoring tools like prometheus.
I would not encourage you to restrict yourself to kubectl top pod. It is only good for quick troubleshooting and a sneak peek.
In production, you should have a more concrete framework for resource usage monitoring, and I have found Prometheus very useful. Of course, when you are working on GCP, you may also choose the native monitoring toolset.
We can get the resource consumption of pods created by a job using the "kubectl top pod" command, but the pod must be live at that time.
Once the pod dies we don't have any way to collect its resource consumption.
If you're not using an external tool, the only option I can think of is running a sidecar container that periodically logs the CPU and memory usage.
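A minimal sketch of that idea (note that a separate sidecar normally only sees its own cgroup, so this sketch samples from inside the job container itself; the path assumes cgroup v2, on cgroup v1 it is /sys/fs/cgroup/memory/memory.usage_in_bytes, and the Job name and workload command are placeholders):
apiVersion: batch/v1
kind: Job
metadata:
  name: example-job
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: worker
        image: busybox
        resources:
          requests:
            memory: 64Mi
          limits:
            memory: 128Mi
        command:
        - sh
        - -c
        - |
          # Sample this container's memory usage in the background,
          # then run the real workload; the samples end up in the pod logs.
          (while true; do
            echo "$(date -u) memory_bytes=$(cat /sys/fs/cgroup/memory.current)"
            sleep 30
          done) &
          exec sleep 300   # placeholder for the actual job command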
When the tasks that were executed via the Job resource have finished their work, K8s will not delete the corresponding pods (you will see READY 0/1 and STATUS Completed), so you can view those logs and the status of each pod.
(*) If you want to log resource usage compared to the given resource requests and limits, you can use the Downward API, and this information will be available to containers through environment variables.
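A sketch of that Downward API part, added under the container spec from the example above (the container name "worker" matches that sketch):
        env:
        - name: MEM_REQUEST
          valueFrom:
            resourceFieldRef:
              containerName: worker
              resource: requests.memory
        - name: MEM_LIMIT
          valueFrom:
            resourceFieldRef:
              containerName: worker
              resource: limits.memory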

Monitoring and alerting on pod status or restart with Google Container Engine (GKE) and Stackdriver

Is there a way to monitor the pod status and restart count of pods running in a GKE cluster with Stackdriver?
While I can see CPU, memory and disk usage metrics for all pods in Stackdriver there seems to be no way of getting metrics about crashing pods or pods in a replica set being restarted due to crashes.
I'm using a Kubernetes replica set to manage the pods, hence they are respawned and created with a new name when they crash. As far as I can tell, the metrics in Stackdriver appear by pod name (which is unique only for the lifetime of the pod), which doesn't sound very sensible.
Alerting upon pod failures seems like such a natural thing that it is hard to believe it is not supported at the moment. The monitoring and alerting capabilities I get from Stackdriver for Google Container Engine seem rather useless as they stand, since they are all bound to pods whose lifetime can be very short.
So if this doesn't work out of the box, are there known workarounds or best practices for monitoring continuously crashing pods?
You can achieve this manually with the following:
In Logs Viewer, create the following filter:
resource.labels.project_id="<PROJECT_ID>"
resource.labels.cluster_name="<CLUSTER_NAME>"
resource.labels.namespace_name="<NAMESPACE, or default>"
jsonPayload.message:"failed liveness probe"
Create a metric by clicking on the Create Metric button above the filter input and filling in the details.
You may now track this metric in Stackdriver.
Would be happy to be informed of a built-in metric instead of this.
There is a built-in metric now, so it's easy to build a dashboard and/or alert on it without setting up custom metrics:
Metric: kubernetes.io/container/restart_count
Resource type: k8s_container
In my cluster (a bare-metal k8s cluster), I use kube-state-metrics https://github.com/kubernetes/kube-state-metrics to do what you want. This project belongs to the kubernetes repo and it is quite easy to use. Once it is deployed, you can use the kube_pod_container_status_restarts metric to know whether a container has restarted.
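For example, a sketch of a Prometheus alerting rule on top of it (recent kube-state-metrics versions expose the counter as kube_pod_container_status_restarts_total; the window and threshold below are arbitrary):
groups:
- name: pod-restarts
  rules:
  - alert: ContainerRestartingFrequently
    expr: increase(kube_pod_container_status_restarts_total[1h]) > 3
    labels:
      severity: warning
    annotations:
      summary: "{{ $labels.namespace }}/{{ $labels.pod }} ({{ $labels.container }}) is restarting repeatedly"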
Others have commented on how to do this with metrics, which is the right solution if you have a very large number of crashing pods.
An alternative approach is to treat crashing pods as discrete events or even log lines. You can do this with Robusta (disclaimer: I wrote this) with YAML like this:
triggers:
  - on_pod_update: {}
actions:
  - restart_loop_reporter:
      restart_reason: CrashLoopBackOff
  - image_pull_backoff_reporter:
      rate_limit: 3600
sinks:
  - slack
Here we're triggering an action named restart_loop_reporter whenever a pod updates. The data stream comes from the APIServer.
The restart_loop_reporter is an action which filters out non-crashing pods. Above it's configured to report only on CrashLoopBackOffs but you could remove that to report all crashes.
A benefit of doing it this way is that you can gather extra data about the crash automatically. For example, the above will fetch the pod's logs and forward them along with the crash report.
I'm sending the result here to Slack, but you could just as well send it to a structured output like Kafka (already builtin) or Stackdriver (not yet supported, but I can fix that if you like).
Remember that you can always raise a feature request if the available options are not enough.