Do I need kubernetes-cadvisor up to monitor Kubernetes?

I've set up Prometheus to monitor Kubernetes. However, when I look at the Prometheus dashboard I see the kubernetes-cadvisor target is DOWN.
I would like to know whether it is needed to monitor Kubernetes, because in Grafana I already get information such as memory usage, disk space ...
Is it used to monitor containers, so that I can make precise queries such as the memory used by the pods of a specific namespace?

The error you have provided means that cAdvisor's output does not comply with the Prometheus exposition format.[1] But to be honest, that is only one of the possibilities, and as you did not provide more information we will have to leave it for now (I mean the information asked for by Oliver, plus the versions of Prometheus and Grafana and the environment in which you are running the cluster).
Answering your question: although you don't need cAdvisor for monitoring, it does provide some important metrics and is pretty well integrated with Kubernetes. So if you need container-level metrics, you should use cAdvisor.
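For instance, once cAdvisor metrics are scraped, a per-pod memory query for one namespace could look roughly like this (a hedged PromQL sketch; depending on your Kubernetes/cAdvisor version the label may be pod or pod_name):
sum by (pod) (container_memory_working_set_bytes{namespace="my-namespace"})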
As specified in this article (you can find a configuration tutorial there):
you can’t access cAdvisor directly (through 4194). You can (!) access
cAdvisor by duplicating the job_name (called “k8s”) in the
prometheus.yml file, calling the copy “cAdvisor” (perhaps) and
inserting an additional line to define “metrics_path”. Prometheus
assumes exporters are on “/metrics” but, for cAdvisor, our metrics are
on “/metrics/cadvisor”.
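A hedged sketch of what that duplicated job could look like (the TLS and token paths assume Prometheus runs in-cluster with a service account; they are illustrative, not taken from the article):
scrape_configs:
  - job_name: cadvisor                # copy of the existing "k8s" job
    metrics_path: /metrics/cadvisor   # cAdvisor metrics live here instead of /metrics
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    kubernetes_sd_configs:
      - role: node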
I think that could be the reason, but if this does not solve your issue I will try to recreate it in my cluster.
Update:
Judging from your YAML file, you did not configure Prometheus to scrape metrics from cAdvisor. Add this to your YAML file:
scrape_configs:
  - job_name: cadvisor
    scrape_interval: 5s
    static_configs:
      - targets:
          - cadvisor:8080
As specified here.

To get container metrics we need cAdvisor!
To set it up, I just followed the procedure below:
https://github.com/google/cadvisor
I installed it on each of my nodes, and ran the following on each:
sudo docker run \
--volume=/:/rootfs:ro \
--volume=/var/run:/var/run:ro \
--volume=/sys:/sys:ro \
--volume=/var/lib/docker/:/var/lib/docker:ro \
--volume=/dev/disk/:/dev/disk:ro \
--publish=8080:8080 \
--detach=true \
--name=cadvisor \
google/cadvisor:latest
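With cAdvisor published on port 8080 of every node, each node becomes a scrape target; a hedged static config sketch (the node IPs are placeholders):
scrape_configs:
  - job_name: cadvisor
    static_configs:
      - targets:
          - 10.0.0.11:8080   # node 1
          - 10.0.0.12:8080   # node 2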
I hope this will help you guys ;)

Related

Prometheus - Monitoring command output in container

I need to monitor a lot of legacy containers in my EKS cluster that have an NFS mount path. To mount the NFS directory in the containers I am using the nfs-client Helm chart.
I need to know when my mount path is lost for some reason, and the only way I have found to do that is to exec a command in the container:
#!/bin/bash
df -h | grep ip_of_my_nfs_server | wc -l
If the script above prints 1, I know that my NFS mount path is OK.
Does anybody know a way to monitor the output of a script executed in a container with Prometheus?
Thanks!
As Matt has pointed out in the comments: the first order of business should be to see if you can simply cover your monitoring requirement with node_exporter.
Below is a more generic answer on collecting metrics from arbitrary shell commands.
Prometheus is a pull-based monitoring system. You configure it with "scrape targets": these are effectively just HTTP endpoints that expose metrics in a specific format. Some target needs to be alive for long enough to allow it to be scraped.
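For reference, that "specific format" is the plain-text exposition format; a scrape response carrying the metric used at the end of this answer would look roughly like this (the metric name is illustrative):
# HELP mount_path_up Whether the NFS mount path is present (1 = ok, 0 = lost)
# TYPE mount_path_up gauge
mount_path_up 1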
The two most obvious options you have are:
Wrap your logic in a long-running process that exposes this metric on an HTTP endpoint, and configure it as a scrape target
Spin up an instance of pushgateway, configure it as a scrape target, and have your command push its metrics there
Based on the little information you provided, the latter option seems like the most sane one. Important and relevant note from the README:
The Prometheus Pushgateway exists to allow ephemeral and batch jobs to expose their metrics to Prometheus. Since these kinds of jobs may not exist long enough to be scraped, they can instead push their metrics to a Pushgateway. The Pushgateway then exposes these metrics to Prometheus.
Your command would look something like:
#!/bin/bash
printf "mount_path_up %d" $(df -h | grep ip_of_my_nfs_server | wc -l) | curl --data-binary #- http://pushgateway.example.org:9091/metrics/job/some_job_name

Prometheus metrics Configuration

I'm pretty new to Prometheus, and according to my understanding there are many metrics already available in Prometheus. But I'm not able to see "http_requests_total", which is used in many examples, in the list. Do we need to configure anything in order to make these HTTP metrics available?
My requirement is to calculate the number of HTTP requests hitting the server at a time, so the http_requests_total or http_requests_in_flight metrics would be of great help.
Can someone please guide me here on what to do next?
The documentation is extensive and helpful.
See installation
If you have Docker, you can simply run:
docker run \
--interactive --tty --rm \
--publish=9090:9090 \
prom/prometheus
And then browse: http://localhost:9090.
The default config is set to scrape itself.
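That default config is roughly the following (a sketch of the prometheus.yml shipped in the image; exact contents may vary between versions):
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets:
          - localhost:9090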
You can list these metrics and graph prometheus_http_requests_total.

How does kubelet calculate nodefs and imagefs, and then evict a pod?

How can I know which folders kubelet checks, or is there any RESTful API I can invoke to know the current usage and free space of nodefs and imagefs?
--eviction-hard=memory.available<200Mi,nodefs.available<5%,imagefs.available<5% \
--eviction-soft=memory.available<500Mi,nodefs.available<10%,imagefs.available<10% \
--eviction-soft-grace-period=memory.available=2m,nodefs.available=240h,imagefs.available=240h \
Kubernetes gets that info from the container engine, in most cases Docker. It does not check any folders or files; it gets the info at the filesystem level.
To get that data yourself, you can use Prometheus together with Node Exporter or cAdvisor.
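If you do want a RESTful endpoint, the kubelet also exposes a summary API that reports nodefs and imagefs usage; a hedged sketch of querying it through the API server proxy (<node-name> is a placeholder, and the jq filter is illustrative):
# list node filesystem and image filesystem stats for one node
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/stats/summary" \
  | jq '{nodefs: .node.fs, imagefs: .node.runtime.imageFs}'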

What is the difference between the core os projects kube-prometheus and prometheus operator?

The github repo of Prometheus Operator https://github.com/coreos/prometheus-operator/ project says that
The Prometheus Operator makes the Prometheus configuration Kubernetes native and manages and operates Prometheus and Alertmanager clusters. It is a piece of the puzzle regarding full end-to-end monitoring.
kube-prometheus combines the Prometheus Operator with a collection of manifests to help getting started with monitoring Kubernetes itself and applications running on top of it.
Can someone elaborate this?
I've always had this exact same question/repeatedly bumped into both, but tbh reading the above answer didn't clarify it for me/I needed a short explanation. I found this github issue that just made it crystal clear to me.
https://github.com/coreos/prometheus-operator/issues/2619
Quoting nicgirault of GitHub:
At last I realized that prometheus-operator chart was packaging
kube-prometheus stack but it took me around 10 hours playing around to
realize this.
Here's my summarized explanation:
"kube-prometheus" and the "Prometheus Operator Helm chart" both do the same thing: basically the Ingress / Ingress Controller concept, applied to metrics / the Prometheus Operator.
Both are a means of easily configuring, installing, and managing a huge distributed application (the Kubernetes Prometheus stack) on Kubernetes.
What is the entire Kube Prometheus stack, you ask? Prometheus, Grafana, Alertmanager, CRDs (Custom Resource Definitions), the Prometheus Operator (a software bot app), IaC alert rules, IaC Grafana dashboards, and IaC ServiceMonitor CRDs (which auto-generate Prometheus metric-collection configuration and auto hot-import it into the Prometheus server).
(Also, when I say easily configuring, I mean 1,000-10,000++ lines of easy-for-humans-to-understand config that generates and auto-manages 10,000-100,000 lines of machine config, with sensible defaults, monitoring-configuration self-service, and distributed configuration sharding, using an operator/controller to combine config and generate verbose boilerplate machine-readable config from nice human-readable config.)
If they achieve the same end goal, you might ask what's the difference between them?
https://github.com/coreos/kube-prometheus
https://github.com/helm/charts/tree/master/stable/prometheus-operator
Basically, CoreOS's kube-prometheus deploys the Prometheus Stack using Ksonnet.
Prometheus Operator Helm Chart wraps kube-prometheus / achieves the same end result but with Helm.
So which one to use?
Doesn't matter + they achieve the same end result + shouldn't be crazy difficult to start with 1 and switch to the other.
Helm tends to be faster to learn/develop basic mastery of.
Ksonnet is harder to learn/develop basic mastery of, but:
it's more idempotent (better for CICD automation) (but it's only a difference of 99% idempotent vs 99.99% idempotent.)
has built-in templating which means that if you have multiple clusters you need to manage / that you want to always keep consistent with each other. Then you can leverage ksonnet's templating to manage multiple instances of the Kube Prometheus Stack (for multiple envs) using a DRY code base with lots of code reuse. (If you only have a few envs and Prometheus doesn't need to change often it's not completely unreasonable to keep 4 helm values files in sync by hand. I've also seen Jinja2 templating used to template out helm values files, but if you're going to bother with that you may as well just consider ksonnet.)
Kubernetes operators are Kubernetes-specific applications (pods) that configure, manage and optimize other Kubernetes deployments automatically. They are implemented as custom controllers.
According to official coreOS website:
Operators were introduced by CoreOS as a class of software that operates other software, putting operational knowledge collected by humans into software.
The Prometheus Operator provides an easy way to deploy, configure and monitor your Prometheus instances on a Kubernetes cluster. To do so, the Prometheus Operator introduces three types of custom resource definitions (CRDs) in Kubernetes:
Prometheus
Alertmanager
ServiceMonitor
Now, with the help of the above CRDs, you can directly create a Prometheus instance by providing kind: Prometheus, and the Prometheus instance is ready to serve; likewise for Alertmanager. Without this you would have to set up the Deployment for Prometheus with its image, configuration and many more things.
The Prometheus Operator serves to make running Prometheus on top of Kubernetes as easy as possible, while preserving Kubernetes-native configuration options.
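For illustration, a minimal Prometheus custom resource could look like this (a hedged sketch; names and the service account are placeholders and assume the operator and RBAC are already in place):
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: example
  namespace: monitoring
spec:
  replicas: 1
  serviceAccountName: prometheus     # existing service account with the required RBAC
  serviceMonitorSelector: {}         # select every ServiceMonitor in the namespace
  resources:
    requests:
      memory: 400Mi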
Now, kube-prometheus builds on the Prometheus Operator and provides you with minimal YAML files to create a basic setup of Prometheus, Alertmanager and Grafana by running a single command.
git clone https://github.com/coreos/prometheus-operator.git
kubectl apply -f prometheus-operator/contrib/kube-prometheus/manifests/
By running the above command in the kube-prometheus directory, you get a monitoring namespace containing an instance of Alertmanager, Prometheus and Grafana (for the UI). This is enough for most basic setups; if you need anything more specific for your application, you can add more YAML files for the exporters you need.
Kube-prometheus is more of a contribution to the prometheus-operator project; it implements the Prometheus Operator functionality very well and provides you with a complete monitoring setup for your Kubernetes cluster. You can start with kube-prometheus and extend the functionality of your monitoring setup according to your application from there.
You can learn more about prometheus-operator here
As of today, 28-09-2020, this is the way to install Prometheus in a Kubernetes cluster
https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack#kube-prometheus-stack
According to the official documentation, the kube-prometheus-stack chart is a rename of the prometheus-operator chart.
As I understand it, kube-prometheus-stack also comes with preinstalled Grafana dashboards and Prometheus rules.
Note: This chart was formerly named prometheus-operator chart, now
renamed to more clearly reflect that it installs the kube-prometheus
project stack, within which Prometheus Operator is only one component.
Taken from https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack
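For reference, installing that chart looks roughly like this (release and namespace names are placeholders):
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install monitoring prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace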
Architecturally, the containers run on Docker.
Container logs are managed by Docker by default, and the default log driver is json-file:
"log-driver": "json-file"
https://docs.docker.com/config/containers/logging/configure/
If the default json-file driver is used to manage container logs, log rotation is not performed by default. For containers that generate a lot of output, the log files stored by the json-file driver can therefore consume a large amount of disk space and eventually fill the disk.
In this case, ship the logs to Elasticsearch, store them separately, and periodically delete old indices using Curator.
You can run a scheduled task in Kubernetes to delete the indices periodically.
Another way to save disk space is to periodically rotate and delete old json-file logs.
Typically we set the maximum size and number of log files.
The following keeps at most 10 log files, each with a maximum size of 20 MB, so a container keeps at most 200 MB of logs:
"log-driver": "json-file", "log-opts": { "max-size": "20m", "max-file": "10" },
Note: In general, the default Docker logs are placed under
/var/lib/docker/containers/
In the same way, Kubernetes also keeps logs and creates a directory structure that helps you find pod-based logs, so you can find the container logs for each pod running on a node:
/var/log/pods/<namespace>_<pod_name>_<pod_id>/<container_name>/
When a pod is removed, both the Docker logs under /var/lib/docker/containers/ and the pod logs Kubernetes created under /var/log/pods/ are deleted.
For example, if a pod is restarted in production, its logs are deleted whether it stays on the original node or moves to another node.
Therefore, these logs need to be saved in Elasticsearch for centralized management; in most cases R&D teams will need to check the logs for troubleshooting.

No kafka metrics in Grafana/prometheus

I successfully deployed the prometheus-operator, kube-prometheus and kafka Helm charts (tried both image danielqsj/kafka_exporter v1.0.1 and v1.2.0).
I installed mostly with default values; RBAC is enabled.
I can see 3 up targets in the Kafka target list in Prometheus, but when I go to Grafana, I can't see any Kafka metrics in the Kafka Overview dashboard.
Is there anything I missed, or what can I check to fix this issue?
I can see metrics starting with java_ and kafka_, but no jvm_ metrics and only a few jmx_ metrics.
I found someone reporting a similar issue (https://groups.google.com/forum/#!searchin/prometheus-users/jvm_%7Csort:date/prometheus-users/OtYM7qGMbvA/dZ4vIfWLAgAJ), so I deployed older versions of the JMX exporter, from 0.6 to 0.9, but there are still no jvm_ metrics.
Is there anything I missed?
env:
kubernetes: AWS EKS (Kubernetes version 1.10.x)
public grafana dashboard: kafka overview
I just realised the owner of jmx-exporter mentioned this in the README:
This exporter is intended to be run as a Java Agent, exposing a HTTP server and serving metrics of the local JVM. It can be also run as an independent HTTP server and scrape remote JMX targets, but this has various disadvantages, such as being harder to configure and being unable to expose process metrics (e.g., memory and CPU usage). Running the exporter as a Java Agent is thus strongly encouraged.
I didn't really understand what that meant until I saw this comment:
https://github.com/prometheus/jmx_exporter/issues/111#issuecomment-341983150
@brian-brazil can you add some sort of tip to the readme that jvm_* metrics are only exposed when using the Java agent? It took me an hour or two of troubleshooting and searching old issues to figure this out, after playing only with the HTTP server version. Thanks!
So jmx-exporter has to be run as a Java agent to get jvm_ metrics. jmx_prometheus_httpserver doesn't support them, but it is the default setting in the Kafka Helm chart:
https://github.com/kubernetes/charts/blob/master/incubator/kafka/templates/statefulset.yaml#L82
command:
  - sh
  - -exc
  - |
    trap "exit 0" TERM; \
    while :; do \
    java \
    -XX:+UnlockExperimentalVMOptions \
    -XX:+UseCGroupMemoryLimitForHeap \
    -XX:MaxRAMFraction=1 \
    -XshowSettings:vm \
    -jar \
    jmx_prometheus_httpserver.jar \ # <<< here
    {{ .Values.prometheus.jmx.port | quote }} \
    /etc/jmx-kafka/jmx-kafka-prometheus.yml & \
    wait $! || sleep 3; \
    done
You have to turn on the JMX and Kafka exporters for the Kafka Helm chart by providing --set prometheus.jmx.enabled=true,prometheus.kafka.enabled=true; the values are false by default.
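So, assuming the incubator chart referenced above and the Helm 2 syntax of that era, the install could look something like this (release name is a placeholder):
helm install --name kafka incubator/kafka \
  --set prometheus.jmx.enabled=true,prometheus.kafka.enabled=true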