No kafka metrics in Grafana/prometheus - apache-kafka

I successfully deployed the Helm charts prometheus-operator, kube-prometheus and kafka (tried both the danielqsj/kafka_exporter image v1.0.1 and v1.2.0).
I installed mostly with default values; RBAC is enabled.
I can see 3 up nodes in the Kafka target list in Prometheus, but when I go to Grafana, I can't see any Kafka metric in the Kafka Overview dashboard.
Did I miss anything, or what can I check to fix this issue?
I can see metrics starting with java_ and kafka_, but no jvm_ and only a few jmx_ metrics.
I found someone reporting a similar issue (https://groups.google.com/forum/#!searchin/prometheus-users/jvm_%7Csort:date/prometheus-users/OtYM7qGMbvA/dZ4vIfWLAgAJ), so I deployed with older versions of the jmx exporter (0.6 to 0.9), but still no jvm_ metrics.
Is there anything else I missed?
env:
kubernetes: AWS EKS (kubernetes version is 1.10.x)
public grafana dashboard: kafka overview

I just realised that the owner of jmx-exporter mentions in the README:
This exporter is intended to be run as a Java Agent, exposing a HTTP server and serving metrics of the local JVM. It can be also run as an independent HTTP server and scrape remote JMX targets, but this has various disadvantages, such as being harder to configure and being unable to expose process metrics (e.g., memory and CPU usage). Running the exporter as a Java Agent is thus strongly encouraged.
I didn't really understand what that meant until I saw this comment:
https://github.com/prometheus/jmx_exporter/issues/111#issuecomment-341983150
@brian-brazil can you add some sort of tip to the readme that jvm_* metrics are only exposed when using the Java agent? It took me an hour or two of troubleshooting and searching old issues to figure this out, after playing only with the HTTP server version. Thanks!
So jmx-exporter has to be run as a Java agent to get jvm_ metrics. jmx_prometheus_httpserver doesn't support them, but it is the default in the kafka helm chart.
https://github.com/kubernetes/charts/blob/master/incubator/kafka/templates/statefulset.yaml#L82
command:
  - sh
  - -exc
  - |
    trap "exit 0" TERM; \
    while :; do \
    java \
    -XX:+UnlockExperimentalVMOptions \
    -XX:+UseCGroupMemoryLimitForHeap \
    -XX:MaxRAMFraction=1 \
    -XshowSettings:vm \
    -jar \
    jmx_prometheus_httpserver.jar \ # <<< here
    {{ .Values.prometheus.jmx.port | quote }} \
    /etc/jmx-kafka/jmx-kafka-prometheus.yml & \
    wait $! || sleep 3; \
    done
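By contrast, running the exporter as a Java agent inside the Kafka broker's JVM looks roughly like this (a sketch; the agent jar version, port 9404 and config path are assumptions, not chart defaults):

# Attach jmx_exporter as a Java agent so that jvm_* metrics are exposed
# on port 9404 together with the broker's JMX metrics.
KAFKA_OPTS="-javaagent:/opt/jmx_prometheus_javaagent-0.12.0.jar=9404:/etc/jmx-kafka/jmx-kafka-prometheus.yml" \
  ./bin/kafka-server-start.sh config/server.properties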

You have to turn on the JMX and Kafka exporters for the kafka Helm chart by providing --set prometheus.jmx.enabled=true,prometheus.kafka.enabled=true. Both values are false by default.
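For the incubator chart of that era, that would be, for example (Helm 2 syntax; the release name is an assumption):

helm install --name kafka incubator/kafka \
  --set prometheus.jmx.enabled=true \
  --set prometheus.kafka.enabled=true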

Related

Prometheus - Monitoring command output in container

I need to monitor a lot of legacy containers in my EKS cluster that have an NFS mount path. To map the NFS directory into the containers I am using the nfs-client helm chart.
I need to monitor when my mount path is lost for some reason, and the only way I found to do that is to exec a command in the container:
#!/bin/bash
df -h | grep ip_of_my_nfs_server | wc -l
If the output above returns 1, I know that my NFS mount path is OK.
Does anybody know a way to monitor the output of a script executed in a container with Prometheus?
Thanks!
As Matt pointed out in the comments: the first order of business should be to see if you can simply satisfy your monitoring requirement with node_exporter.
Below is a more generic answer on collecting metrics from arbitrary shell commands.
Prometheus is a pull-based monitoring system. You configure it with "scrape targets": these are effectively just HTTP endpoints that expose metrics in a specific format. Some target needs to be alive for long enough to allow it to be scraped.
The two most obvious options you have are:
Wrap your logic in a long-running process that exposes this metric on an HTTP endpoint, and configure it as a scrape target
Spin up an instance of pushgateway, configure it as a scrape target, and have your command push its metrics there
Based on the little information you provided, the latter option seems like the most sane one. Important and relevant note from the README:
The Prometheus Pushgateway exists to allow ephemeral and batch jobs to expose their metrics to Prometheus. Since these kinds of jobs may not exist long enough to be scraped, they can instead push their metrics to a Pushgateway. The Pushgateway then exposes these metrics to Prometheus.
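A Pushgateway instance can be started, for example, from the official Docker image (9091 is its default port):

docker run --detach --publish=9091:9091 prom/pushgateway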
Your command would look something like:
#!/bin/bash
printf "mount_path_up %d" $(df -h | grep ip_of_my_nfs_server | wc -l) | curl --data-binary @- http://pushgateway.example.org:9091/metrics/job/some_job_name
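A slightly fuller version of the same script, with an explicit gauge type line (a sketch; the server IP, Pushgateway address and job name are placeholders):

#!/bin/bash
# 1 if the mount from our NFS server is present, 0 otherwise.
count=$(df -h | grep -c ip_of_my_nfs_server)
# Push the value as a gauge; Prometheus then scrapes the Pushgateway.
cat <<EOF | curl --data-binary @- http://pushgateway.example.org:9091/metrics/job/nfs_mount_check
# TYPE mount_path_up gauge
mount_path_up ${count}
EOF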

Use Kubernetes to deploy a single app to multiple servers

I'd like to deploy a single app to multiple servers at once.
I'm using Kubernetes and K3s to easily deploy containers.
Basically, I have a master server that I run and multiple servers that are located in my customers' facilities.
The master server was initialized with the following command:
k3sup install \
--ip $MASTER_IP \
--user ubuntu \
--cluster --k3s-channel latest \
--k3s-extra-args "--node-label ols.role=master"
Customers' servers were launched with:
k3sup join \
--ip $WORKER01_IP \
--user ubuntu \
--server-ip $MASTER_IP \
--server-user ubuntu \
--k3s-channel latest \
--k3s-extra-args "--node-label ols.role=worker"
To deploy a new web service on each customer's server, I tried the following command:
helm install node-red k8s-at-home/node-red --set nodeSelector."ols\.role"=worker
Problem: only a single pod is deployed.
What I'd like is to deploy one pod per server and make each independent.
Is there a way to do that?
There are two different things we need to consider here.
If the requirement is just to run more replicas of the application, you can change the Deployment template in the Helm chart, or pass the minimum number of replicas that need to be running in the cluster through values.
Reference documentation for deployments
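With the chart from the question, that could look like this (assuming the chart exposes a replicaCount value; the exact key is chart-specific):

helm install node-red k8s-at-home/node-red \
  --set replicaCount=3 \
  --set nodeSelector."ols\.role"=worker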
Coming to the next thing: if the requirement is to run the application across all the nodes in the cluster, a DaemonSet is the workload that provides the capability to run a pod on every existing node.
Reference documentation for daemonsets
Again, if you are deploying with Helm, appropriate templates for either DaemonSets or Deployments need to be added or modified based on the existing contents of the chart.
Kubernetes also supports other workload types, so they can be picked as appropriate to the requirements.
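As an illustration, a minimal DaemonSet that runs one independent pod on every worker node could look like this (a sketch; the name, image and port are placeholders, reusing the ols.role=worker label from the question):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-red
spec:
  selector:
    matchLabels:
      app: node-red
  template:
    metadata:
      labels:
        app: node-red
    spec:
      nodeSelector:
        ols.role: worker   # one pod per matching node
      containers:
        - name: node-red
          image: nodered/node-red:latest
          ports:
            - containerPort: 1880

Apply it with kubectl apply -f daemonset.yaml and every node labelled ols.role=worker gets its own pod.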

Prometheus metrics Configuration

I'm pretty new to Prometheus and, according to my understanding, there are many metrics already available in Prometheus. But I'm not able to see "http_requests_total", which is used in many examples, in the list. Do we need to configure anything in order to get these HTTP metrics?
My requirement is to calculate the number of HTTP requests hitting the server at a time, so the http_requests_total or http_requests_in_flight metrics would be of great help.
Can someone please guide me on what to do next?
The documentation is extensive and helpful.
See installation
If you have Docker, you can simply run:
docker run \
--interactive --tty --rm \
--publish=9090:9090 \
prom/prometheus
And then browse: http://localhost:9090.
The default config is set to scrape itself.
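That self-scrape job in the bundled prometheus.yml looks roughly like this (shortened):

scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ["localhost:9090"]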
You can list these metrics.
And graph them, for example prometheus_http_requests_total.
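Since that metric is a counter, what you usually graph is a per-second rate; the same query also works against the HTTP API (a sketch against the local test instance started above):

# Per-second HTTP request rate over the last 5 minutes.
curl -s 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=rate(prometheus_http_requests_total[5m])'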

How to set up federation between two Kubernetes clusters?

I want to set up federation between clusters, but because of the differences in the documentation, both on the Kubernetes website and in the federation repo docs, I am a little confused.
On the website it is mentioned that "Use of Federation v1 is strongly discouraged." but their own link is pointing to v1 releases (v1.10.0-alpha.0, v1.9.0-alpha.3, v1.9.0-beta.0) and the latest release there is 2 years old:
v1.10.0-alpha.0:
federation-client-linux-amd64.tar.gz 11.47 MB application/x-tar Standard 2/20/18, 8:44:21 AM UTC+1
federation-client-linux-amd64.tar.gz.sha 103 B application/octet-stream Standard 2/20/18, 8:44:20 AM UTC+1
federation-server-linux-amd64.tar.gz 131.05 MB application/x-tar Standard 2/20/18, 8:44:23 AM UTC+1
federation-server-linux-amd64.tar.gz.sha 103 B application/octet-stream Standard 2/20/18, 8:44:20 AM UTC+1
On the other hand, I followed the instructions at installation and installed kubefedctl-0.1.0-rc6-linux-amd64.tgz, but it doesn't have any init command, which is mentioned on the official Kubernetes website.
Kubernetes website:
kubefed init fellowship \
--host-cluster-context=rivendell \
--dns-provider="google-clouddns" \
--dns-zone-name="example.com."
Latest release kubefedctl help:
$ kubefedctl -h
kubefedctl controls a Kubernetes Cluster Federation. Find more information at https://sigs.k8s.io/kubefed.

Usage:
  kubefedctl [flags]
  kubefedctl [command]

Available Commands:
  disable             Disables propagation of a Kubernetes API type
  enable              Enables propagation of a Kubernetes API type
  federate            Federate creates a federated resource from a kubernetes resource
  help                Help about any command
  join                Register a cluster with a KubeFed control plane
  orphaning-deletion  Manage orphaning delete policy
  unjoin              Remove the registration of a cluster from a KubeFed control plane
  version             Print the version info

Flags:
      --alsologtostderr                  log to standard error as well as files
  -h, --help                             help for kubefedctl
      --log-backtrace-at traceLocation   when logging hits line file:N, emit a stack trace (default :0)
      --log-dir string                   If non-empty, write log files in this directory
      --log-file string                  If non-empty, use this log file
      --log-flush-frequency duration     Maximum number of seconds between log flushes (default 5s)
      --logtostderr                      log to standard error instead of files (default true)
      --skip-headers                     If true, avoid header prefixes in the log messages
      --stderrthreshold severity         logs at or above this threshold go to stderr
  -v, --v Level                          number for the log level verbosity
      --vmodule moduleSpec               comma-separated list of pattern=N settings for file-filtered logging

Use "kubefedctl [command] --help" for more information about a command.
And then there is the helm chart, which says: "It builds on the sync controller (a.k.a. push reconciler) from Federation v1 to iterate on the API concepts laid down in the brainstorming doc and further refined in the architecture doc." Therefore, if I am not wrong, it means the helm chart is based on Federation v1, which is deprecated.
Also, the user guide in the repo is not helpful in this case. It shows how to "enable propagation of a Kubernetes API type", but nothing about setting up a host cluster (something equivalent to kubefed init).
Can someone please let me know how I can set up a federated multi-cluster Kubernetes on bare metal and join another cluster to it?

do i need kubernetes-cadvisor up to monitor kubernetes

I've set up Prometheus to monitor Kubernetes. However, when I look at the Prometheus dashboard I see kubernetes-cadvisor DOWN.
I would like to know whether we need it to monitor Kubernetes, because in Grafana I already get information such as memory usage, disk space...
Would it be used to monitor containers, in order to make precise queries such as the memory used by a pod in a specific namespace?
The error you have provided means that cAdvisor's content does not comply with the Prometheus exposition format. But to be honest, that is only one of the possibilities, and as you did not provide more information we will have to leave it for now (I mean the information asked for by Oliver, plus the versions of Prometheus and Grafana and the environment in which you are running the cluster).
Answering your question: although you don't need cAdvisor for monitoring, it does provide some important metrics and is pretty well integrated with Kubernetes. So if you need container-level metrics, you should use cAdvisor.
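For example, once cAdvisor metrics are being scraped, a per-pod memory query for one namespace looks like this (a sketch; the namespace is a placeholder, and older Kubernetes versions use the pod_name label instead of pod):

# Memory usage summed per pod in the "default" namespace.
curl -s 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=sum(container_memory_usage_bytes{namespace="default"}) by (pod)'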
As specified in this article (you can find a configuration tutorial there):
you can’t access cAdvisor directly (through 4194). You can (!) access cAdvisor by duplicating the job_name (called “k8s”) in the prometheus.yml file, calling the copy “cAdvisor” (perhaps) and inserting an additional line to define “metrics_path”. Prometheus assumes exporters are on “/metrics” but, for cAdvisor, our metrics are on “/metrics/cadvisor”.
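In prometheus.yml terms, that duplication would look roughly like this (a sketch; the node service discovery is an assumption based on the article, and a real kubelet scrape usually needs TLS and relabelling config as well):

scrape_configs:
  - job_name: k8s                      # original job, scrapes /metrics
    kubernetes_sd_configs:
      - role: node
  - job_name: cAdvisor                 # copy of the job above...
    metrics_path: /metrics/cadvisor    # ...pointed at the cAdvisor path
    kubernetes_sd_configs:
      - role: node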
I think that could be the reason, but if this does not solve your issue I will try to recreate it in my cluster.
Update:
Judging from your yaml file, you did not configure Prometheus to scrape metrics from cAdvisor. Add this to your yaml file:
scrape_configs:
  - job_name: cadvisor
    scrape_interval: 5s
    static_configs:
      - targets:
          - cadvisor:8080
As specified here.
To get container metrics, we need cAdvisor!
To set it up I just followed the procedure at https://github.com/google/cadvisor and installed it on each of my nodes.
On each node I ran:
sudo docker run \
--volume=/:/rootfs:ro \
--volume=/var/run:/var/run:ro \
--volume=/sys:/sys:ro \
--volume=/var/lib/docker/:/var/lib/docker:ro \
--volume=/dev/disk/:/dev/disk:ro \
--publish=8080:8080 \
--detach=true \
--name=cadvisor \
google/cadvisor:latest
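You can then check that each node is serving metrics (assuming port 8080 as published above):

curl -s http://localhost:8080/metrics | head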
I hope this will help you guys ;)