Describe the bug
To monitor a Pulsar cluster, I used grafana and in the Pulsar JVM dashboard, Zookeeper Direct Memory is show No Data. While other dashboards have data such as Bookie, Broker, Recovery.
I want to know why Zookeeper Metrics has no data for direct memory. I researched the document but did not find information about zookeeper metrics. Can you help me?
Pulsar version 2.9.2
Zookeeper version 3.5.5
Dashboard Grafana
Zookeeper
Broker
Recovery
ZooKeeper does not use direct memory. The Grafana panel on the dashboard is generic and applies to all components, so for Broker, BookKeeper, and Recovery it displays direct memory since those components use direct memory. For ZooKeeper it shows No Data since ZooKeeper doesn't use direct memory.
Related
I am reading how pulsar client discover brokers and it looks like client will consult zookeeper for the broker discovery in the background periodically: https://www.bookstack.cn/read/pulsar-2.5.2-zh/a3656d4de1a283ca.md#9hybm8. could this be a bottleneck considering there are thousands of pulsar clients? I am not quite familiar with zookeeper but the zookeeper cluster does not scale well and only contains 2 or 3 hosts usually
Pulsar 2.5.2 is quite old.
The Pulsar Client asks to any of the Pulsar Brokers about the current owner for a topic.
The Pulsar Broker has a cache over ZooKeeper metadata, so this is usually not a problem.
Usually you can go with a 3 nodes ZooKeeper cluster for most of the Pulsar usecase. Keep an eye on Metrics in order to see if the ZooKeeper nodes are working well or are satu
We have Spring Boot applications deployed on OKD (The Origin Community Distribution of Kubernetes that powers Red Hat OpenShift). Without much tweaking by devops team, we got in prometheus scraped kafka consumer metrics from kubernetes-service-endpoints exporter job, as well as some producer metrics, but only for kafka connect api, not for standard kafka producer api. This is I guess a configuration for that job:
https://github.com/prometheus/prometheus/blob/master/documentation/examples/prometheus-kubernetes.yml
What is needed to change in scrape config in order to collect what's been missing?
This issue with micrometer is the source of the problem.
So, we could add jmx exporter, or wait for the issue resolution.
I’m a network guy trying to build my first Kafka --> Prometheus --> Grafana pipeline. My Kafka broker has a topic which is being populated by an external producer. That’s great. But I can’t figure out how to configure my Prometheus server to scrape data from that topic as a Consumer.
I should also say that my Kafka node is running on my host Ubuntu machine (not in a Docker container). I also am running an instance of JMX Exporter when I run Kafka. Here’s how I start up Kafka on the Ubuntu command line:
KAFKA_OPTS="$KAFKA_OPTS -javaagent:/home/me/kafka_2.11-2.1.1/jmx_prometheus_javaagent-0.6.jar=7071:/home/Me/kafka_2.11-2.1.1/kafka-0-8-2.yml" \
./bin/kafka-server-start.sh config/server.properties &
Okay. My Prometheus (also a host process, not the Docker container version) can successfully pull a lot of metrics off of my Kafka. So I just need to figure out how to get Prometheus to read the messages within my topic. And I wonder is those messages are already visible? My topic is called “vflow.sflow,” and when I look at “scrapeable” metrics that is available on Kafka (TCP 7071), I do see these metrics:
From http://localhost:7071/metrics:
kafka_cluster_partition_replicascount{partition="0",topic="vflow.sflow",} 1.0
kafka_cluster_partition_insyncreplicascount{partition="0",topic="vflow.sflow",} 1.0
kafka_log_logendoffset{partition="0",topic="vflow.sflow",} 1.5357405E7
kafka_cluster_partition_laststableoffsetlag{partition="0",topic="vflow.sflow",} 0.0
kafka_log_numlogsegments{partition="0",topic="vflow.sflow",} 11.0
kafka_cluster_partition_underminisr{partition="0",topic="vflow.sflow",} 0.0
kafka_cluster_partition_underreplicated{partition="0",topic="vflow.sflow",} 0.0
kafka_log_size{partition="0",topic="vflow.sflow",} 1.147821017E10
kafka_log_logstartoffset{partition="0",topic="vflow.sflow",} 0.0
“Partition 0,” “Log Size,” “Log End Offset”… all those things look promising… I guess?
But please bear in mind that I’m completely new to the Kafka/JMX/Prometheus ecosystem. Question: do the above metrics describe my “vflow.sflow” topic? Can I use them to configure Prometheus to actually read the messages within the topic?
If so, can someone recommend a good tutorial for this? I’ve been playing around with my Prometheus YAML config files, but all I manage to do is crash the Prometheus process when I do so. Yes, I have been reading the large amount of online documentation and forum posts out there. Its a lot of information to digest, and its very, very easy to invest hours in documentation which proves to be a dead end.
Any advice for a newbie like me? General advice like “you’re on the right track, next look at X” or “you obviously don’t understand Y, spend more time looking at Z” will be def appreciated. Thanks!
When you add that argument from the Kafka container, it scrapes the MBeans of the JMX metrics, not any actual topic data, since Prometheus isn't a Kafka consumer
From that JMX information, you'd see metrics such as message rate and replica counts
If you'd like to read topic data, the Kafka Connect framework could be used, and there's a plugin for Influx, Mongo, and Elasticsearch, which are all good Grafana sources. I'm not sure if there's a direct Kafka to Prometheus importer, but I think it would require using the PushGateway
I am trying to do this without using Confluent Control Center, since I do not have a license.
I am able to see the Kafka Broker metrics by using dcos task metrics details <broker-id> and see that all of these are already exposed on my DCOS Prometheus instance.
However, I do not see any consumer/producer metrics available on Prometheus, despite having some producers/consumer tasks on dcos.
Is there a process I can follow to expose kafka prodcuer/consumer metrics on dcos? I tried the following https://github.com/ibm-cloud-architecture/refarch-eda/blob/master/docs/kafka/monitoring.md .
But from my understanding we cannot use JMX on a Kafka instance hosted on DCOS (yet) (soruce: https://jira.mesosphere.com/browse/DCOS_OSS-3632?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel)
Any ideas?
You would have to add Prometheus JMX exporters to each of your Kafka Java processes, and then you would need to have the Prometheus server be able to scrape those. You would do this by downloading that JAR in each of the processes (containers?), then editing the KAFKA_OPTS environment varible to include the -javaagent option
AFAIK, this does not require setting up a remotely accessible JMX port.
Note: Control Center doesn't monitor JMX values. It uses Kafka MetricsReporters and Interceptors. Use of these interfaces, if you chose to write your own, or find others, doesn't require Control Center at all.
Is there a way to monitor my kafka cluster using nagios? any working plugin, api or whatever to check: broker status, partition status, memory status, current offset and all valuable metrics from my cluster?
We are using Nagios to monitor Kafka JMX metrics (we use JMXeval, but you can use any of your favorite JMX monitoring script for Nagios) where we can find many useful metrics like memory, lag, number of offline partition, and so on.
I can highly recommend you to read this article about Kafka monitoring, where you can find many useful tips what you can monitor - https://blog.serverdensity.com/how-to-monitor-kafka/
Because JMX is by default disabled, you need enable it first. You can follow instruction on Enable JMX on Kafka Brokers