How can I measure the rate of Kafka producers and consumers in a Java application? Does the Kafka broker provide any performance metrics?
Kafka itself exposes lots of performance metrics. At my employer, we use Prometheus to ingest and store those metrics, with a Grafana frontend for graphs and dashboards. We also instrument our apps, including our Java apps, with the Prometheus library, and expose and scrape custom metrics to help us understand all aspects of our data pipelines and performance.
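For the Java side of the question, the Kafka clients also register their own metrics as JMX MBeans inside the application's JVM. Here is a minimal sketch, assuming the producer and consumer run in the same JVM as this code, that reads the `record-send-rate` and `records-consumed-rate` attributes. The class and method names are made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.management.MBeanServer;
import javax.management.ObjectName;

final class KafkaClientRates {

    /** Reads one numeric attribute from every MBean matching the pattern, keyed by MBean name. */
    static Map<String, Double> readAttribute(MBeanServer server, String pattern, String attribute)
            throws Exception {
        Map<String, Double> values = new LinkedHashMap<>();
        for (ObjectName name : server.queryNames(new ObjectName(pattern), null)) {
            Object value = server.getAttribute(name, attribute);
            if (value instanceof Number) {
                values.put(name.getCanonicalName(), ((Number) value).doubleValue());
            }
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // Records sent per second, one entry per producer client-id in this JVM.
        System.out.println(readAttribute(server,
                "kafka.producer:type=producer-metrics,client-id=*", "record-send-rate"));
        // Records consumed per second, one entry per consumer client-id in this JVM.
        System.out.println(readAttribute(server,
                "kafka.consumer:type=consumer-fetch-manager-metrics,client-id=*",
                "records-consumed-rate"));
    }
}
```

If no Kafka client is running in the JVM, both maps simply come back empty.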
I am going to implement a Snowflake Kafka connector for continuous ingestion of data into a Snowflake target database.
What are the best practices for:
Kafka for its clusters
Kafka and its related parameters
Monitoring resources
Kafka for its clusters
Run at least 3 brokers
Kafka and its related parameters
That's too broad and has nothing to do with running a Connect cluster or implementing one. The defaults are mostly fine. You can find the production recommendations in the Kafka documentation.
Monitoring resources
Use JMX. https://docs.confluent.io/platform/current/kafka/monitoring.html
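To make "use JMX" a bit more concrete, here is a rough sketch that connects to a broker's JMX port and reads the one-minute rate of the `MessagesInPerSec` broker metric. It assumes JMX is enabled on the broker without authentication or SSL; the class name and the command-line arguments are hypothetical.

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

final class BrokerJmxProbe {

    /** Standard RMI-based JMX service URL for a host and port. */
    static String serviceUrl(String host, int port) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    /** Connects to the broker's JMX port and reads the one-minute message-in rate. */
    static double messagesInPerSec(String host, int port) throws Exception {
        JMXServiceURL url = new JMXServiceURL(serviceUrl(host, port));
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbeans = connector.getMBeanServerConnection();
            ObjectName name =
                    new ObjectName("kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec");
            return ((Number) mbeans.getAttribute(name, "OneMinuteRate")).doubleValue();
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: BrokerJmxProbe <host> <jmx-port>");
            return;
        }
        System.out.println(messagesInPerSec(args[0], Integer.parseInt(args[1])));
    }
}
```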
going to implement a Snowflake Kafka connector
Snowflake already has a connector... I'd start by forking rather than making your own
I'm new to Kafka. While studying it, I concluded that monitoring consumer lag is necessary. Searching Google and the docs, I found a few approaches:
Kafka - Prometheus - Grafana
Kafka - Burrow - some DB - Grafana
Kafka - burrow_stat? (I don't understand what it is.)
Kafka - Datadog
What I want to ask is:
The documentation says that Burrow is for monitoring. Can I visualize its data as graphs (a dashboard)
without other tools like Grafana, Kibana, or Datadog?
I'm just trying to use fewer pipeline steps. What is the simplest way to visualize consumer lag?
If you are doing the setup in an organisation, Datadog or Prometheus is probably the way to go. You can capture other Kafka-related metrics as well. These agents also have integrations with many other tools besides Kafka and will be a good common choice for monitoring.
If you are just doing it for a personal POC-type project and you just want to view the lag, I find CMAK very useful (https://github.com/yahoo/CMAK). It does not keep historical data, but it provides a good visual of the current state of the Kafka cluster, including lag.
For cluster-wide metrics you can use kafka_exporter (https://github.com/danielqsj/kafka_exporter), which exposes some very useful cluster metrics (including consumer lag) and is easy to integrate with Prometheus and visualize using Grafana.
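Whatever tool you pick, the lag number itself is simple: for each partition it is the log end offset minus the group's committed offset. A small sketch of that arithmetic follows, with made-up offsets. The class name is hypothetical, and treating a missing committed offset as 0 is a simplifying assumption.

```java
import java.util.Map;
import java.util.TreeMap;

final class ConsumerLag {

    /** Lag per partition = log end offset minus the group's committed offset, floored at zero. */
    static Map<Integer, Long> perPartition(Map<Integer, Long> endOffsets,
                                           Map<Integer, Long> committed) {
        Map<Integer, Long> lag = new TreeMap<>();
        for (Map.Entry<Integer, Long> e : endOffsets.entrySet()) {
            // Assumption: a partition with no committed offset counts from offset 0.
            long committedOffset = committed.getOrDefault(e.getKey(), 0L);
            lag.put(e.getKey(), Math.max(0L, e.getValue() - committedOffset));
        }
        return lag;
    }

    /** Total lag of the group across all partitions. */
    static long total(Map<Integer, Long> lag) {
        return lag.values().stream().mapToLong(Long::longValue).sum();
    }

    public static void main(String[] args) {
        Map<Integer, Long> end = Map.of(0, 1500L, 1, 900L);
        Map<Integer, Long> committed = Map.of(0, 1200L, 1, 900L);
        Map<Integer, Long> lag = perPartition(end, committed);
        System.out.println(lag + " total=" + total(lag)); // prints {0=300, 1=0} total=300
    }
}
```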
Burrow is extremely effective and specialised in monitoring consumer lag. It is good at tracking consumer offsets and, more importantly, at judging whether the lag is actually a problem or not. It has integrations with PagerDuty so that alerts are pushed to the necessary parties.
https://community.cloudera.com/t5/Community-Articles/Monitoring-Kafka-with-Burrow-Part-1/ta-p/245987
What burrow has:
Non-threshold-based lag monitoring algorithm capable of evaluating potential slowdowns.
Integration with PagerDuty
Exporters for Prometheus, AppD, etc. for historical metrics
Pluggable UI
If you are looking for a quick solution, you can deploy Burrow followed by the Burrow front end: https://github.com/GeneralMills/BurrowUI
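To give an idea of what "non-threshold based" means, here is a simplified sketch of evaluating a sliding window of offset samples instead of comparing lag to a fixed number. It is loosely modeled on the rules described in Burrow's documentation and is not Burrow's actual implementation. The class, the status names, and the sample type are all made up for illustration.

```java
import java.util.List;

final class LagWindowCheck {

    enum Status { OK, WARNING, STALLED }

    /** One observation: the group's committed offset and its lag at that moment. */
    static final class Sample {
        final long committedOffset;
        final long lag;

        Sample(long committedOffset, long lag) {
            this.committedOffset = committedOffset;
            this.lag = lag;
        }
    }

    /** Judges a window of samples, oldest first, without any fixed lag threshold. */
    static Status evaluate(List<Sample> window) {
        if (window.size() < 2) {
            return Status.OK; // not enough history to judge
        }
        boolean offsetMoved = false;
        boolean lagNeverDropped = true;
        for (int i = 0; i < window.size(); i++) {
            Sample s = window.get(i);
            if (s.lag == 0) {
                return Status.OK; // the consumer caught up at some point in the window
            }
            if (i > 0) {
                Sample prev = window.get(i - 1);
                if (s.committedOffset != prev.committedOffset) {
                    offsetMoved = true;
                }
                if (s.lag < prev.lag) {
                    lagNeverDropped = false;
                }
            }
        }
        if (!offsetMoved) {
            return Status.STALLED; // lag is non-zero and nothing was committed
        }
        // Still committing: warn only if the lag never shrank across the window.
        return lagNeverDropped ? Status.WARNING : Status.OK;
    }
}
```

The point of the window is that a consumer with a large but shrinking lag is reported as healthy, while one with a small lag that never commits is flagged.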
I am new to Kafka. We want to monitor and manage Kafka topics. We tried different open source monitoring tools, such as:
kafka-monitor
kafka-manager
Both tools are good, but we cannot decide which one to include in our deployment stack. Which one is better and why, and in which scenario?
'kafka-manager' from Yahoo looks like the older one, and 'kafka-monitor' from LinkedIn is the newer one.
Lenses
Lenses (ex Landoop) enhances Kafka with User Interface, streaming SQL engine and cluster monitoring. It enables faster monitoring of Kafka data pipelines.
They provide a free all-in-one docker (Lenses Box) which can serve a single broker for up to 25M messages. Note that this is recommended for development environments.
Cloudera SMM
Streams Messaging Manager is the solution for monitoring and managing clusters running Cloudera or Hortonworks Kafka. It also comes with replication capability.
Confluent
Another option is Confluent Enterprise, which is a Kafka distribution for production environments. It also includes Control Center, a management system for Apache Kafka that enables cluster monitoring and management from a user interface.
Yahoo CMAK (Cluster Manager for Apache Kafka, previously known as Kafka Manager)
Kafka Manager or CMAK is a tool for monitoring Kafka offering less functionality compared to the aforementioned tools.
KafDrop
KafDrop is a UI for monitoring Apache Kafka clusters. The tool displays information such as brokers, topics, partitions, and even lets you view messages. It is a lightweight application that runs on Spring Boot and requires very little configuration.
LinkedIn Burrow
Burrow is a monitoring companion for Apache Kafka that provides consumer lag checking as a service without the need for specifying thresholds. It monitors committed offsets for all consumers and calculates the status of those consumers on demand. An HTTP endpoint is provided to request status on demand, as well as provide other Kafka cluster information. There are also configurable notifiers that can send status out via email or HTTP calls to another service.
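As a rough sketch of using that HTTP endpoint from Java, the following requests a consumer group's status and prints the raw JSON. The `/v3/kafka/<cluster>/consumer/<group>/status` path is my understanding of Burrow's v3 API, so check it against the version you deploy. The class name and arguments are hypothetical. It needs Java 11+ for `java.net.http`.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class BurrowStatusClient {

    /** Builds the consumer-group status URI, assuming Burrow's v3 HTTP API path layout. */
    static URI statusUri(String baseUrl, String cluster, String group) {
        return URI.create(baseUrl + "/v3/kafka/" + cluster + "/consumer/" + group + "/status");
    }

    /** Fetches the status document as a JSON string; parsing is left to the caller. */
    static String fetchStatus(String baseUrl, String cluster, String group) throws Exception {
        HttpRequest request =
                HttpRequest.newBuilder(statusUri(baseUrl, cluster, group)).GET().build();
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 3) {
            System.out.println("usage: BurrowStatusClient <burrow-base-url> <cluster> <group>");
            return;
        }
        System.out.println(fetchStatus(args[0], args[1], args[2]));
    }
}
```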
Kafka Tool
Kafka Tool is a GUI application for managing and using Apache Kafka clusters. It provides an intuitive UI that allows one to quickly view objects within a Kafka cluster as well as the messages stored in the topics of the cluster. It contains features geared towards both developers and administrators.
If you cannot afford licenses, then go for Yahoo Kafka Manager, LinkedIn Burrow or KafDrop. Confluent's and Landoop's products are the best out there, but unfortunately, they require licensing.
For more details, you can refer to my blog post Overview of UI Monitoring tools for Apache Kafka Clusters.
If you want to pay for licensing and Kafka cluster support, then you can use Confluent Control Center.
Alternatively, the free route would be to use JMX exporters from Datadog and/or Prometheus/InfluxDB (with Grafana dashboards) to see overall system health (CPU, network, memory, etc.). That gives you much more information than you get by monitoring only the Kafka processes with Kafka-specific tools.
At my company, we used the Yahoo product, we investigated the LinkedIn product, and several others mentioned. My company ultimately chose to use Prometheus+Grafana. Everyone loves it and I'd highly recommend it.
There are two big advantages to Prometheus+Grafana. The first is that it does full-featured Kafka metrics ingestion, visualization, and alerting, but it's not limited to Kafka. While our initial needs were just to monitor Kafka, we also wanted metrics on HTTP servers and traffic, server utilization (CPU/RAM/disk), and custom application-level metrics. Prometheus handles all of the above.
Secondly, Prometheus and Grafana are very high quality, well designed, polished, very customizable, and easy to use, while a lot of other products in this space are old and complicated to work with. Grafana has a very flashy and functional JavaScript interface that lets you make exactly the customized dashboards that you want. Prometheus has a very polished metric collection engine, storage engine, query language, and alerting system. Something like Yahoo Kafka Manager has much more limited functionality in all of these categories.
If you want to try Prometheus, you need to do two things:
1) Install and configure the JMX->Prometheus exporter on your Kafka brokers:
https://github.com/prometheus/jmx_exporter
2) Set up a Prometheus server to collect the metrics, and set up a Grafana dashboard to display the graphs that you want.
I'd also say that this is just for monitoring+dashboards+alerting. For management functions, you still need other tools.
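For the custom application-level metrics mentioned above, what Prometheus scrapes is just plain text over HTTP. Here is a minimal sketch using only the JDK's built-in HTTP server to show the shape of a `/metrics` endpoint. In a real app you would more likely use a Prometheus client library. The metric name, class name, and port are made up.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.LongAdder;

final class MetricsEndpoint {

    // Hypothetical application counter; increment it wherever a message is handled.
    static final LongAdder MESSAGES_PROCESSED = new LongAdder();

    /** Renders the counter in the Prometheus text exposition format. */
    static String render() {
        return "# HELP app_messages_processed_total Messages processed by this app.\n"
                + "# TYPE app_messages_processed_total counter\n"
                + "app_messages_processed_total " + MESSAGES_PROCESSED.sum() + "\n";
    }

    /** Starts an HTTP server that serves the rendered metrics at /metrics. */
    static HttpServer start(int port) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/metrics", exchange -> {
            byte[] body = render().getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "text/plain; version=0.0.4");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) {
        MESSAGES_PROCESSED.add(3);
        // Prints what a scrape would return; a long-running app would call start(port) instead.
        System.out.print(render());
    }
}
```

Prometheus computes rates from such counters at query time, so the app only has to keep incrementing.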
The kafka-monitor is (despite the name) a load generation and reporting tool. Yahoo's kafka-manager is an overall monitoring tool.
Is there a way to monitor my Kafka cluster using Nagios? Is there any working plugin, API, or anything else to check broker status, partition status, memory status, current offset, and all the other valuable metrics from my cluster?
We are using Nagios to monitor Kafka JMX metrics (we use JMXeval, but you can use any of your favorite JMX monitoring scripts for Nagios), where we can find many useful metrics like memory, lag, number of offline partitions, and so on.
I highly recommend reading this article about Kafka monitoring, where you can find many useful tips on what to monitor - https://blog.serverdensity.com/how-to-monitor-kafka/
Because JMX is disabled by default, you need to enable it first. You can follow the instructions in Enable JMX on Kafka Brokers.
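If you end up writing your own check, a Nagios plugin only has to print one status line and exit with 0 (OK), 1 (WARNING), 2 (CRITICAL), or 3 (UNKNOWN). Here is a small sketch of that threshold logic. The class name, metric label, and sample value are made up, and a real check would read the value over JMX first.

```java
final class NagiosThresholdCheck {

    // Nagios plugin exit codes.
    static final int OK = 0;
    static final int WARNING = 1;
    static final int CRITICAL = 2;
    static final int UNKNOWN = 3;

    /** Maps a metric value to a Nagios exit code; a missing value is UNKNOWN. */
    static int exitCode(Double value, double warnAt, double critAt) {
        if (value == null || value.isNaN()) {
            return UNKNOWN;
        }
        if (value >= critAt) {
            return CRITICAL;
        }
        if (value >= warnAt) {
            return WARNING;
        }
        return OK;
    }

    /** Builds the single status line that Nagios shows for the check. */
    static String statusLine(String metric, Double value, int code) {
        String[] labels = {"OK", "WARNING", "CRITICAL", "UNKNOWN"};
        return "KAFKA " + labels[code] + " - " + metric + "=" + value;
    }

    public static void main(String[] args) {
        // Sample value; a real check would read it over JMX, for example the Value attribute
        // of kafka.controller:type=KafkaController,name=OfflinePartitionsCount.
        Double offlinePartitions = 0.0;
        int code = exitCode(offlinePartitions, 1, 1);
        System.out.println(statusLine("offline_partitions", offlinePartitions, code));
        System.exit(code);
    }
}
```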
Besides basic monitoring metrics like CPU, memory, and network usage, is there any way that I can actually monitor the running Kafka application, such as the number of messages in/out, stream throughput, stream size, etc.?
Thank you.
Kafka offers various metrics reporting in both the server and the client. See the Monitoring document for details.