Kafka Monitoring: request latencies from JMX - apache-kafka

We want to monitor Kafka and have two specific requirements: use headless tools and store performance metrics in a CSV file. Following Gwen Shapira series [1] I am leaning towards request latencies and kafka.tools.JmxTool to start with.
Setup: Kafka 0.11, exposed JMX, headless metric collection tools
Q: what JMX beans provide metrics as presented on [2], likely per Broker: “request queue”, “request local”, “response remote”, “response queue”, “response send”?
[1] slides
https://www.slideshare.net/ConfluentInc/metrics-are-not-enough-monitoring-apache-kafka-and-streaming-applications/
[2] desired Kafka metrics

After some exploring, here is the full set to open Kafka JMX at port 3999 and gather the request metrics:
1 Open Kafka JMX port at 3999:
update bin/kafka-run-class.sh:
# JMX settings
if [ -z "$KAFKA_JMX_OPTS" ]; then
KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=0.0.0.0 -Djava.net.preferIPv4Stack=true"
fi
# JMX port to use
if [ $JMX_PORT ]; then
KAFKA_JMX_OPTS="$KAFKA_JMX_OPTS -Dcom.sun.management.jmxremote.port=${JMX_PORT} -Dcom.sun.management.jmxremote.rmi.port=${JMX_PORT} "
fi
update bin/kafka-server-start.sh:
if [ -z "$JMX_PORT" ]; then
export JMX_PORT=3999
fi
2 bash script to gather Kafka metrics and publish them to the journalctl:
#!/bin/bash
PIPE=/tmp/kafka-monitoring-temp.out
mkfifo $PIPE
# Start logging to journal
systemd-cat -t 'kafka-monitoring' < $PIPE &
sleep_pid=$(
sleep 9999d > $PIPE & # keep pipe open
echo $! # but allow us to close it later...
)
arrayName=( "kafka.network:type=RequestMetrics,name=RequestQueueTimeMs,request=Fetch"
"kafka.network:type=RequestMetrics,name=LocalTimeMs,request=Fetch"
"kafka.network:type=RequestMetrics,name=RemoteTimeMs,request=Fetch"
"kafka.network:type=RequestMetrics,name=ResponseQueueTimeMs,request=Fetch"
"kafka.network:type=RequestMetrics,name=ResponseSendTimeMs,request=Fetch"
)
for name in "${arrayName[#]}"; do
timeout 1s /ust/lib/kafka/bin/kafka-run-class.sh kafka.tools.JmxTool --object-name "${name}" --jmx-url service:jmx:rmi:///jndi/rmi://127.0.0.1:3999/jmxrmi --reporting-interval 1100 | tee $PIPE
done
kill $sleep_pid
rm $PIPE

There is a good chapter on Monitoring in the "Kafka: The Definitive Guide" (pdf is freely available from Confluent's site). The book shows following Request-related metrics:

Here they are in addition to some description of what they mean. I hope it helps.
Request Queue: time that request waits at the request queue
kafka.network:type=RequestChannel,name=RequestQueueSizeMs
Request Local: time processing the request at the leader
kafka.network:type=RequestMetrics,name=LocalTimeMs,request=Produce
Response Remote: time waiting for the followers to process the request
kafka.network:type=RequestMetrics,name=RemoteTimeMs,request=Produce
Response Queue: time that the request waits at the response queue
kafka.network:type=RequestMetrics,name=ResponseQueueTimeMs,request=Produce
Response Send: time to send the response
kafka.network:type=RequestMetrics,name=ResponseSendTimeMs

Related

launching multiple kafka broker fails

The repo used is: https://github.com/Yolean/kubernetes-kafka/
So I'm trying to run a Kafka cluster that connects to a Zookeeper cluster in Kubernetes, the first pod runs alright, but then the second Kafka pod tries to connect to the zookeeper cluster and it has this error:
kafka.common.InconsistentBrokerIdException: Configured broker.id 1
doesn't match stored broker.id 0 in meta.properties. If you moved your
data, make sure your configured broker.id matches. If you intend to
create a new broker, you should remove all data in your data
directories (log.dirs).
I understand the error is in the second broker id but shouldn't the zookeeper cluster allow multiple broker connections? or how could the config be changed to allow it?
or is it a Kafka configuration problem? The config file is:
kind: ConfigMap
metadata:
name: broker-config
namespace: whitenfv
labels:
name: kafka
system: whitenfv
apiVersion: v1
data:
init.sh: |-
#!/bin/bash
set -x
cp /etc/kafka-configmap/log4j.properties /etc/kafka/
KAFKA_BROKER_ID=${HOSTNAME##*-}
SEDS=("s/#init#broker.id=#init#/broker.id=$KAFKA_BROKER_ID/")
LABELS="kafka-broker-id=$KAFKA_BROKER_ID"
ANNOTATIONS=""
hash kubectl 2>/dev/null || {
SEDS+=("s/#init#broker.rack=#init#/#init#broker.rack=# kubectl not found in path/")
} && {
ZONE=$(kubectl get node "$NODE_NAME" -o=go-template='{{index .metadata.labels "failure-domain.beta.kubernetes.io/zone"}}')
if [ $? -ne 0 ]; then
SEDS+=("s/#init#broker.rack=#init#/#init#broker.rack=# zone lookup failed, see -c init-config logs/")
elif [ "x$ZONE" == "x<no value>" ]; then
SEDS+=("s/#init#broker.rack=#init#/#init#broker.rack=# zone label not found for node $NODE_NAME/")
else
SEDS+=("s/#init#broker.rack=#init#/broker.rack=$ZONE/")
LABELS="$LABELS kafka-broker-rack=$ZONE"
fi
OUTSIDE_HOST=$(kubectl get node "$NODE_NAME" -o jsonpath='{.status.addresses[?(#.type=="InternalIP")].address}')
if [ $? -ne 0 ]; then
echo "Outside (i.e. cluster-external access) host lookup command failed"
else
OUTSIDE_PORT=3240${KAFKA_BROKER_ID}
SEDS+=("s|#init#advertised.listeners=OUTSIDE://#init#|advertised.listeners=OUTSIDE://${OUTSIDE_HOST}:${OUTSIDE_PORT}|")
ANNOTATIONS="$ANNOTATIONS kafka-listener-outside-host=$OUTSIDE_HOST kafka-listener-outside-port=$OUTSIDE_PORT"
fi
if [ ! -z "$LABELS" ]; then
kubectl -n $POD_NAMESPACE label pod $POD_NAME $LABELS || echo "Failed to label $POD_NAMESPACE.$POD_NAME - RBAC issue?"
fi
if [ ! -z "$ANNOTATIONS" ]; then
kubectl -n $POD_NAMESPACE annotate pod $POD_NAME $ANNOTATIONS || echo "Failed to annotate $POD_NAMESPACE.$POD_NAME - RBAC issue?"
fi
}
printf '%s\n' "${SEDS[#]}" | sed -f - /etc/kafka-configmap/server.properties > /etc/kafka/server.properties.tmp
[ $? -eq 0 ] && mv /etc/kafka/server.properties.tmp /etc/kafka/server.properties
server.properties: |-
############################# Log Basics #############################
# A comma seperated list of directories under which to store log files
# Overrides log.dir
log.dirs=/var/lib/kafka/data/topics
# The default number of log partitions per topic. More partitions allow greater
# parallelism for consumption, but this will also result in more files across
# the brokers.
num.partitions=1
default.replication.factor=3
min.insync.replicas=2
auto.create.topics.enable=true
# The number of threads per data directory to be used for log recovery at startup and flushing at shutdown.
# This value is recommended to be increased for installations with data dirs located in RAID array.
#num.recovery.threads.per.data.dir=1
############################# Server Basics #############################
# The id of the broker. This must be set to a unique integer for each broker.
#init#broker.id=#init#
#init#broker.rack=#init#
############################# Socket Server Settings #############################
# The address the socket server listens on. It will get the value returned from
# java.net.InetAddress.getCanonicalHostName() if not configured.
# FORMAT:
# listeners = listener_name://host_name:port
# EXAMPLE:
# listeners = PLAINTEXT://your.host.name:9092
#listeners=PLAINTEXT://:9092
listeners=OUTSIDE://:9094,PLAINTEXT://:9092
# Hostname and port the broker will advertise to producers and consumers. If not set,
# it uses the value for "listeners" if configured. Otherwise, it will use the value
# returned from java.net.InetAddress.getCanonicalHostName().
#advertised.listeners=PLAINTEXT://your.host.name:9092
#init#advertised.listeners=OUTSIDE://#init#,PLAINTEXT://:9092
# Maps listener names to security protocols, the default is for them to be the same. See the config documentation for more details
#listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL
listener.security.protocol.map=PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL,OUTSIDE:PLAINTEXT
inter.broker.listener.name=PLAINTEXT
# The number of threads that the server uses for receiving requests from the network and sending responses to the network
#num.network.threads=3
# The number of threads that the server uses for processing requests, which may include disk I/O
#num.io.threads=8
# The send buffer (SO_SNDBUF) used by the socket server
#socket.send.buffer.bytes=102400
# The receive buffer (SO_RCVBUF) used by the socket server
#socket.receive.buffer.bytes=102400
# The maximum size of a request that the socket server will accept (protection against OOM)
#socket.request.max.bytes=104857600
############################# Internal Topic Settings #############################
# The replication factor for the group metadata internal topics "__consumer_offsets" and "__transaction_state"
# For anything other than development testing, a value greater than 1 is recommended for to ensure availability such as 3.
#offsets.topic.replication.factor=1
#transaction.state.log.replication.factor=1
#transaction.state.log.min.isr=1
############################# Log Flush Policy #############################
# Messages are immediately written to the filesystem but by default we only fsync() to sync
# the OS cache lazily. The following configurations control the flush of data to disk.
# There are a few important trade-offs here:
# 1. Durability: Unflushed data may be lost if you are not using replication.
# 2. Latency: Very large flush intervals may lead to latency spikes when the flush does occur as there will be a lot of data to flush.
# 3. Throughput: The flush is generally the most expensive operation, and a small flush interval may lead to excessive seeks.
# The settings below allow one to configure the flush policy to flush data after a period of time or
# every N messages (or both). This can be done globally and overridden on a per-topic basis.
# The number of messages to accept before forcing a flush of data to disk
#log.flush.interval.messages=10000
# The maximum amount of time a message can sit in a log before we force a flush
#log.flush.interval.ms=1000
############################# Log Retention Policy #############################
# The following configurations control the disposal of log segments. The policy can
# be set to delete segments after a period of time, or after a given size has accumulated.
# A segment will be deleted whenever *either* of these criteria are met. Deletion always happens
# from the end of the log.
# https://cwiki.apache.org/confluence/display/KAFKA/KIP-186%3A+Increase+offsets+retention+default+to+7+days
offsets.retention.minutes=10080
# The minimum age of a log file to be eligible for deletion due to age
log.retention.hours=-1
# A size-based retention policy for logs. Segments are pruned from the log unless the remaining
# segments drop below log.retention.bytes. Functions independently of log.retention.hours.
#log.retention.bytes=1073741824
# The maximum size of a log segment file. When this size is reached a new log segment will be created.
#log.segment.bytes=1073741824
# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies
#log.retention.check.interval.ms=300000
############################# Zookeeper #############################
# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
zookeeper.connect=zoo-0.zoo:2181,zoo-1.zoo:2181,zoo-2.zoo:2181
# Timeout in ms for connecting to zookeeper
#zookeeper.connection.timeout.ms=6000
############################# Group Coordinator Settings #############################
# The following configuration specifies the time, in milliseconds, that the GroupCoordinator will delay the initial consumer rebalance.
# The rebalance will be further delayed by the value of group.initial.rebalance.delay.ms as new members join the group, up to a maximum of max.poll.interval.ms.
# The default value for this is 3 seconds.
# We override this to 0 here as it makes for a better out-of-the-box experience for development and testing.
# However, in production environments the default value of 3 seconds is more suitable as this will help to avoid unnecessary, and potentially expensive, rebalances during application startup.
#group.initial.rebalance.delay.ms=0
log4j.properties: |-
# Unspecified loggers and loggers with additivity=true output to server.log and stdout
# Note that INFO only applies to unspecified loggers, the log level of the child logger is used otherwise
log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=[%d] %p %m (%c)%n
log4j.appender.kafkaAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.kafkaAppender.DatePattern='.'yyyy-MM-dd-HH
log4j.appender.kafkaAppender.File=${kafka.logs.dir}/server.log
log4j.appender.kafkaAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.kafkaAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
log4j.appender.stateChangeAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.stateChangeAppender.DatePattern='.'yyyy-MM-dd-HH
log4j.appender.stateChangeAppender.File=${kafka.logs.dir}/state-change.log
log4j.appender.stateChangeAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.stateChangeAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
log4j.appender.requestAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.requestAppender.DatePattern='.'yyyy-MM-dd-HH
log4j.appender.requestAppender.File=${kafka.logs.dir}/kafka-request.log
log4j.appender.requestAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.requestAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
log4j.appender.cleanerAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.cleanerAppender.DatePattern='.'yyyy-MM-dd-HH
log4j.appender.cleanerAppender.File=${kafka.logs.dir}/log-cleaner.log
log4j.appender.cleanerAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.cleanerAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
log4j.appender.controllerAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.controllerAppender.DatePattern='.'yyyy-MM-dd-HH
log4j.appender.controllerAppender.File=${kafka.logs.dir}/controller.log
log4j.appender.controllerAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.controllerAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
log4j.appender.authorizerAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.authorizerAppender.DatePattern='.'yyyy-MM-dd-HH
log4j.appender.authorizerAppender.File=${kafka.logs.dir}/kafka-authorizer.log
log4j.appender.authorizerAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.authorizerAppender.layout.ConversionPattern=[%d] %p %m (%c)%n
# Change the two lines below to adjust ZK client logging
log4j.logger.org.I0Itec.zkclient.ZkClient=INFO
log4j.logger.org.apache.zookeeper=INFO
# Change the two lines below to adjust the general broker logging level (output to server.log and stdout)
log4j.logger.kafka=INFO
log4j.logger.org.apache.kafka=INFO
# Change to DEBUG or TRACE to enable request logging
log4j.logger.kafka.request.logger=WARN, requestAppender
log4j.additivity.kafka.request.logger=false
# Uncomment the lines below and change log4j.logger.kafka.network.RequestChannel$ to TRACE for additional output
# related to the handling of requests
#log4j.logger.kafka.network.Processor=TRACE, requestAppender
#log4j.logger.kafka.server.KafkaApis=TRACE, requestAppender
#log4j.additivity.kafka.server.KafkaApis=false
log4j.logger.kafka.network.RequestChannel$=WARN, requestAppender
log4j.additivity.kafka.network.RequestChannel$=false
log4j.logger.kafka.controller=TRACE, controllerAppender
log4j.additivity.kafka.controller=false
log4j.logger.kafka.log.LogCleaner=INFO, cleanerAppender
log4j.additivity.kafka.log.LogCleaner=false
log4j.logger.state.change.logger=TRACE, stateChangeAppender
log4j.additivity.state.change.logger=false
# Change to DEBUG to enable audit log for the authorizer
log4j.logger.kafka.authorizer.logger=WARN, authorizerAppender
log4j.additivity.kafka.authorizer.logger=false
As per this: Launching multiple Kafka brokers fails, it's an issue with log.dirs in your server.properties where it can't be the same for all your brokers or it can't be shared.
You can probably use the ${HOSTNAME##*-} bash environment setting to modify your container entrypoint script that in of itself modifies your server.properties before the start, but the downside of that is that you are going to have to rebuild your Docker image.
Another strategy using StatefulSets is described here: How to pass args to pods based on Ordinal Index in StatefulSets?. But you will also have to make changes on how the Kafka entrypoint is called.
You could also try using completely different volumes for each of your Kafka broker pods.
First you must see the server configuration in the server.properties file.
~/kafka_2.11-2.1.0/bin$ egrep -v '^#|^$' ../config/server.properties
broker.id=0
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/tmp/kafka-logs
...
Here you can see an attribute called log.dirs and a directory /tmp/kafka-logs as a value. Make sure that the directory has the right permissions for the user you are using to start the Kafka process.
~/kafka_2.11-2.1.0/bin$ ls -lrtd /tmp/kafka-logs
drwxr-xr-x 2 kafkauser kafkauser 4096 mar 1 08:26 /tmp/kafka-logs
Rremove all files under /tmp/kafka-logs
~/kafka_2.11-2.1.0/bin$ rm -fr /tmp/kafka-logs/*
And finally try again. Probably your problem is solved.

filebeat-kafka:WARN producer/broker/0 maximum request accumulated, waiting for space

when filebeat output data to kafka , there are many warning message in filebeat log.
..
*WARN producer/broker/0 maximum request accumulated, waiting for space
*WARN producer/broker/0 maximum request accumulated, waiting for space
..
nothing special in my filebeat config:
..
output.kafka:
hosts: ["localhost:9092"]
topic: "log-oneday"
..
i have also updated these socket setting in kafka:
...
socket.send.buffer.bytes=10240000
socket.receive.buffer.bytes=10240000
socket.request.max.bytes=1048576000
queued.max.requests=1000
...
but it did not work.
is there something i missing? or i have to increase those number bigger?
besides , no error or exception found in kafka server log
is there any expert have any idea about this ?
thanks
Apparently you have only one partition in your topic. Try to increase partitions for the topic. See the links below for more information.
More Partitions Lead to Higher Throughput
https://www.confluent.io/blog/how-to-choose-the-number-of-topicspartitions-in-a-kafka-cluster/
https://kafka.apache.org/documentation/#basic_ops_modify_topic
Try the following command (replacing info with your particular use case):
bin/kafka-topics.sh --zookeeper zk_host:port/chroot --alter --topic my_topic_name --partitions 40
You need to configure 3 things:
Brokers
Filebeat kafka output
Consumer
Here a example (change paths according your environment).
Broker configuration:
# open kafka server configuration file
vim /opt/kafka/config/server.properties
# add this line
# The largest record batch size allowed by Kafka.
message.max.bytes=100000000
# restart kafka service
systemctl restart kafka.service
Filebeat kafka output:
output.kafka:
...
max_message_bytes: 100000000
Consumer configuration:
# larger than the max.message.size
max.partition.fetch.bytes=200000000

Kafka: Connection issues - All servers failed to process request"

Setup
We have a 3 node kafka cluster, processing messages coming in through nginx. The nginx hands it off to a php which in turn forks a python process and calls the KafkaClient, SimpleProducer & Send_Message
The zookeeper is running on the same host as kafka, nginx is on a separate host. The ports 2181, 2182, 3888, 9092 are all open. No errors seen in starting zookeeper, kafka. All this setup is on AWS in the same vpc.
Kafka & Zookeeper is running as kafka user, Nginx is running as nginx, php-fpm running as apache
Versions
Kafka: 0.8.2
Python: 2.7.5
Relevant snippets from property files.
zookeeper.properties
dataDir=/data/zookeeper
clientPort=2181
maxClientCnxns=100
tickTime=2000
initLimit=5
syncLimit=2
server.1=172.31.41.78:2888:3888
server.2=172.31.45.245:2888:3888
server.3=172.31.23.101:2888:3888
producer.properties
metadata.broker.list=172.31.41.78:9092,172.31.45.245:9092,172.31.23.101:9092
consumer.properties
zookeeper.connect=172.31.41.78:2181,172.31.45.245:2181,172.31.23.101:2181
zookeeper.connection.timeout.ms=6000
server.properties (setup with appropriate IP on other machines)
port=9092
advertised.host.name=172.31.41.78
host.name=172.31.41.78
zookeeper.connect=172.31.41.78:2181,172.31.45.245:2181,172.31.23.101:2181
zookeeper.connection.timeout.ms=60000
php code
function sendDataToKafka($_data,$_srcType) {
try{
$pyKafka = "/usr/bin/python /etc/nginx/html/postMon.py ".$_srcType;
$dspec = array(
0 => array("pipe","r"),
1 => array("pipe","w"),
2 => array("file","/dev/null", "a")
);
$process = proc_open($pyKafka,$dspec,$pipes);
if (is_resource($process)) {
if(fwrite($pipes[0],$_data) == true){
fclose($pipes[0]);
echo stream_get_contents($pipes[1]);
fclose($pipes[1]);
proc_close($process);
echo "Process completed";
python code
import sys,json,time,ConfigParser
import traceback
sys.path.append("/etc/nginx/html/kafka_python-0.9.4-py2.7.egg")
from kafka import KafkaClient,SimpleProducer
try:
srcMap = {
'Alert' : 'alerts'
}
topic = srcMap.get(sys.argv[1],'events')
data = ''
data = 'Testing static Kafka message'
print 'Host: 172.31.23.101:9092'
kafka = KafkaClient("172.31.23.101:9092")
producer = SimpleProducer(kafka,random_start=True)
producer.send_messages(topic,data);
except Exception as e: # most generic exception you can catch
print str(e)
Scenarios
Scenario 1:
Running a
bin/kafka-console-producer.sh --zookeeper 172.31.41.78:2181,172.31.45.245:2181,172.31.23.101:2181 --topic alerts
on 1 shell
and
running
./kafka-console-consumer.sh --zookeeper 172.31.41.78:2181,172.31.45.245:2181,172.31.23.101:2181 --topic alerts
we are able to view messages
Scenario 2:
Running the python code command line (from the nginx host), able to view messages from the consumer
Scenario 3:
Running the php code command line (from nginx host), able to view messages from the consumer
Scenario 4:
Running from a REST client (as POSTMAN)/ CURL using the REST URL, get the following message:
<html>
<body>
Host: 172.31.23.101:9092
All servers failed to process request
<pre>Process completed</pre></body>
<html>
This shows, traffic going to nginx, nginx executing the php & python scripts, but erroring out when the first call to Kafka is made - KafkaClient happens. Somehow the python is unable to access Kafka.
Don't know if this is user permission/ silly config mistake.
Also......
We have a similar working setup in another vpc
The security groups, config files, codebase properties etc. are consistent
Upgrade options are not a possibility in near term
Any pointers/help/fresh pair of eyes would really help us going.
Thanks !
Finally figured the apache user did not have "right" permission.
selinuxconlist apache helped fix the issue.

How to enable remote JMX on Kafka brokers (for JmxTool)?

I enabled JMX on Kafka brokers by adding
KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false
-Djava.rmi.server.hostname=<server_IP>
-Djava.net.preferIPv4Stack=true"
However, when I use kafka.tools.JmxTool to get the JMX metrics, it outputs Unix timestamps only. Why?
./bin/kafka-run-class.sh kafka.tools.JmxTool \
--object-name 'kafka.server:type=BrokerTopicMetrics,name=AllTopicsMessagesInPerSec' \
--jmx-url "service:jmx:rmi:///jndi/rmi://<server_IP>:9111/jmxrmi"
How can I have it print out the metrics?
Edit bin/kafka-run-class.sh and set KAFKA_JMX_OPTS variable
KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=your.kafka.broker.hostname -Djava.net.preferIPv4Stack=true"
Update bin/kafka-server-start.sh add the below line
export JMX_PORT=PORT
You must set 'JMX_PORT' variable, or add the following line to bin/kafka-server-start.sh.
export JMX_PORT=${JMX_PORT:-9999}
then you will be able to connect to Kafka JMX metrics. I use jconsole tool and 'localhost:9999' address.
Setting JMX_PORT inside bin/kafka-run-class.sh will clash with Zookeeper, if you are running Zookeeper on the same node.
Best is to set JMX port individually inside corresponding server-start scripts:
Insert line “export JMX_PORT=${JMX_PORT:-9998}” before last line in $KAFKA_HOME/bin/zookeeper-server-start.sh file.
Restart the Zookeeper server.
Repeat steps 1 and 2 for all zookeeper nodes in the cluster.
Insert line “export JMX_PORT=${JMX_PORT:-9999}” before last line in $KAFKA_HOME/bin/kafka-server-start.sh file.
Restart the Kafka Broker.
Repeat steps 4 and 5 for all brokers in the cluster.
If you're running via systemd:
edit /etc/systemd/system/multi-user.target.wants/kafka.service
in the "[service]" section add a line:
Environment=JMX_PORT=9989
reload: systemctl daemon-reload
restart: systemctl restart kafka
enjoy the beans: echo 'beans' | java -jar jmxterm-1.0-alpha-4-uber.jar -l localhost:9989 -n 2>&1
This is Kafka 2.3.0.
Using jconsole For Available MBeans
You should use jconsole first to know the names of the MBeans available.
The proper name of the MBean you wanted to query metrics of is kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec (the AllTopics prefix was used in older verions). Thanks AndyTheEntity.
Enabling Remote JMX (with no authentication or SSL)
As described in Monitoring and Management Using JMX Technology you should set certain system properties when you start the Java VM of a Kafka broker.
Kafka's bin/kafka-run-class.sh shell script makes the configuration painless as it does the basics for you and sets KAFKA_JMX_OPTS.
# JMX settings
if [ -z "$KAFKA_JMX_OPTS" ]; then
KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false "
fi
For remote JMX you should set com.sun.management.jmxremote.port that Kafka's bin/kafka-run-class.sh shell script sets using JMX_PORT environment variable.
# JMX port to use
if [ $JMX_PORT ]; then
KAFKA_JMX_OPTS="$KAFKA_JMX_OPTS -Dcom.sun.management.jmxremote.port=$JMX_PORT "
fi
With that, enabling remote JMX is as simple as the following command:
JMX_PORT=9999 ./bin/kafka-server-start.sh config/server.properties
Using JmxTool
With the above, run the JmxTool:
$ ./bin/kafka-run-class.sh kafka.tools.JmxTool \
--object-name 'kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec'
Trying to connect to JMX url: service:jmx:rmi:///jndi/rmi://:9999/jmxrmi.
"time","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:Count","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:EventType","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:FifteenMinuteRate","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:FiveMinuteRate","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:MeanRate","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:OneMinuteRate","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:RateUnit"
1567586728595,0,messages,0.0,0.0,0.0,0.0,SECONDS
1567586730597,0,messages,0.0,0.0,0.0,0.0,SECONDS
...
You could use --one-time option to print the JMX metrics just once.
$ ./bin/kafka-run-class.sh kafka.tools.JmxTool \
--object-name 'kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec' \
--one-time true
Trying to connect to JMX url: service:jmx:rmi:///jndi/rmi://:9999/jmxrmi.
"time","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:Count","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:EventType","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:FifteenMinuteRate","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:FiveMinuteRate","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:MeanRate","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:OneMinuteRate","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:RateUnit"
1567586898459,0,messages,0.0,0.0,0.0,0.0,SECONDS
vim kafka_2.11-0.10.1.1/bin/kafka-run-class.sh
and then add the first two lines and comment as I have done for other lines, (Note : after doing this Kafka scripts cannot be used for client operations for listing topics.. for your client operations you need to use a separate scripts , download again in different locations and use)
export JMX_PORT=9096
KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=<ipaddress> -Dcom.sun.management.jmxremote.port=$JMX_PORT -Dcom.sun.management.jmxremote.rmi.port=$JMX_PORT"
# JMX settings
#if [ -z "$KAFKA_JMX_OPTS" ]; then
# KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false "
#fi
# JMX port to use
#if [ $JMX_PORT ]; then
# KAFKA_JMX_OPTS="$KAFKA_JMX_OPTS -Dcom.sun.management.jmxremote.port=$JMX_PORT "
#fi
Use kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec
The AllTopics prefix was used in older verions. You can specify topic using kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec,topic=<topic-name>
src: http://grokbase.com/t/kafka/users/164ksnhff0/enable-jmx-on-kafka-brokers
Kafka has provided all you need. When your start your server, activate the KAFKA_JMX_OPTS arg by using these command:
$KAFKA_JMX_OPTS JMX_PORT=[your_port_number] ./kafka-server-start.sh -daemon ../config/server.properties
Using those command, you activated JMX Remote and related port. Then you can connect your JConsole or another monitoring tools.
Just before calling kafka-server-start.sh add following exports. It worked like a charm for my case. You can set desired port for JMX_PORT and you should set broker for $BROKER_IP part.
export JMX_PORT=9900
export KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=$BROKER_IP -Djava.net.preferIPv4Stack=true"
This is standard Kafka start procedure:
bin/kafka-server-start.sh config/server.properties
This is Kafka start procedure with JMX:
JMX_PORT=8004 bin/kafka-server-start.sh config/server.properties

Mosquitto not publishing on SYS topics

I am using mosquitto server as mqtt broker. I testing mosquitto for performance and I need to subscribe to $SYS hierarchy for some data like number of currently connected from $SYS/broker/clients/active topic. I have following mosquitto config file.
# Place your local configuration in /etc/mosquitto/conf.d/
#
# A full description of the configuration file is at
# /usr/share/doc/mosquitto/examples/mosquitto.conf.example
pid_file /var/run/mosquitto.pid
persistence true
persistence_location /var/lib/mosquitto/
log_dest file /var/log/mosquitto/mosquitto.log
include_dir /etc/mosquitto/conf.d
listener 1884
protocol websockets
listener 1883
sys_interval 1
max_connections -1
I am subscribing to $SYS/broker/clients/active topic like this
ubuntu#linux-box:/etc/mosquitto$ mosquitto_sub -d -t $SYS/broker/clients/active
Client mosqsub/28715-ip-172-31 sending CONNECT
Client mosqsub/28715-ip-172-31 received CONNACK
Client mosqsub/28715-ip-172-31 sending SUBSCRIBE (Mid: 1, Topic: /broker/clients/active, QoS: 0)
Client mosqsub/28715-ip-172-31 received SUBACK
Subscribed (mid: 1): 0
The sys_interval in config file is 1, but I am not receiving any updates on above subscription. I also tried this subscribing to some alternative topics, but still not receiving anything. Mosquitto server is hosted on an AWS micro instance with linux operating system.
$SYS is being interpreted by you shell as an environment variable and is being substituted before mosquito_sub sees it. You can see this has happened by the topic string it reports in the SUBSCRIBE log statement.
You should put quotes around the topic string:
$ mosquitto_sub -d -t '$SYS/broker/clients/active'