We are trying to set up an Apache Storm multi-node cluster using 4 machines.
The machines and the config files we currently use are listed below:
a) Zookeeper Server : 10.135.155.133 (running on Windows 7)
b) Nimbus Host : 10.135.158.22 (running on CentOS)
c) Supervisor1 : 10.135.156.63 (running on CentOS)
d) Supervisor2 : 10.135.156.162 (running on CentOS)
On the ZooKeeper server running on Windows we have the following configuration in zoo.cfg:
tickTime=2000
initLimit=10
syncLimit=5
dataDir=d:\\tmp\\zookeeper
clientPort=2181
On the Nimbus host we have the following storm.yaml configuration:
storm.zookeeper.servers:
- "10.135.155.133"
nimbus.host: "nimbus1"
storm.local.dir: "/storm/apache-storm-1.1.0/lib/"
storm.zookeeper.port: 2181
nimbus1=10.135.158.22
On Supervisor 1 and Supervisor 2 we have the following storm.yaml:
storm.zookeeper.servers:
- "10.135.155.133"
nimbus.host: "nimbus1"
storm.local.dir: "/storm/apache-storm-1.1.0/lib/"
supervisor.slots.ports:
- 6704
- 6705
- 6706
- 6707
The problem appears when we run ZooKeeper, the Nimbus host, Supervisor 1 and Supervisor 2 together.
Only one supervisor is shown in the Storm UI at a time. When we refresh the UI, the two supervisors alternate, but they are never displayed together. Both are listed with the name localhost.
How can I get both supervisors to show in the Storm UI at the same time?
What additional configuration is needed to achieve this?
I have a very simple setup. I am trying to start ZooKeeper (apache-zookeeper-3.6.1-bin) on two machines. I get the following error when I run zkServer.sh status:
/cygdrive/c/ZooKeeper/apache-zookeeper-3.6.1-bin/apache-zookeeper-3.6.1-bin
$ bin/zkServer.sh restart
ZooKeeper JMX enabled by default
Using config: C:\ZooKeeper\apache-zookeeper-3.6.1-bin\apache-zookeeper-3.6.1-bin\conf\zoo.cfg
ZooKeeper JMX enabled by default
Using config: C:\ZooKeeper\apache-zookeeper-3.6.1-bin\apache-zookeeper-3.6.1-bin\conf\zoo.cfg
Stopping zookeeper ... STOPPED
ZooKeeper JMX enabled by default
Using config: C:\ZooKeeper\apache-zookeeper-3.6.1-bin\apache-zookeeper-3.6.1-bin\conf\zoo.cfg
Starting zookeeper ... STARTED
kalsa#CO01EAP00000027 /cygdrive/c/ZooKeeper/apache-zookeeper-3.6.1-bin/apache-zookeeper-3.6.1-bin
$ bin/zkServer.sh status
ZooKeeper JMX enabled by default
Using config: C:\ZooKeeper\apache-zookeeper-3.6.1-bin\apache-zookeeper-3.6.1-bin\conf\zoo.cfg
cat: '/tmp/zookeeper/'$'\r''/myid': No such file or directory
clientPort not found and myid could not be determined. Terminating.
My Zoo.cfg
tickTime=5000
dataDir=/tmp/zookeeper/
clientPort=2181
initLimit=5
syncLimit=2
server.1=XYZ:2888:3888
server.2=ABC:2888:3888
I have the proper IPs in place of XYZ and ABC.
I have created the myid file as well. Can someone let me know if I am missing anything obvious?
The owner of dataDir should be the zookeeper user. If it is not, change the owner.
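The `$'\r'` inside the path in the `cat` error above is a carriage return, which usually means zoo.cfg was saved with Windows (CRLF) line endings, so zkServer.sh reads dataDir as `/tmp/zookeeper/` followed by `\r`. A minimal Python sketch for spotting and stripping such line endings is below. The function names are made up for illustration, and the sample text mirrors the first lines of the zoo.cfg in the question.

```python
def find_crlf_lines(text):
    """Return the 1-based numbers of lines that end with a carriage return."""
    return [
        number
        for number, line in enumerate(text.split("\n"), start=1)
        if line.endswith("\r")
    ]


def to_unix_newlines(text):
    """Convert CRLF line endings to plain LF."""
    return text.replace("\r\n", "\n")


# Sample config text with Windows line endings (assumed, for illustration)
sample = "tickTime=5000\r\ndataDir=/tmp/zookeeper/\r\nclientPort=2181\r\n"

print(find_crlf_lines(sample))
print(find_crlf_lines(to_unix_newlines(sample)))
```

To run this against a real file, open it with `open(path, newline="")` so Python does not translate the line endings while reading.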
I have the following requirement:
Kafka needs to listen on multiple interfaces, one external and one internal. All other components within the system will connect to Kafka on the internal interface.
At installation time the internal IPs on other hosts are not reachable. Some configuration is needed to make them reachable, and we do not have control over that. So assume that when Kafka is coming up, the internal IPs on the nodes cannot reach each other.
Scenario:
I have two nodes in cluster:
node1 (External IP: 10.10.10.4, Internal IP: 5.5.5.4)
node2 (External IP: 10.10.10.5, Internal IP: 5.5.5.5)
During installation, 10.10.10.4 can ping 10.10.10.5 and vice versa, but 5.5.5.4 cannot reach 5.5.5.5. That only becomes possible once the Kafka installation is done and someone then does some configuration; before the Kafka installation, we cannot make them reachable.
The requirement is that the Kafka brokers exchange messages on the 10.10.10.X interface so that the cluster forms, while clients send messages on the 5.5.5.X interface.
What I tried was as below:
listeners=USERS://0.0.0.0:9092,REPLICATION://0.0.0.0:9093
advertised.listeners=USERS://5.5.5.5:9092,REPLICATION://5.5.5.5:9093
Where 5.5.5.5 is the internal ip address.
But with this, while restarting kafka, I see below logs:
{"log":"[2020-06-23 19:05:34,923] INFO Creating /brokers/ids/2 (is it secure? false) (kafka.zk.KafkaZkClient)\n","stream":"stdout","time":"2020-06-23T19:05:34.923403973Z"}
{"log":"[2020-06-23 19:05:34,925] INFO Result of znode creation at /brokers/ids/2 is: OK (kafka.zk.KafkaZkClient)\n","stream":"stdout","time":"2020-06-23T19:05:34.925237419Z"}
{"log":"[2020-06-23 19:05:34,926] INFO Registered broker 2 at path /brokers/ids/2 with addresses: ArrayBuffer(EndPoint(5.5.5.5,9092,ListenerName(USERS),PLAINTEXT), EndPoint(5.5.5.5,9093,ListenerName(REPLICATION),PLAINTEXT)) (kafka.zk.KafkaZkClient)\n","stream":"stdout","time":"2020-06-23T19:05:34.926127438Z"}
.....
{"log":"[2020-06-23 19:05:35,078] INFO Kafka version : 1.1.0 (org.apache.kafka.common.utils.AppInfoParser)\n","stream":"stdout","time":"2020-06-23T19:05:35.078444509Z"}
{"log":"[2020-06-23 19:05:35,078] INFO Kafka commitId : fdcf75ea326b8e07 (org.apache.kafka.common.utils.AppInfoParser)\n","stream":"stdout","time":"2020-06-23T19:05:35.078471358Z"}
{"log":"[2020-06-23 19:05:35,079] INFO [KafkaServer id=2] started (kafka.server.KafkaServer)\n","stream":"stdout","time":"2020-06-23T19:05:35.079436798Z"}
{"log":"[2020-06-23 19:05:35,136] ERROR [KafkaApi-2] Number of alive brokers '0' does not meet the required replication factor '2' for the offsets topic (configured via 'offsets.topic.replication.factor'). This error can be ignored if the cluster is starting up and not all brokers are up yet. (kafka.server.KafkaApis)\n","stream":"stdout","time":"2020-06-23T19:05:35.136792119Z"}
After that, this message appears continuously:
{"log":"[2020-06-23 19:05:35,166] ERROR [KafkaApi-2] Number of alive brokers '0' does not meet the required replication factor '2' for the offsets topic (configured via 'offsets.topic.replication.factor'). This error can be ignored if the cluster is starting up and not all brokers are up yet. (kafka.server.KafkaApis)\n","stream":"stdout","time":"2020-06-23T19:05:35.166895344Z"}
Is there any way we can achieve that?
With regards,
-M-
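One possible direction, sketched under assumptions: the attempt above advertises 5.5.5.5 for both listeners, so the other broker is also told to use the internal address for replication. The Python helper below (a hypothetical function, not part of Kafka) renders a variant where REPLICATION is advertised on the external address and `inter.broker.listener.name` points the brokers at that listener. `listener.security.protocol.map` and `inter.broker.listener.name` are standard broker settings, but whether this removes the "alive brokers '0'" error in this environment is untested.

```python
def broker_listener_properties(external_ip, internal_ip,
                               client_port=9092, replication_port=9093):
    """Render server.properties lines for one broker (illustrative only)."""
    lines = [
        # Bind both listeners on all interfaces, as in the original attempt
        "listeners=USERS://0.0.0.0:%d,REPLICATION://0.0.0.0:%d"
        % (client_port, replication_port),
        # Clients are told the internal IP, brokers the external IP
        "advertised.listeners=USERS://%s:%d,REPLICATION://%s:%d"
        % (internal_ip, client_port, external_ip, replication_port),
        # Custom listener names need a security protocol mapping
        "listener.security.protocol.map=USERS:PLAINTEXT,REPLICATION:PLAINTEXT",
        # Brokers talk to each other over the REPLICATION listener
        "inter.broker.listener.name=REPLICATION",
    ]
    return "\n".join(lines)


# node2 from the scenario: external 10.10.10.5, internal 5.5.5.5
print(broker_listener_properties("10.10.10.5", "5.5.5.5"))
```

Each node would use its own pair of addresses, for example `broker_listener_properties("10.10.10.4", "5.5.5.4")` for node1.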
I am trying to set up a small Spark cluster for testing. The cluster consists of 3 workers and one master.
On each node I set up Java, Scala and Spark.
The configuration files are as follows:
spark-defaults.conf:
spark.master spark://test01.scem:7077
spark.eventLog.enabled true
spark.eventLog.dir hdfs://test01.scem/user/spark/applicationHistory
spark.executor.memory 4g
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.driver.memory 5g
spark.yarn.archive hdfs://test01.scem/user/spark
spark-env.sh
export SPARK_CONF_DIR=/usr/hadoop/spark-2.1.0-bin-hadoop2.7/conf
export SPARK_LOG_DIR=/var/log/spark
export SPARK_PID_DIR=/var/run/spark
export HADOOP_HOME=${HADOOP_HOME:-/usr/hadoop}
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-/usr/hadoop/etc/hadoop}
I am able to start all nodes with start-all.sh, but I receive an error message when starting the shell (spark-shell).
I tried all available methods to view the UI for the Spark cluster, but no luck. Any help, please?
The error message I receive is:
WARN client.StandaloneAppClient$ClientEndpoint: Failed to connect to master test01.scem:7077
org.apache.spark.SparkException: Exception thrown in awaitResult
The jps output on each node is:
Master {18097 JobHistoryServer, 21249 Jps, 20758 NameNode, 20440 ResourceManager}
slaves {11456 JobHistoryServer, 15409 Jps, 15092 DataNode, 14799 NodeManager}
Check if you can ping the master. If you can, check whether port 7077 is occupied on the master using the netstat command. If both are true, it may be a firewall issue.
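The reachability part of that check can also be done from a worker node without netstat. The sketch below is a generic TCP probe using Python's standard `socket` module; the helper name `can_connect` is an assumption for illustration, and a `False` result does not by itself distinguish a firewall from a master that is not listening.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures
        return False
```

From a worker, `can_connect("test01.scem", 7077)` would probe the master address used in spark-defaults.conf.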
Setup
We have a 3-node Kafka cluster processing messages coming in through nginx. Nginx hands each request off to a PHP script, which in turn forks a Python process that calls KafkaClient, SimpleProducer and send_messages.
ZooKeeper is running on the same host as Kafka; nginx is on a separate host. Ports 2181, 2182, 3888 and 9092 are all open. No errors are seen when starting ZooKeeper or Kafka. The whole setup is on AWS in the same VPC.
Kafka and ZooKeeper run as the kafka user, nginx runs as nginx, and php-fpm runs as apache.
Versions
Kafka: 0.8.2
Python: 2.7.5
Relevant snippets from property files.
zookeeper.properties
dataDir=/data/zookeeper
clientPort=2181
maxClientCnxns=100
tickTime=2000
initLimit=5
syncLimit=2
server.1=172.31.41.78:2888:3888
server.2=172.31.45.245:2888:3888
server.3=172.31.23.101:2888:3888
producer.properties
metadata.broker.list=172.31.41.78:9092,172.31.45.245:9092,172.31.23.101:9092
consumer.properties
zookeeper.connect=172.31.41.78:2181,172.31.45.245:2181,172.31.23.101:2181
zookeeper.connection.timeout.ms=6000
server.properties (setup with appropriate IP on other machines)
port=9092
advertised.host.name=172.31.41.78
host.name=172.31.41.78
zookeeper.connect=172.31.41.78:2181,172.31.45.245:2181,172.31.23.101:2181
zookeeper.connection.timeout.ms=60000
php code
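function sendDataToKafka($_data,$_srcType) {
    try {
        // Run the Python producer, passing the source type as its argument
        $pyKafka = "/usr/bin/python /etc/nginx/html/postMon.py ".$_srcType;
        $dspec = array(
            0 => array("pipe","r"),              // stdin: payload written by PHP
            1 => array("pipe","w"),              // stdout: read back below
            2 => array("file","/dev/null", "a")  // stderr: discarded
        );
        $process = proc_open($pyKafka,$dspec,$pipes);
        if (is_resource($process)) {
            if(fwrite($pipes[0],$_data) == true){
                fclose($pipes[0]);
                echo stream_get_contents($pipes[1]);
                fclose($pipes[1]);
                proc_close($process);
                echo "Process completed";
            }
        }
    } catch (Exception $e) {
        // The posted snippet ends before this point; a minimal handler is assumed
        echo $e->getMessage();
    }
}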
python code
import sys,json,time,ConfigParser
import traceback
sys.path.append("/etc/nginx/html/kafka_python-0.9.4-py2.7.egg")
from kafka import KafkaClient,SimpleProducer
try:
    # Map the source type from the command line to a topic; default is 'events'
    srcMap = {
        'Alert' : 'alerts'
    }
    topic = srcMap.get(sys.argv[1],'events')
    data = ''
    data = 'Testing static Kafka message'
    print 'Host: 172.31.23.101:9092'
    kafka = KafkaClient("172.31.23.101:9092")
    producer = SimpleProducer(kafka,random_start=True)
    producer.send_messages(topic,data)
except Exception as e: # most generic exception you can catch
    print str(e)
Scenarios
Scenario 1:
Running
bin/kafka-console-producer.sh --zookeeper 172.31.41.78:2181,172.31.45.245:2181,172.31.23.101:2181 --topic alerts
in one shell and
./kafka-console-consumer.sh --zookeeper 172.31.41.78:2181,172.31.45.245:2181,172.31.23.101:2181 --topic alerts
in another, we are able to view messages.
Scenario 2:
Running the python code command line (from the nginx host), able to view messages from the consumer
Scenario 3:
Running the php code command line (from nginx host), able to view messages from the consumer
Scenario 4:
Running from a REST client (as POSTMAN)/ CURL using the REST URL, get the following message:
<html>
<body>
Host: 172.31.23.101:9092
All servers failed to process request
<pre>Process completed</pre></body>
<html>
This shows traffic reaching nginx and nginx executing the PHP and Python scripts, but the run errors out at the first call to Kafka, when KafkaClient is created. Somehow the Python process is unable to access Kafka.
I don't know if this is a user permission issue or a silly config mistake.
Also......
We have a similar working setup in another vpc
The security groups, config files, codebase properties etc. are consistent
Upgrade options are not a possibility in near term
Any pointers, help or a fresh pair of eyes would really help us get going.
Thanks !
Finally figured out that the apache user did not have the right permissions.
selinuxconlist apache helped fix the issue.
I enabled JMX on Kafka brokers by adding
KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false
-Djava.rmi.server.hostname=<server_IP>
-Djava.net.preferIPv4Stack=true"
However, when I use kafka.tools.JmxTool to get the JMX metrics, it outputs Unix timestamps only. Why?
./bin/kafka-run-class.sh kafka.tools.JmxTool \
--object-name 'kafka.server:type=BrokerTopicMetrics,name=AllTopicsMessagesInPerSec' \
--jmx-url "service:jmx:rmi:///jndi/rmi://<server_IP>:9111/jmxrmi"
How can I have it print out the metrics?
Edit bin/kafka-run-class.sh and set the KAFKA_JMX_OPTS variable:
KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=your.kafka.broker.hostname -Djava.net.preferIPv4Stack=true"
Update bin/kafka-server-start.sh and add the line below:
export JMX_PORT=PORT
You must set the JMX_PORT variable, or add the following line to bin/kafka-server-start.sh:
export JMX_PORT=${JMX_PORT:-9999}
Then you will be able to connect to the Kafka JMX metrics. I use the jconsole tool with the address 'localhost:9999'.
Setting JMX_PORT inside bin/kafka-run-class.sh will clash with Zookeeper, if you are running Zookeeper on the same node.
It is best to set the JMX port individually inside the corresponding server-start scripts:
1. Insert the line "export JMX_PORT=${JMX_PORT:-9998}" before the last line in the $KAFKA_HOME/bin/zookeeper-server-start.sh file.
2. Restart the Zookeeper server.
3. Repeat steps 1 and 2 for all zookeeper nodes in the cluster.
4. Insert the line "export JMX_PORT=${JMX_PORT:-9999}" before the last line in the $KAFKA_HOME/bin/kafka-server-start.sh file.
5. Restart the Kafka Broker.
6. Repeat steps 4 and 5 for all brokers in the cluster.
If you're running via systemd:
edit /etc/systemd/system/multi-user.target.wants/kafka.service
in the "[Service]" section add a line:
Environment=JMX_PORT=9989
reload: systemctl daemon-reload
restart: systemctl restart kafka
enjoy the beans: echo 'beans' | java -jar jmxterm-1.0-alpha-4-uber.jar -l localhost:9989 -n 2>&1
This is Kafka 2.3.0.
Using jconsole For Available MBeans
You should use jconsole first to know the names of the MBeans available.
The proper name of the MBean you wanted to query metrics of is kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec (the AllTopics prefix was used in older versions). Thanks AndyTheEntity.
Enabling Remote JMX (with no authentication or SSL)
As described in Monitoring and Management Using JMX Technology, you should set certain system properties when you start the Java VM of a Kafka broker.
Kafka's bin/kafka-run-class.sh shell script makes the configuration painless, as it does the basics for you and sets KAFKA_JMX_OPTS:
# JMX settings
if [ -z "$KAFKA_JMX_OPTS" ]; then
KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false "
fi
For remote JMX you should set com.sun.management.jmxremote.port, which Kafka's bin/kafka-run-class.sh shell script sets from the JMX_PORT environment variable:
# JMX port to use
if [ $JMX_PORT ]; then
KAFKA_JMX_OPTS="$KAFKA_JMX_OPTS -Dcom.sun.management.jmxremote.port=$JMX_PORT "
fi
With that, enabling remote JMX is as simple as the following command:
JMX_PORT=9999 ./bin/kafka-server-start.sh config/server.properties
Using JmxTool
With the above, run the JmxTool:
$ ./bin/kafka-run-class.sh kafka.tools.JmxTool \
--object-name 'kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec'
Trying to connect to JMX url: service:jmx:rmi:///jndi/rmi://:9999/jmxrmi.
"time","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:Count","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:EventType","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:FifteenMinuteRate","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:FiveMinuteRate","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:MeanRate","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:OneMinuteRate","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:RateUnit"
1567586728595,0,messages,0.0,0.0,0.0,0.0,SECONDS
1567586730597,0,messages,0.0,0.0,0.0,0.0,SECONDS
...
You could use --one-time option to print the JMX metrics just once.
$ ./bin/kafka-run-class.sh kafka.tools.JmxTool \
--object-name 'kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec' \
--one-time true
Trying to connect to JMX url: service:jmx:rmi:///jndi/rmi://:9999/jmxrmi.
"time","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:Count","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:EventType","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:FifteenMinuteRate","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:FiveMinuteRate","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:MeanRate","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:OneMinuteRate","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:RateUnit"
1567586898459,0,messages,0.0,0.0,0.0,0.0,SECONDS
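In this output the first column of every row is the sample time in milliseconds since the epoch, and the remaining columns follow the order of the quoted header. A small Python sketch that turns such output into dictionaries keyed by attribute name is below; the function name is hypothetical, and the sample is a shortened version of the output above with only three columns.

```python
import csv
import io


def parse_jmxtool_output(text):
    """Parse JmxTool CSV output into a list of {attribute: value} dicts."""
    # Drop blank lines and the "Trying to connect ..." banner line
    lines = [
        line for line in text.splitlines()
        if line and not line.startswith("Trying to connect")
    ]
    reader = csv.reader(io.StringIO("\n".join(lines)))
    header = next(reader)
    # Keep only the attribute after the last ':' (for example 'Count')
    names = [column.rsplit(":", 1)[-1] for column in header]
    return [dict(zip(names, row)) for row in reader]


# Shortened sample based on the output above (assumed, for illustration)
sample = (
    "Trying to connect to JMX url: service:jmx:rmi:///jndi/rmi://:9999/jmxrmi.\n"
    '"time","kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:Count",'
    '"kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec:OneMinuteRate"\n'
    "1567586898459,0,0.0\n"
)

for row in parse_jmxtool_output(sample):
    print(row)
```

All values come back as strings, so convert `Count` or the rate columns with `int` or `float` as needed.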
vim kafka_2.11-0.10.1.1/bin/kafka-run-class.sh
Then add the first two lines below and comment out the existing JMX lines as I have done. (Note: after doing this, these Kafka scripts cannot be used for client operations such as listing topics. For client operations you need a separate copy of the scripts: download Kafka again to a different location and use that.)
export JMX_PORT=9096
KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=<ipaddress> -Dcom.sun.management.jmxremote.port=$JMX_PORT -Dcom.sun.management.jmxremote.rmi.port=$JMX_PORT"
# JMX settings
#if [ -z "$KAFKA_JMX_OPTS" ]; then
# KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false "
#fi
# JMX port to use
#if [ $JMX_PORT ]; then
# KAFKA_JMX_OPTS="$KAFKA_JMX_OPTS -Dcom.sun.management.jmxremote.port=$JMX_PORT "
#fi
Use kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec
The AllTopics prefix was used in older versions. You can specify a topic using kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec,topic=<topic-name>
src: http://grokbase.com/t/kafka/users/164ksnhff0/enable-jmx-on-kafka-brokers
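To keep the two name forms and the JMX URL straight, here is a short Python sketch that assembles the strings passed to `--object-name` and `--jmx-url`. Both helper names are assumptions made for illustration; the string formats are copied from the commands in this thread.

```python
def broker_topic_metric(name="MessagesInPerSec", topic=None):
    """Build a BrokerTopicMetrics object name, optionally for one topic."""
    object_name = "kafka.server:type=BrokerTopicMetrics,name=%s" % name
    if topic:
        # Per-topic metrics add a topic key to the object name
        object_name += ",topic=%s" % topic
    return object_name


def jmx_url(host, port):
    """Build the JMX service URL in the form JmxTool expects."""
    return "service:jmx:rmi:///jndi/rmi://%s:%d/jmxrmi" % (host, port)


print(broker_topic_metric())
print(broker_topic_metric(topic="my-topic"))  # 'my-topic' is a placeholder
print(jmx_url("localhost", 9999))
```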
Kafka provides all you need. When you start your server, set JMX_PORT on the command line; kafka-run-class.sh then adds the port to KAFKA_JMX_OPTS:
JMX_PORT=[your_port_number] ./kafka-server-start.sh -daemon ../config/server.properties
This command activates remote JMX on the given port. You can then connect JConsole or another monitoring tool.
Just before calling kafka-server-start.sh, add the following exports. It worked like a charm in my case. You can set the desired port in JMX_PORT, and you should set BROKER_IP to the broker's address.
export JMX_PORT=9900
export KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Djava.rmi.server.hostname=$BROKER_IP -Djava.net.preferIPv4Stack=true"
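If the broker is launched from a script rather than an interactive shell, the same two variables can be prepared programmatically. The following Python sketch builds them as a dictionary; `build_jmx_env` is a hypothetical helper, and the flag values are the ones from the exports above.

```python
def build_jmx_env(broker_ip, jmx_port):
    """Return the environment variables used to enable remote JMX."""
    opts = [
        "-Dcom.sun.management.jmxremote=true",
        "-Dcom.sun.management.jmxremote.authenticate=false",
        "-Dcom.sun.management.jmxremote.ssl=false",
        "-Djava.rmi.server.hostname=%s" % broker_ip,
        "-Djava.net.preferIPv4Stack=true",
    ]
    return {
        # Environment values must be strings
        "JMX_PORT": str(jmx_port),
        "KAFKA_JMX_OPTS": " ".join(opts),
    }


# 192.0.2.10 is a documentation address standing in for the broker IP
env = build_jmx_env("192.0.2.10", 9900)
print(env["JMX_PORT"])
print(env["KAFKA_JMX_OPTS"])
```

The result can be merged into the current environment, for example `dict(os.environ, **env)`, and passed as `env=` to `subprocess.Popen` when starting bin/kafka-server-start.sh.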
This is the standard Kafka start procedure:
bin/kafka-server-start.sh config/server.properties
This is the Kafka start procedure with JMX:
JMX_PORT=8004 bin/kafka-server-start.sh config/server.properties