Can't start zookeeper - apache-kafka

I'm using the Confluent Platform. ZooKeeper shows as active when I check its status, but when I try to start Kafka with confluent it reports that ZooKeeper is down.
$ sudo service zookeeper status
Redirecting to /bin/systemctl status zookeeper.service
● zookeeper.service - Zookeeper
Loaded: loaded (/etc/systemd/system/zookeeper.service; disabled; vendor preset: disabled)
Active: active (running) since Tue 2017-08-08 17:25:34 PDT; 16h ago
Docs: http://kafka.apache.org/documentation.html
Process: 3774 ExecStop=/var/www/confluent/bin/zookeeper-server-stop (code=exited, status=1/FAILURE)
Main PID: 3785 (java)
CGroup: /system.slice/zookeeper.service
└─3785 java -Xmx512M -Xms512M -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+DisableExplicitGC -Djava.awt.headless=true -Xloggc:/var/log...
zookeeper[3785]: [2017-08-08 17:26:09,005] INFO Processed session termination for sessionid: 0x15dc460fd0c0000 (org.apache.zooke...Processor)
zookeeper[3785]: [2017-08-08 17:26:39,000] INFO Expiring session 0x15dc4364baf0004, timeout of 60000ms exceeded (org.apache.zook...perServer)
zookeeper[3785]: [2017-08-08 17:26:39,000] INFO Expiring session 0x15dc4364baf0002, timeout of 60000ms exceeded (org.apache.zook...perServer)
zookeeper[3785]: [2017-08-08 17:26:39,000] INFO Expiring session 0x15dc4364baf0003, timeout of 60000ms exceeded (org.apache.zook...perServer)
zookeeper[3785]: [2017-08-08 17:26:39,001] INFO Processed session termination for sessionid: 0x15dc4364baf0004 (org.apache.zooke...Processor)
zookeeper[3785]: [2017-08-08 17:26:39,002] INFO Processed session termination for sessionid: 0x15dc4364baf0002 (org.apache.zooke...Processor)
zookeeper[3785]: [2017-08-08 17:26:39,002] INFO Processed session termination for sessionid: 0x15dc4364baf0003 (org.apache.zooke...Processor)
zookeeper[3785]: [2017-08-09 09:56:26,711] INFO Accepted socket connection from /127.0.0.1:46446 (org.apache.zookeeper.server.NI...xnFactory)
zookeeper[3785]: [2017-08-09 09:59:14,796] WARN Exception causing close of session 0x0 due to java.io.IOException: Len error -72...erverCnxn)
zookeeper[3785]: [2017-08-09 09:59:14,796] INFO Closed socket connection for client /127.0.0.1:46446 (no session established for...erverCnxn)
Hint: Some lines were ellipsized, use -l to show in full.
$ confluent start kafka
Starting zookeeper
|Zookeeper failed to start
zookeeper is [DOWN]
Cannot start Kafka, Zookeeper is not running. Check your deployment

This is because ZooKeeper is already running. You can check for the process with
ps aux | grep zookeeper
and kill it manually; after that it will work.
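For example, since the question's ZooKeeper is managed by systemd, a minimal sketch (assuming the unit is named zookeeper, as in the status output above, and <pid> is whatever PID the ps command reports) would be:
# stop the systemd-managed instance, or fall back to killing the process by PID
sudo systemctl stop zookeeper
ps aux | grep [z]ookeeper    # the [z] keeps grep from matching itself
sudo kill <pid>              # only if the service is still running; <pid> comes from ps
confluent start kafka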

The most common cause for the message you are seeing when running:
confluent start kafka
which reports that ZooKeeper is down, is that another ZooKeeper instance is already running, so the new ZooKeeper instance cannot bind to its required port (2181 by default).
A few options for figuring out which ZooKeeper instance is already running when you issue confluent start kafka:
run jps to see the running Java processes. ZooKeeper is the process named QuorumPeerMain next to its process ID (equivalent to running ps auxww | grep -i zookeeper).
run lsof -i :2181 to find the process that has reserved the default ZooKeeper port (2181 in this example, but it might be different on your system).
Try running confluent start kafka again after stopping the above process.

I received the same message. In my case, I hadn't set the $JAVA_HOME variable properly.
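For reference, a minimal sketch of exporting JAVA_HOME before starting the Confluent services; the JDK path below is only an example and will differ on your system:
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64   # example path, point this at your JDK
export PATH=$JAVA_HOME/bin:$PATH
confluent start kafka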

You are mixing two installations.
confluent start kafka depends on ZooKeeper having been started with confluent start zookeeper.
Since you already have ZooKeeper running under systemctl, you should ideally just configure your server.properties and use the regular kafka-server-start script, and/or create a systemd unit file for Kafka.
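As a rough sketch only, a Kafka unit modelled on the ZooKeeper unit shown in the question might look like the following; the install paths under /var/www/confluent and the server.properties location are assumptions based on the question's layout:
# /etc/systemd/system/kafka.service
[Unit]
Description=Apache Kafka
Requires=zookeeper.service
After=zookeeper.service

[Service]
ExecStart=/var/www/confluent/bin/kafka-server-start /var/www/confluent/etc/kafka/server.properties
ExecStop=/var/www/confluent/bin/kafka-server-stop
Restart=on-failure

[Install]
WantedBy=multi-user.target
After creating the file, run sudo systemctl daemon-reload and then sudo systemctl start kafka.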

Run $ confluent log zookeeper to see the log for any errors.
There is a high chance ZooKeeper is already running and using port 2181.
Use $ sudo lsof -i :2181 to see which process is using that port, kill it, and try again, or
run $ sudo netstat -plten | grep java to see the Java processes and the ports they are listening on.
Run kill -9 <pid> to kill the process.

Related

Can't change kafka broker-id in Incubator Helm chart?

I have one Zookeeper server (say xx.xx.xx.xxx:2181) running on one GCP Compute Instance VM separately.
I have 3 GKE clusters, all in different regions, on which I am trying to install Kafka broker nodes so that all nodes connect to the one Zookeeper server (xx.xx.xx.xxx:2181).
I installed the Zookeeper server on the VM following this guide with zookeeper properties looking like below:
dataDir=/tmp/data
clientPort=2181
maxClientCnxns=0
initLimit=5
syncLimit=2
tickTime=2000
# list of servers
server.1=0.0.0.0:2888:3888
I am using this Incubator Helm Chart to deploy the brokers on GKE clusters.
As per the README.md I am trying to install with the below command:
helm repo add incubator http://storage.googleapis.com/kubernetes-charts-incubator
helm install --name my-kafka \
--set replicas=1,zookeeper.enabled=false,configurationOverrides."broker\.id"=1,configurationOverrides."zookeeper\.connect"="xx.xx.xx.xxx:2181" \
incubator/kafka
Error
When I deploy as described above on all three GKE clusters, only one of the brokers connects to the Zookeeper server; the other two pods just restart indefinitely.
When I check the Zookeeper log (on the VM), it looks something like below:
...
[2019-10-30 14:32:30,930] INFO Accepted socket connection from /xx.xx.xx.xxx:54978 (org.apache.zookeeper.server.NIOServerCnxnFactory)
[2019-10-30 14:32:30,936] INFO Client attempting to establish new session at /xx.xx.xx.xxx:54978 (org.apache.zookeeper.server.ZooKeeperServer)
[2019-10-30 14:32:30,938] INFO Established session 0x100009621af0057 with negotiated timeout 6000 for client /xx.xx.xx.xxx:54978 (org.apache.zookeeper.server.ZooKeeperServer)
[2019-10-30 14:32:32,335] INFO Got user-level KeeperException when processing sessionid:0x100009621af0057 type:create cxid:0xc zxid:0x422 txntype:-1 reqpath:n/a Error Path:/config/users Error:KeeperErrorCode = NodeExists for /config/users (org.apache.zookeeper.server.PrepRequestProcessor)
[2019-10-30 14:32:34,472] INFO Got user-level KeeperException when processing sessionid:0x100009621af0057 type:create cxid:0x14 zxid:0x424 txntype:-1 reqpath:n/a Error Path:/brokers/ids/0 Error:KeeperErrorCode = NodeExists for /brokers/ids/0 (org.apache.zookeeper.server.PrepRequestProcessor)
[2019-10-30 14:32:35,126] INFO Processed session termination for sessionid: 0x100009621af0057 (org.apache.zookeeper.server.PrepRequestProcessor)
[2019-10-30 14:32:35,127] INFO Closed socket connection for client /xx.xx.xx.xxx:54978 which had sessionid 0x100009621af0057 (org.apache.zookeeper.server.NIOServerCnxn)
[2019-10-30 14:36:49,123] INFO Expiring session 0x100009621af003b, timeout of 6000ms exceeded (org.apache.zookeeper.server.ZooKeeperServer)
...
I am sure I have created firewall rules to open the necessary ports, and that is not the problem, because one of the broker nodes is able to connect (whichever one reaches ZooKeeper first).
To me this looks like the broker IDs are not getting changed for some reason, and that is why ZooKeeper is rejecting the connections.
I say this because kubectl logs pod/my-kafka-n outputs something like below:
...
[2019-10-30 19:56:24,614] INFO [SocketServer brokerId=0] Shutdown completed (kafka.network.SocketServer)
...
[2019-10-30 19:56:24,627] INFO [KafkaServer id=0] shutting down (kafka.server.KafkaServer)
...
As we can see, the output above says brokerId=0 for all of the pods in all 3 clusters.
However, when I do kubectl exec -ti pod/my-kafka-n -- env | grep BROKER, I can see the environment variable KAFKA_BROKER_ID is set to 1, 2 and 3 for the different brokers, as I configured.
What am I doing wrong? What is the correct way to change the kafka-broker id or to make all brokers connect to one Zookeeper instance?
make all brokers connect to one Zookeeper instance?
Seems like you are doing that okay via the configurationOverrides option. That'll deploy all pods with the same configuration.
That being said, the broker ID should not be the same per pod. If you inspect the StatefulSet YAML, it appears that the broker ID is calculated from the POD_NAME variable (see the sketch below).
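For illustration, a hedged sketch of how a StatefulSet entrypoint typically derives the ID from the pod ordinal; the exact script in the chart may differ, and POD_NAME is assumed to be injected via the Kubernetes downward API:
# e.g. POD_NAME=my-kafka-2
export KAFKA_BROKER_ID=${POD_NAME##*-}   # strip everything up to the last '-', leaving the ordinal 2
echo "broker.id=${KAFKA_BROKER_ID}"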
Sidenote
3 GKE clusters all in different regions on which I am trying to install Kafka broker nodes so that all nodes connect to one Zookeeper server
It's not clear to me how you would be able to deploy to 3 separate clusters in one API call. But this architecture isn't recommended by the Kafka, ZooKeeper, or Kubernetes communities unless these regions are "geographically close".

Flink: HA mode killing leading jobmanager terminating standby jobmanagers

I am trying to get Flink to run in HA mode using Zookeeper, but when I try to test it by killing the leader JobManager all my standby jobmanagers get killed too.
So instead of a standby JobManager taking over as the new leader, they all get killed, which isn't supposed to happen.
My setup:
4 servers, 3 of those servers have Zookeeper running, but only 1 server will host all the JobManagers.
ad011.local: Zookeeper + Jobmanagers
ad012.local: Zookeeper + Taskmanager
ad013.local: Zookeeper
ad014.local: nothing interesting
My masters file looks like this:
ad011.local:8081
ad011.local:8082
ad011.local:8083
My flink-conf.yaml:
jobmanager.rpc.address: ad011.local
blob.server.port: 6130,6131,6132
jobmanager.heap.mb: 512
taskmanager.heap.mb: 128
taskmanager.numberOfTaskSlots: 4
parallelism.default: 2
taskmanager.tmp.dirs: /var/flink/data
metrics.reporters: jmx
metrics.reporter.jmx.class: org.apache.flink.metrics.jmx.JMXReporter
metrics.reporter.jmx.port: 8789,8790,8791
high-availability: zookeeper
high-availability.zookeeper.quorum: ad011.local:2181,ad012.local:2181,ad013.local:2181
high-availability.zookeeper.path.root: /flink
high-availability.zookeeper.path.cluster-id: /cluster-one
high-availability.storageDir: /var/flink/recovery
high-availability.jobmanager.port: 50000,50001,50002
When I run Flink using the start-cluster.sh script I see my 3 JobManagers running, and in the WebUI they all point to ad011.local:8081, which is the leader. Which is okay, I guess?
I then try to test the failover by killing the leader using kill, and then all my other standby JobManagers stop too.
This is what I see in my standby JobManager logs:
2017-09-29 08:08:41,590 INFO org.apache.flink.runtime.jobmanager.JobManager - Starting JobManager at akka.tcp://flink@ad011.local:50002/user/jobmanager.
2017-09-29 08:08:41,590 INFO org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService - Starting ZooKeeperLeaderElectionService org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService@72d546c8.
2017-09-29 08:08:41,598 INFO org.apache.flink.runtime.webmonitor.WebRuntimeMonitor - Starting with JobManager akka.tcp://flink@ad011.local:50002/user/jobmanager on port 8083
2017-09-29 08:08:41,598 INFO org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - Starting ZooKeeperLeaderRetrievalService.
2017-09-29 08:08:41,645 INFO org.apache.flink.runtime.webmonitor.JobManagerRetriever - New leader reachable under akka.tcp://flink@ad011.local:50000/user/jobmanager:f7dc2c48-dfa5-45a4-a63e-ff27be21363a.
2017-09-29 08:08:41,651 INFO org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - Starting ZooKeeperLeaderRetrievalService.
2017-09-29 08:08:41,722 INFO org.apache.flink.runtime.clusterframework.standalone.StandaloneResourceManager - Received leader address but not running in leader ActorSystem. Cancelling registration.
2017-09-29 09:26:13,472 WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://flink@ad011.local:50000] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
2017-09-29 09:26:14,274 INFO org.apache.flink.runtime.jobmanager.JobManager - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.
2017-09-29 09:26:14,284 INFO org.apache.flink.runtime.blob.BlobServer - Stopped BLOB server at 0.0.0.0:6132
Any help would be appreciated.
Solved it by running my cluster with ./bin/start-cluster.sh instead of using service files (which call the same script); apparently the service file kills the other JobManagers.

I can't run zookeeper

I am new to the Kafka world.
I want to start ZooKeeper, but when I type this:
bin/zookeeper-server-start.sh config/zookeeper.properties
I got the following error
ERROR Unexpected exception, exiting abnormally (org.apache.zookeeper.server.ZooKeeperServerMain)
java.net.BindException: Address already in use
ERROR Unexpected exception, exiting abnormally (org.apache.zookeeper.server.ZooKeeperServerMain)
java.net.BindException: Address already in use
Then I tried netstat -nlp | grep 2181, but no owning process is shown:
tcp 0 0 0.0.0.0:2181 0.0.0.0:* LISTEN -
Some light please
For this case, you need to see whether ZooKeeper is running or not.
Use the command below:
sudo lsof -i :2181
You will get output like the following:
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 1005 zookeeper 33u IPv6 17209 0t0 TCP *:2181 (LISTEN)
java 1005 zookeeper 34u IPv6 327225 0t0 TCP localhost:2181->localhost:43566 (ESTABLISHED)
java 22585 root 88u IPv6 324552 0t0 TCP localhost:43566->localhost:2181 (ESTABLISHED)
Now kill ZooKeeper so you can start it again:
sudo kill -9 1005
Then use the command below to start ZooKeeper:
bin/zookeeper-server-start.sh config/zookeeper.properties
Sounds like the ZooKeeper server is already running.
Try:
bin/zkServer.sh stop
from the ZooKeeper directory to shut it down, and then:
bin/zookeeper-server-start.sh config/zookeeper.properties
from the Kafka directory.
That fixed my issue.
There must be some stale process using port 2181. I had the same issue. First I checked the status of the server:
/usr/share/zookeeper$ bin/zkServer.sh status
or
/usr/share/zookeeper$ echo status | nc 127.0.0.1 2181
Then I tried to start Kafka and it failed with the same error. I changed permissions and ran it as sudo; that didn't work either.
Since I couldn't see any process using the port, I restarted my computer and it worked!
Check whether ZooKeeper is already running by using this command:
bin/kafka-topics.sh --list --zookeeper localhost:2181
If you get a list of topics back, that means ZooKeeper is already running.
If you are running the command bin/zookeeper-server-start.sh config/zookeeper.properties
and getting the error:
ERROR Unexpected exception, exiting abnormally (org.apache.zookeeper.server.ZooKeeperServerMain)
java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
then port 2181 on your machine is already in use by another ZooKeeper instance.
In that case, change the clientPort value in Kafka's config/zookeeper.properties to a port that is not in use, such as 5181.
Run the command again and ZooKeeper will start working.
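A minimal sketch of the change, assuming 5181 is free on your machine; note that zookeeper.connect in Kafka's config/server.properties must then point at the same port:
# config/zookeeper.properties
dataDir=/tmp/zookeeper
clientPort=5181
maxClientCnxns=0

# config/server.properties (the broker must be told about the new port)
zookeeper.connect=localhost:5181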
First, stop ZooKeeper with the command below:
$ bin/zookeeper-server-stop.sh config/zookeeper.properties
Then, start it again and you should be good to go:
$ bin/zookeeper-server-start.sh config/zookeeper.properties
Maybe another user is running the process. Check with jps whether a QuorumPeerMain process is running; if so, kill it and then try again.
I solved the problem with the following commands.
Go to the folder where you installed Kafka and run sudo bin/zookeeper-server-stop.sh, then:
bin/zookeeper-server-start.sh config/zookeeper.properties
I hope this helps. Best of luck!
Maybe you can stop your HBase first (stopping HBase also shuts down the ZooKeeper it manages, as the output below shows), like this:
[root@master kafka_2.11-0.10.1.0]# stop-hbase.sh
stopping hbase................
localhost: stopping zookeeper.
[root@master kafka_2.11-0.10.1.0]# jps
2903 ResourceManager
60745 Worker
2586 NameNode
2762 SecondaryNameNode
93996 Jps
60653 Master
[root@master kafka_2.11-0.10.1.0]# bin/zookeeper-server-start.sh config/zookeeper.properties
[2019-12-05 01:09:43,959] INFO Reading configuration from: config/zookeeper.properties (org.apache.zookeeper.server.quorum.QuorumPeerConfig)
[2019-12-05 01:09:43,965] INFO autopurge.snapRetainCount set to 3 (org.apache.zookeeper.server.DatadirCleanupManager)
[2019-12-05 01:09:43,965] INFO autopurge.purgeInterval set to 0 (org.apache.zookeeper.server.DatadirCleanupManager)
[2019-12-05 01:09:43,965] INFO Purge task is not scheduled. (org.apache.zookeeper.server.DatadirCleanupManager)
[2019-12-05 01:09:43,965] WARN Either no config or no quorum defined in config, running in standalone mode (org.apache.zookeeper.server.quorum.QuorumPeerMain)
[2019-12-05 01:09:44,013] INFO Reading configuration from: config/zookeeper.properties (org.apache.zookeeper.server.quorum.QuorumPeerConfig)
[2019-12-05 01:09:44,013] INFO Starting server (org.apache.zookeeper.server.ZooKeeperServerMain)
[2019-12-05 01:09:44,023] INFO Server environment:zookeeper.version=3.4.8--1, built on 02/06/2016 03:18 GMT (org.apache.zookeeper.server.ZooKeeperServer)
[2019-12-05 01:09:44,023] INFO Server environment:host.name=master (org.apache.zookeeper.server.ZooKeeperServer)
[2019-12-05 01:09:44,023] INFO Server environment:java.version=1.8.0_171 (org.apache.zookeeper.server.ZooKeeperServer)
[2019-12-05 01:09:44,023] INFO Server environment:java.vendor=Oracle Corporation (org.apache.zookeeper.server.ZooKeeperServer)
[2019-12-05 01:09:44,023] INFO Server environment:java.home=/usr/local/soft/jdk1.8.0_171/jre (org.apache.zookeeper.server.ZooKeeperServer)
[2019-12-05 01:09:44,023] INFO Server environment:java.class.path=:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/aopalliance-repackaged-2.4.0-b34.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/argparse4j-0.5.0.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/connect-api-0.10.1.0.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/connect-file-0.10.1.0.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/connect-json-0.10.1.0.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/connect-runtime-0.10.1.0.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/guava-18.0.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/hk2-api-2.4.0-b34.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/hk2-locator-2.4.0-b34.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/hk2-utils-2.4.0-b34.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/jackson-annotations-2.6.0.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/jackson-core-2.6.3.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/jackson-databind-2.6.3.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/jackson-jaxrs-base-2.6.3.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/jackson-jaxrs-json-provider-2.6.3.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/jackson-module-jaxb-annotations-2.6.3.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/javassist-3.18.2-GA.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/javax.annotation-api-1.2.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/javax.inject-1.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/javax.inject-2.4.0-b34.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/javax.servlet-api-3.1.0.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/javax.ws.rs-api-2.0.1.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/jersey-client-2.22.2.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/jersey-common-2.22.2.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/jersey-container-servlet-2.22.2.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/jersey-container-servlet-core-2.22.2.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/jersey-guava-2.22.2.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/jersey-media-jaxb-2.22.2.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/jersey-server-2.22.2.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/jetty-continuation-9.2.15.v20160210.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/jetty-http-9.2.15.v20160210.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/jetty-io-9.2.15.v20160210.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/jetty-security-9.2.15.v20160210.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/jetty-server-9.2.15.v20160210.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/jetty-servlet-9.2.15.v20160210.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/jetty-servlets-9.2.15.v20160210.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/jetty-util-9.2.15.v20160210.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/jopt-simple-4.9.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/kafka_2.11-0.10.1.0.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/kafka_2.11-0.10.1.0-sources.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/kafka_2.11-0.10.1.0-test-sources.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/kafka-clients-0.10.1.0.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/kafka-log4j-appender-0.10.1.0.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/kafka-streams-0.10.1.0.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/kafka-streams-examples-0.10.1.0.jar:/usr/local/soft/ka
fka_2.11-0.10.1.0/bin/../libs/kafka-tools-0.10.1.0.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/log4j-1.2.17.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/lz4-1.3.0.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/metrics-core-2.2.0.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/osgi-resource-locator-1.0.1.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/reflections-0.9.10.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/rocksdbjni-4.9.0.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/scala-library-2.11.8.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/scala-parser-combinators_2.11-1.0.4.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/slf4j-api-1.7.21.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/slf4j-log4j12-1.7.21.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/snappy-java-1.1.2.6.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/validation-api-1.1.0.Final.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/zkclient-0.9.jar:/usr/local/soft/kafka_2.11-0.10.1.0/bin/../libs/zookeeper-3.4.8.jar (org.apache.zookeeper.server.ZooKeeperServer)
[2019-12-05 01:09:44,024] INFO Server environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib (org.apache.zookeeper.server.ZooKeeperServer)
[2019-12-05 01:09:44,024] INFO Server environment:java.io.tmpdir=/tmp (org.apache.zookeeper.server.ZooKeeperServer)
[2019-12-05 01:09:44,024] INFO Server environment:java.compiler=<NA> (org.apache.zookeeper.server.ZooKeeperServer)
[2019-12-05 01:09:44,024] INFO Server environment:os.name=Linux (org.apache.zookeeper.server.ZooKeeperServer)
[2019-12-05 01:09:44,024] INFO Server environment:os.arch=amd64 (org.apache.zookeeper.server.ZooKeeperServer)
[2019-12-05 01:09:44,024] INFO Server environment:os.version=2.6.32-431.el6.x86_64 (org.apache.zookeeper.server.ZooKeeperServer)
[2019-12-05 01:09:44,024] INFO Server environment:user.name=root (org.apache.zookeeper.server.ZooKeeperServer)
[2019-12-05 01:09:44,024] INFO Server environment:user.home=/root (org.apache.zookeeper.server.ZooKeeperServer)
[2019-12-05 01:09:44,025] INFO Server environment:user.dir=/usr/local/soft/kafka_2.11-0.10.1.0 (org.apache.zookeeper.server.ZooKeeperServer)
[2019-12-05 01:09:44,035] INFO tickTime set to 3000 (org.apache.zookeeper.server.ZooKeeperServer)
[2019-12-05 01:09:44,035] INFO minSessionTimeout set to -1 (org.apache.zookeeper.server.ZooKeeperServer)
[2019-12-05 01:09:44,035] INFO maxSessionTimeout set to -1 (org.apache.zookeeper.server.ZooKeeperServer)
[2019-12-05 01:09:44,050] INFO binding to port 0.0.0.0/0.0.0.0:2181 (org.apache.zookeeper.server.NIOServerCnxnFactory)
I got the same problem. In my case the ZooKeeper server had crashed when my laptop crashed.
I solved it with the help of the link below:
How to recover Zookeeper from java.io.EOFException after a server crash?
I found the offending log file by opening the files in [ZooKeeper-data-dir]/zookeeper_0/version-2 one by one, and found a log file with no header or any other content. When I deleted it, my problem was solved and my ZooKeeper server started running normally.
It seems the ZooKeeper port 2181 is still in use. Please follow the steps below to address this issue:
Use the netstat command to find the process that is holding onto port 2181, then kill it:
$ netstat -antp | grep 2181
tcp 0 0 0.0.0.0:2181 0.0.0.0:*
LISTEN 28016/java <defunct>
$ kill -9 28016
Restarting the kafka process solves the issue.
If there is an existing ZooKeeper instance running, stop that one first. You can confirm it is running by listing topics:
bin/kafka-topics.sh --list --zookeeper localhost:2181
test
Many answers describe the process for Linux.
If you want to do the same on Windows, you can identify the process as below.
From PowerShell, find the process listening on port 2181:
PS C:\Users\<username>> netstat -ano | findstr 2181
TCP [::1]:2181 [::]:0 LISTENING 4564
Option #1: kill that process ID using taskkill:
PS C:\Users\vishnus> taskkill /F /PID 4564
In case this option doesn't work and fails with:
ERROR: The process with PID 4564 could not be terminated.
Reason: Access is denied.
Option #2
Go to "Task Manager" --> Services tab, check for PID column and go to that process and right click and mark it as stop. then it will get stopped.
By this time, the earlier instance is killed, so you can start the zookeeper
I had the same problem, debugged it step by step, and found the solution as follows:
Stop ZooKeeper:
/opt/kafka_2.13-2.7.0/bin/zookeeper-server-stop.sh
Then check whether the ZooKeeper process has indeed been stopped:
ps aux | grep zookeeper
If ZooKeeper has been stopped, no ZooKeeper process should be displayed. You can also check by port number or by Java process: if ZooKeeper uses the default port 2181, run sudo lsof -i :2181 to see which process is listening on it, or run sudo ps -fC java to list all Java processes and find the ZooKeeper one among them.
If ZooKeeper has not been stopped, run sudo kill -9 <zookeeper_process_id> to kill it.
After this, we can be sure ZooKeeper has indeed been stopped; if we run jps, QuorumPeerMain will no longer be shown.
After that, we can restart ZooKeeper:
/opt/kafka_2.13-2.7.0/bin/zookeeper-server-start.sh /opt/kafka_2.13-2.7.0/config/zookeeper.properties
For Windows users, the steps below will stop ZooKeeper:
Step 1: netstat -ano | findstr :<port>
NOTE: replace <port> with the port ZooKeeper is running on (2181 by default).
Step 2: The last column of the output is the PID; copy it and put it in the command below:
taskkill /PID <PID> /F

testing kafka consumer and producer failed on connection

I have been trying to test a Kafka installation and, using the guide, created a producer and a consumer. When trying to retrieve a message I get the following error:
WARN Session 0x0 for server null, unexpected error, closing socket connection and
attempting reconnect (org.apache.zookeeper.ClientCnxn)
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1146)
[2014-03-04 18:01:20,628] INFO Terminate ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)
[2014-03-04 18:01:21,315] INFO Opening socket connection to server kafka-test/192.xxxxxx.110:2182 (org.apache.zookeeper.ClientCnxn)
[2014-03-04 18:01:21,418] INFO Session: 0x0 closed (org.apache.zookeeper.ZooKeeper)
Exception in thread "main" org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 6000
at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:880)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:98)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:84)
at kafka.consumer.ZookeeperConsumerConnector.connectZk(ZookeeperConsumerConnector.scala:151)
at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:112)
at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:123)
at kafka.consumer.Consumer$.create(ConsumerConnector.scala:89)
at kafka.consumer.ConsoleConsumer$.main(ConsoleConsumer.scala:178)
at kafka.consumer.ConsoleConsumer.main(ConsoleConsumer.scala)
[2014-03-04 18:01:21,419] INFO EventThread shut down (org.apache.zookeeper.ClientCnxn)
Kafka
Looks like you're not connecting to Zookeeper correctly. I'm not sure of your setup (multi-machine, VMs, containers) so it's hard to say what's wrong. From the debug output I see the following line hinting at your expected Zookeeper IP:
[2014-03-04 18:01:21,315] INFO Opening socket connection to server kafka-test/192.xxxxxx.110:2182 (org.apache.zookeeper.ClientCnxn)
Kafka looks for Zookeeper at the address specified by the zookeeper.connect configuration property in the $KAFKA_HOME/config/server.properties file. Be sure to edit that before starting Kafka. Also, try giving the actual public IP of your Zookeeper instance, not just 127.0.0.1 as that solves a lot of confusion if you're running in containers. In your case it looks like it would be:
zookeeper.connect=192.xxxxxx.110:2182
Also relevant to the Kafka config: if you're running on AWS or operating in a container, don't forget to update the following two configuration properties to make sure clients that connect to Kafka see the correct public IP
advertised.host.name
advertised.port
and Kafka sees the correct internal IP
host.name
port
Zookeeper
Zookeeper has some gotchas when setting it up as well. On your Zookeeper instance, don't forget to edit the server configuration property in the zoo.cfg (usually in /etc/zookeeper/conf) file to point to the correct IP for your Zookeeper instance. In your case probably the following:
server.1=192.xxxxxx.110:2888:3888
Those last two ports (2888 and 3888) are only needed if you're running a ZooKeeper cluster (for followers to connect to the leader and for ZooKeeper leader election, respectively), so be sure to unblock them in your firewall rules if you have multiple ZooKeeper servers. A combined sketch of both files follows below.
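Putting the above together, a hedged sketch of the relevant settings; the IPs are the placeholders from the question, the broker port 9092 is the Kafka default, the zoo.cfg clientPort is assumed from the 2182 seen in the question's log, and the exact values depend on your network layout:
# $KAFKA_HOME/config/server.properties
zookeeper.connect=192.xxxxxx.110:2182
host.name=<internal broker IP>
port=9092
advertised.host.name=<public broker IP>
advertised.port=9092

# /etc/zookeeper/conf/zoo.cfg
clientPort=2182
server.1=192.xxxxxx.110:2888:3888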
Check your ZooKeeper connection with the telnet command:
telnet 192.xxxxxx.110 2181
You will probably get an error, in which case check that the process is running:
ps -ef | grep "zookeeper.properties"
If it's not running, start it by going into the Kafka home directory and running:
bin/zookeeper-server-start.sh config/zookeeper.properties &
Something is wrong with your ZooKeeper configuration. Make sure ZooKeeper is up and running. The default port it runs on is 2181.
A bit more info and some code would be useful, I believe.
I hit the same issue and the problem was the max client connections property in the ZooKeeper config.
If you see something like maxClientCnxns = 20 in the config file in /etc/zookeeper/conf, comment it out and restart ZooKeeper.
You may also check whether all the available connections have already been exhausted. If you are using an API to connect to ZK, make sure you free up the connection after you're done.
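A sketch of what that looks like in zoo.cfg, assuming the config lives in /etc/zookeeper/conf; 0 means unlimited connections, so raise or unset the limit with care in production:
# /etc/zookeeper/conf/zoo.cfg
# maxClientCnxns=20   <- comment out the low limit ...
maxClientCnxns=0      # ... or set 0 for unlimited
Then restart ZooKeeper, for example with sudo service zookeeper restart.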
I also hit this problem. When I shut down the firewall on the ZooKeeper node, it worked.

How do I use runit with zookeeper

How do I use runit with ZooKeeper? It's running, but I get these nasty logs...
Here is my run file
more /etc/sv/zookeeper/run
#!/bin/sh
exec 2>&1
exec /var/chef/cache/zookeeper-3.4.5/bin/zkServer.sh start >> /tmp/zookeeper.log 2>&1
Below is the tail of my log file:
tail -f zookeeper.log
Using config: /var/chef/cache/zookeeper-3.4.5/bin/../conf/zoo.cfg
Starting zookeeper ... already running as process 32701.
JMX enabled by default
Using config: /var/chef/cache/zookeeper-3.4.5/bin/../conf/zoo.cfg
Starting zookeeper ... already running as process 32701.
JMX enabled by default
Using config: /var/chef/cache/zookeeper-3.4.5/bin/../conf/zoo.cfg
Starting zookeeper ... already running as process 32701.
JMX enabled by default
Using config: /var/chef/cache/zookeeper-3.4.5/bin/../conf/zoo.cfg
Starting zookeeper ... already running as process 32701.
JMX enabled by default
Using config: /var/chef/cache/zookeeper-3.4.5/bin/../conf/zoo.cfg
Starting zookeeper ... already running as process 32701.
From the look of it, ZooKeeper is already running on your system, or there is a lock file which didn't get deleted due to an unclean shutdown. Check using zkServer.sh status to see whether it is running properly.
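A possible explanation for the repeating "already running" lines is that zkServer.sh start daemonizes and exits, so runit keeps re-invoking the run script. A hedged sketch of a run file that keeps ZooKeeper in the foreground so runit can supervise it directly (the path is taken from the question; start-foreground is available in the 3.4.x zkServer.sh):
#!/bin/sh
# /etc/sv/zookeeper/run
exec 2>&1
exec /var/chef/cache/zookeeper-3.4.5/bin/zkServer.sh start-foreground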