testing kafka consumer and producer failed on connection - apache-zookeeper

I have been trying to test a kafka installation and using the guide created a producer and consumer. When trying to retrieve a message I get the following error:
WARN Session 0x0 for server null, unexpected error, closing socket connection and
attempting reconnect (org.apache.zookeeper.ClientCnxn)
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1146)
[2014-03-04 18:01:20,628] INFO Terminate ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)
[2014-03-04 18:01:21,315] INFO Opening socket connection to server kafka-test/192.xxxxxx.110:2182 (org.apache.zookeeper.ClientCnxn)
[2014-03-04 18:01:21,418] INFO Session: 0x0 closed (org.apache.zookeeper.ZooKeeper)
Exception in thread "main" org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 6000
at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:880)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:98)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:84)
at kafka.consumer.ZookeeperConsumerConnector.connectZk(ZookeeperConsumerConnector.scala:151)
at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:112)
at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:123)
at kafka.consumer.Consumer$.create(ConsumerConnector.scala:89)
at kafka.consumer.ConsoleConsumer$.main(ConsoleConsumer.scala:178)
at kafka.consumer.ConsoleConsumer.main(ConsoleConsumer.scala)
[2014-03-04 18:01:21,419] INFO EventThread shut down (org.apache.zookeeper.ClientCnxn)

Kafka
Looks like you're not connecting to Zookeeper correctly. I'm not sure of your setup (multi-machine, VMs, containers) so it's hard to say what's wrong. From the debug output I see the following line hinting at your expected Zookeeper IP:
[2014-03-04 18:01:21,315] INFO Opening socket connection to server kafka-test/192.xxxxxx.110:2182 (org.apache.zookeeper.ClientCnxn)
Kafka looks for Zookeeper at the address specified by the zookeeper.connect configuration property in the $KAFKA_HOME/config/server.properties file. Be sure to edit that before starting Kafka. Also, try giving the actual public IP of your Zookeeper instance, not just 127.0.0.1 as that solves a lot of confusion if you're running in containers. In your case it looks like it would be:
zookeeper.connect=192.xxxxxx.110:2182
Also relevant to the Kafka config if you're running on AWS or operating in a container, don't forget to update the following two configuration properties to make sure clients who connect to Kafka see the correct public IP
advertised.host.name
advertised.port
and Kafka sees the correct internal IP
host.name
port
Zookeeper
Zookeeper has some gotchas when setting it up as well. On your Zookeeper instance, don't forget to edit the server configuration property in the zoo.cfg (usually in /etc/zookeeper/conf) file to point to the correct IP for your Zookeeper instance. In your case probably the following:
server.1=192.xxxxxx.110:2888:3888
Those last two ports (2888 3888) are only needed if you're running a Zookeeper cluster (for followers to connect to the leader and Zookeeper leader election, respectively, so be sure to unblock them on firewallish things if you have multiple Zookeeper servers).

Check your zookeeper connection with telnet command:
telnet 192.xxxxxx.110 2181
You probably get an error, in which case check that the process is running:
ps -ef | grep "zookeeper.properties"
If it's not running, start it by going into kafka home directory:
bin/zookeeper-server-start.sh config/zookeeper.properties &

Something wrong with your Zookeper configuration. Make sure your zookeeper is up and running. The default port it runs on is 2181
Bit more info and some code could be useful I believe.

I hit the same issue and the problem was the max client connections property in zookeeper config.
if you see something like "maxClientCnxns = 20" in the config file in /etc/zookeeper/conf, comment it out and restart zookeeper.

You may also check if the all the connections available have already been exhausted. If you are using an API to connect to ZK, make sure you free up the connection after you're done.

I also meet the problem. When I shutdown the firewall of the zk node, it will work.

Related

Why is Zookeeper not re-electing new leader in Apache Nifi Cluster?

Following is my architecture
2 Servers:
Server 1: running Apache Nifi + Zookeeper (Not embedded)
Server 2: running Apache Nifi + Zookeeper (Not embedded)
To test failovers, I close down the Server that has been selected as Cluster Coordinator
In this case, zookeeper should automatically elect the remaining one server as leader. But it keeps failing and goes into continuous loop of trying to connect to the first server
Zookeeper Logs in Server 2 when leader (Server 1) went down:
2019-10-22 18:44:01,135 [myid:2] - WARN [NIOWorkerThread-2:NIOServerCnxn#370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-10-22 18:44:02,925 [myid:2] - WARN [NIOWorkerThread-3:NIOServerCnxn#370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-10-22 18:44:03,320 [myid:2] - WARN [QuorumPeer[myid=2](plain=/0:0:0:0:0:0:0:0:2181)(secure=disabled):QuorumCnxManager#677] -
Cannot open channel to 1 at election address ec2-server-1.compute-1.amazonaws.com/172.xx.x.x:3888
java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
Server 2 Config files:
zoo.cfg
tickTime=2000
initLimit=5
syncLimit=2
dataDir=/home/ec2-user/zookeeper
clientPort=2181
server.1=ec2-server-1.compute-1.amazonaws.com:2888:3888
server.2=0.0.0.0:2888:3888
nifi.properties
nifi.cluster.is.node=true
nifi.cluster.node.address=ec2-server-2.compute-1.amazonaws.com
nifi.cluster.node.protocol.port=8082
nifi.cluster.flow.election.max.wait.time=2 mins
nifi.cluster.flow.election.max.candidates=1
# zookeeper properties, used for cluster management #
nifi.zookeeper.connect.string=localhost:2181
nifi.zookeeper.root.node=/nifi
Server 1 Config files:
zoo.cfg
tickTime=2000
initLimit=5
syncLimit=2
dataDir=/home/ec2-user/zookeeper
clientPort=2181
server.1=0.0.0.0:2888:3888
server.2=ec2-server-2.compute-1.amazonaws.com:2888:3888
nifi.properties
nifi.cluster.is.node=true
nifi.cluster.node.address=ec2-server-1.compute-1.amazonaws.com
nifi.cluster.node.protocol.port=8082
nifi.cluster.flow.election.max.wait.time=2 mins
nifi.cluster.flow.election.max.candidates=1
# zookeeper properties, used for cluster management #
nifi.zookeeper.connect.string=localhost:2181
nifi.zookeeper.root.node=/nifi
What am I doing wrong?
You need at least three nodes to be able to handle the failure of one node.
Check the Admin guide:
Clustered (Multi-Server) Setup For reliable ZooKeeper service, you
should deploy ZooKeeper in a cluster known as an ensemble. As long as
a majority of the ensemble are up, the service will be available.
Because Zookeeper requires a majority, it is best to use an odd number
of machines. For example, with four machines ZooKeeper can only handle
the failure of a single machine; if two machines fail, the remaining
two machines do not constitute a majority. However, with five machines
ZooKeeper can handle the failure of two machines.
A simpler explanation here also

Zookeeper does not start normally (Established session 0x10000025c8a0001 with negotiated timeout 6000 )and kafka fails

I had previously run zookeeper and kafka successfully many times, and I believe my installation and configurations are correct.
The only change I made was to the zookeeper config file:
dataDir=/Users/garynackenson/Downloads/kafka_2.12-2.0.0/data/zookeeper
which I have created the directory for.
Now when I run zookeeper instead of getting info binding to port 0.0.0.0/0.0.0.0:2181
I get the error below, and kafka fails with a port 9092 in use error (i have restarted my machine and checked every way i know to see that port 9092 is not in use
the last message from zoopkeeper below, which does not look right
INFO Established session 0x10000025c8a0001 with negotiated timeout 6000 for client /127.0.0.1:49977 (org.apache.zookeeper.server.ZooKeeperServer)
When zookeeper starts that way kafka fails with a 9092 in use error (see below) – I restarted and checked that I am not using port 9092.
org.apache.kafka.common.KafkaException: Socket server failed to bind to 0.0.0.0:9092: Address already in use.
A little while later , I saw that zookeeper had a different issue:
INFO Closed socket connection for client /0:0:0:0:0:0:0:1:49986 which had sessionid 0x100000b679d0000 (org.apache.zookeeper.server.NIOServerCnxn)
I ran zookeeper again and saw the more ‘normal’ binding to 2081 4 messages up
[2018-10-03 18:25:08,064] INFO binding to port 0.0.0.0/0.0.0.0:2181 (org.apache.zookeeper.server.NIOServerCnxnFactory)
[2018-10-03 18:25:09,055] INFO Accepted socket connection from /127.0.0.1:50014 (org.apache.zookeeper.server.NIOServerCnxnFactory)
[2018-10-03 18:25:09,062] INFO Client attempting to renew session 0x10000025c8a0001 at /127.0.0.1:50014 (org.apache.zookeeper.server.ZooKeeperServer)
[2018-10-03 18:25:09,066] INFO Established session 0x10000025c8a0001 with negotiated timeout 6000 for client /127.0.0.1:50014 (org.apache.zookeeper.server.ZooKeeperServer)
but kafka is still failing every time
also sometimes i get the following message when i start zookeeper
[2018-10-03 18:10:36,097] INFO Got user-level KeeperException when processing sessionid:0x10000025c8a0001 type:delete cxid:0x47 zxid:0x179 txntype:-1 reqpath:n/a Error Path:/admin/preferred_replica_election Error:KeeperErrorCode = NoNode for /admin/preferred_replica_election (org.apache.zookeeper.server.PrepRequestProcessor)

Cannot start Zookeeper due to: Exception causing close of session 0x0 due to java.io.EOFException

I am trying to start up Zookeeper via the CLI with the command:
bin/zookeeper-server-start.sh ../config/zookeeper.properties
And it hums along for a second with what all seems to be correct until it says this:
INFO binding to port 0.0.0.0/0.0.0.0:2181 (org.apache.zookeeper.server.NIOServerCnxnFactory)
and then the below loops indefinitely until I exit:
[2018-08-10 15:07:48,223] INFO Accepted socket connection from /172.31.39.32:46374 (org.apache.zookeeper.server.NIOServerCnxnFactory)
[2018-08-10 15:07:48,228] WARN Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running (org.apache.zookeeper.server.NIOServerCnxn)
[2018-08-10 15:07:48,228] INFO Closed socket connection for client /172.31.39.32:46374 (no session established for client) (org.apache.zookeeper.server.NIOServerCnxn)
This is a single server and I believe a single node test server, so there isn't a quorum or other pieces running. My zookeeper config is basic, it only contains this:
dataDir=/tmp/zookeeper
clientPort=2181
maxClientCnxns=0
The weird thing is, my zookeeper had been running fine, and I had made NO changes to the config. Pulled it down to try to fix something else to do a quick restart on the zookeeper, and it won't budge. I've checked, and nothing else is running on port 2181.
I see this question has been asked several times with no answers, any ideas?
This might be happening because of some corruption in zookeeper data. You should not set dataDir to /tmp/*. If your computer purges some data of /tmp, it will be difficult for zookeeper to restore the state upon restart. If you check the zookeeper logs, you should see some kind of exception there.
Since you mentioned this zookeeper instance is for test purpose only. You should set
dataDir to anything but /tmp and try restart.

Zookeeper refuses Kafka connection from an old client

I have a cluster configuration using Kubernetes on GCE, I have a pod for zookeeper and other for Kafka; it was working normally until Zookeeper get crashed and restarted, and it start refusing connections from the kafka pod:
Refusing session request for client /10.4.4.58:52260 as it has seen
zxid 0x1962630
The complete refusal log is here:
2017-08-21 20:05:32,013 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory#192] - Accepted socket connection from /10.4.4.58:52260
2017-08-21 20:05:32,013 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer#882] - Connection request from old client /10.4.4.58:52260; will be dropped if server is in r-o mode
2017-08-21 20:05:32,013 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer#901] - Refusing session request for client /10.4.4.58:52260 as it has seen zxid 0x1962630 our last zxid is 0xab client must try another server
2017-08-21 20:05:32,013 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn#1008] - Closed socket connection for client /10.4.4.58:52260 (no session established for client)
Because the kafka maintain a zookeeper session which remember the last zxid it has seen. So when the zookeeper sevice go down and come again, the zk's zxid begin from a smaller value. and ZKserver think the kafka has seen a bigger zxid, so it refuse it.
Have a try to restart the kafka.
For the record, I had this problem and all my kafka were off.
But, my kafka-manager was still up and listening on zookeepers. Turning it off resolved the issue.
Related to the answer from #GuangshengZuo.... Steps
Stop any residual zookeeper instances - zookeeper-server-stop.bat
Start a fresh zookeeper- zookeeper-server-start.bat .\config\zookeeper.properties
This will do

kafka cant connect to zookeeper- FATAL Fatal error during KafkaServerStable startup

Well..every service in the world can connect to my zookeeper expect kafka. Below is my connection string in server.properties file
zk.connect=1.dzk.syd.druid.neo.com:2181, 2.dzk.syd.druid.neo.com:2181
Have have all ports on the two zookeeper servers ....total promiscuous mode. I can even telnet into the zookeeper server from the kafka server..
telnet 2.dzk.syd.druid.neo.com 2181
Trying 54.252.183.218...
Connected to 2.dzk.syd.druid.neo.com.
Escape character is '^]'.
So....rather confused on why kafka will not connect to zookeeper?
I am using ubuntu 12.04 and kafka 0.7.2
[2013-07-16 04:36:49,915] INFO Client environment:user.home=/root (org.apache.zookeeper.ZooKeeper)
[2013-07-16 04:36:49,915] INFO Client environment:user.dir=/etc/sv/kafka (org.apache.zookeeper.ZooKeeper)
[2013-07-16 04:36:49,916] INFO Initiating client connection, connectString=1.dzk.syd.druid.neo.com:2181, 2.dzk.syd.druid.neo.com:2181 sessionTimeout=6000 watcher=org.I0Itec.zkclient.ZkClient#39cc65b1 (org.apache.zookeeper.ZooKeeper)
[2013-07-16 04:36:49,935] INFO Terminate ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)
[2013-07-16 04:36:49,938] FATAL Fatal error during KafkaServerStable startup. Prepare to shutdown (kafka.server.KafkaServerStartable)
org.I0Itec.zkclient.exception.ZkException: Unable to connect to 1.dzk.syd.druid.neo.com:2181, 2.dzk.syd.druid.neo.com:2181
at org.I0Itec.zkclient.ZkConnection.connect(ZkConnection.java:66)
at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:872)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:98)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:84)
at kafka.server.KafkaZooKeeper.startup(KafkaZooKeeper.scala:44)
at kafka.log.LogManager.<init>(LogManager.scala:93)
at kafka.server.KafkaServer.startup(KafkaServer.scala:58)
at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:34)
at kafka.Kafka$.main(Kafka.scala:47)
at kafka.Kafka.main(Kafka.scala)
Caused by: java.net.UnknownHostException: 2.dzk.syd.druid.neo.com: Name or service not known
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:894)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1286)
at java.net.InetAddress.getAllByName0(InetAddress.java:1239)
at java.net.InetAddress.getAllByName(InetAddress.java:1155)
at java.net.InetAddress.getAllByName(InetAddress.java:1091)
at org.apache.zookeeper.ClientCnxn.<init>(ClientCnxn.java:387)
at org.apache.zookeeper.ClientCnxn.<init>(ClientCnxn.java:332)
at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:383)
at org.I0Itec.zkclient.ZkConnection.connect(ZkConnection.java:64)
... 9 more
[2013-07-16 04:36:49,942] INFO Shutting down Kafka server (kafka.server.KafkaServer)
[2013-07-16 04:36:49,943] INFO shutdown scheduler kafka-logcleaner- (kafka.utils.KafkaScheduler)
[2013-07-16 04:36:49,944] INFO Kafka server shut down completed (kafka.server.KafkaServer)
In your kafka/config/server.properties, there should be a property
#host.name=localhost
if you have uncommented this, or set this to another name, then that name should be in the /etc/hosts file
It's been a while since this has been answered, but in case it could help someone here is how i fixed it :
Actually i am using an Ansible playbook to install Kafka cluster and the params generated in zookeeper.properties file were not correctly ordered :
server.1=0.0.0.0:2888:3888
server.2=kafka-4:2888:3888
server.3=kafka-5:2888:3888
server.4=kafka-3:2888:3888
server.5=kafka-2:2888:3888
Putting them in the right order,
server.1=0.0.0.0:2888:3888
server.2=kafka-2:2888:3888
server.3=kafka-3:2888:3888
server.4=kafka-4:2888:3888
server.5=kafka-5:2888:3888
Then restart Kafka service, fixed it.
Change this in zookeeper.properties
maxClientCnxns=0 to maxClientCnxns=1