How to programmatically detect which server in a ZooKeeper ensemble a client is connected to?
I'm using the Apache Curator API and I am listening for state changes in connection by registering ConnectionStateListener. I would like to know which server in the ensemble a client is connected to when the client reconnects if the server it was connected to goes down.
You can see this in the logs produced by Curator. In the example output below, the CuratorFramework client has been given 4 different ZooKeeper instances in the connectionString that it can connect to. As can be seen in the log, it choses the first:
21:13:45.384 [main] INFO org.apache.curator.framework.imps.CuratorFrameworkImpl - Starting
21:13:45.386 [main] INFO org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=127.0.0.1:2181,127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184 sessionTimeout=60000 watcher=org.apache.curator.ConnectionState#2876f0c
21:13:45.388 [main-SendThread(127.0.0.1:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
21:13:45.388 [main-SendThread(127.0.0.1:2181)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established to 127.0.0.1/127.0.0.1:2181, initiating session
21:13:45.392 [main-SendThread(127.0.0.1:2181)] INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on server 127.0.0.1/127.0.0.1:2181, sessionid = 0x14aac461eb70004, negotiated timeout = 40000
In case the ZooKeeper server that the client has connected to crashes, you will also see the new server that the client connects to in the logs:
21:23:03.675 [main-SendThread(127.0.0.1:2182)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server 127.0.0.1/127.0.0.1:2182. Will not attempt to authenticate using SASL (unknown error)
21:23:03.677 [main-SendThread(127.0.0.1:2182)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established to 127.0.0.1/127.0.0.1:2182, initiating session
21:23:03.697 [main-SendThread(127.0.0.1:2182)] INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on server 127.0.0.1/127.0.0.1:2182, sessionid = 0x14aac461eb70004, negotiated timeout = 40000
21:23:03.697 [main-EventThread] INFO org.apache.curator.framework.state.ConnectionStateManager - State change: RECONNECTED
Related
My Kafka version:
/opt/kafka/bin/kafka-topics.sh --version
2.4.1 (Commit:c57222ae8cd7866b)
My Kafka cluster configuration looks like:
6 nodes Kafka cluster
6 x Zookeeper i.e. is installed on each node/broker
2 DC's, there are 3 nodes in each DC
rack-awareness feature is enabled on each node:
node1 DC1:
broker.id=1
broker.rack=dc1
node2 DC1:
broker.id=2
broker.rack=dc1
node3 DC1:
broker.id=3
broker.rack=dc1
node1 DC2:
broker.id=4
broker.rack=dc2
node2 DC2:
broker.id=5
broker.rack=dc2
node3 DC2:
broker.id=6
broker.rack=dc2
When the whole DC2 become down the kafka cluster stopped and node1 from DC1 show errors like this:
[2022-03-16 07:38:45,422] INFO Unable to read additional data from server sessionid 0x40000004f930002, likely server has closed socket, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
[2022-03-16 07:38:45,549] INFO Unable to read additional data from server sessionid 0x200ab15af610000, likely server has closed socket, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
[2022-03-16 07:38:45,787] INFO Client successfully logged in. (org.apache.zookeeper.Login)
[2022-03-16 07:38:45,787] INFO Client will use DIGEST-MD5 as SASL mechanism. (org.apache.zookeeper.client.ZooKeeperSaslClient)
[2022-03-16 07:38:45,787] INFO Opening socket connection to server dc2kafkabr2/A.B.C.72:2181. Will attempt to SASL-authenticate using Login Context section 'Client' (org.apache.zookeeper.ClientCnxn)
[2022-03-16 07:38:45,788] INFO Socket error occurred: dc2kafkabr2/A.B.C.72:2181: Connection refused (org.apache.zookeeper.ClientCnxn)
[2022-03-16 07:38:46,503] INFO Client successfully logged in. (org.apache.zookeeper.Login)
[2022-03-16 07:38:46,503] INFO Client will use DIGEST-MD5 as SASL mechanism. (org.apache.zookeeper.client.ZooKeeperSaslClient)
[2022-03-16 07:38:46,503] INFO Opening socket connection to server dc1kafkabr1/A.B.C.68:2181. Will attempt to SASL-authenticate using Login Context section 'Client' (org.apache.zookeeper.ClientCnxn)
[2022-03-16 07:38:46,504] INFO Socket connection established, initiating session, client: /A.B.C.68:35796, server: dc1kafkabr1/A.B.C.68:2181 (org.apache.zookeeper.ClientCnxn)
[2022-03-16 07:38:46,505] INFO Unable to read additional data from server sessionid 0x40000004f930002, likely server has closed socket, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
[2022-03-16 07:38:46,616] INFO Client successfully logged in. (org.apache.zookeeper.Login)
[2022-03-16 07:38:46,617] INFO Client will use DIGEST-MD5 as SASL mechanism. (org.apache.zookeeper.client.ZooKeeperSaslClient)
[2022-03-16 07:38:46,617] INFO Opening socket connection to server dc1kafkabr2/A.B.C.69:2181. Will attempt to SASL-authenticate using Login Context section 'Client' (org.apache.zookeeper.ClientCnxn)
[2022-03-16 07:38:46,617] INFO Socket connection established, initiating session, client: /A.B.C.68:38936, server: dc1kafkabr2/A.B.C.69:2181 (org.apache.zookeeper.ClientCnxn)
[2022-03-16 07:38:46,619] INFO Unable to read additional data from server sessionid 0x200ab15af610000, likely server has closed socket, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
[2022-03-16 07:38:46,896] INFO Client successfully logged in. (org.apache.zookeeper.Login)
However when the Kafka nodes will be stopped normally/humanly in DC2 by systemctl command then Kafka cluster works properly on the nodes in DC1.
The question is why if DC2 is turned off, the Kafka cluster stops working? How to prevent of it? Any idea?
Best Regards,
Dan
Dears,
After next tests I know that problem is by side of Zookeeper because when I trun off two brokers in DC2 the Kafka cluster still works. After turn off kafka.service on the last broker in DC2 the Kafka cluster still works. But when I turn off zookeeper.service on the last broker in DC2 the cluster becomes unresponsive.
This is my zookeeper's configuration:
cat zookeeper.properties
tickTime=2000
dataDir=/opt/zookeeper/data
#dataLogDir=/var/log/zookeeper
clientPort=2181
initLimit=5
syncLimit=3
############## HARDENING #################
authProvider.1=org.apache.zookeeper.server.auth.SASLAuthenticationProvider
requireClientAuthScheme=sasl
###########################################
server.1=A.B.C.68:2888:3888
server.2=A.B.C.69:2888:3888
server.3=A.B.C.70:2888:3888
server.4=A.B.C.71:2888:3888
server.5=A.B.C.72:2888:3888
server.6=A.B.C.73:2888:3888
Any idea what is wrong in this configuration?
Best Regards,
Dan
Zookeeper quorum is not ensure and this is reason.
I have setup Kafka 3-node cluster and Zookeeper 3-node cluster, on separate nodes. Using Kafka I can produce and consume messages successfully and run commands like kafka-topic.sh to get topic lists and their informations from Zookeeper, but there are some errors on Kafka server.log file. The following warning appears continuously:
[2018-02-18 21:50:01,241] WARN Client session timed out, have not heard from server in 320190154ms for sessionid 0x161a94b101f0001 (org.apache.zookeeper.ClientCnxn)
[2018-02-18 21:50:01,242] INFO Client session timed out, have not heard from server in 320190154ms for sessionid 0x161a94b101f0001, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
[2018-02-18 21:50:01,343] INFO zookeeper state changed (Disconnected) (org.I0Itec.zkclient.ZkClient)
[2018-02-18 21:50:01,989] INFO Opening socket connection to server zookeeper3/192.168.1.206:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2018-02-18 21:50:02,008] INFO Socket connection established to zookeeper3/192.168.1.206:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2018-02-18 21:50:02,042] INFO Session establishment complete on server zookeeper3/192.168.1.206:2181, sessionid = 0x161a94b101f0001, negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn)
[2018-02-18 21:50:02,042] INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient)
[2018-02-18 21:59:31,570] INFO [Group Metadata Manager on Broker 102]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
It seems the Kafka sessions in zookeeper expires periodically!
In Zookeeper logs are the following warninngs, too:
2018-02-18 18:20:06,149 [myid:1] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn#368] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x161a94b101f0001, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:239)
at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:203)
at java.lang.Thread.run(Thread.java:748)
2018-02-18 18:20:06,151 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn#1044] - Closed socket connection for client /192.168.1.203:43162 which had sessionid 0x161a94b101f0001
2018-02-18 18:20:06,781 [myid:1] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn#368] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x161a94b101f0002, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:239)
at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:203)
at java.lang.Thread.run(Thread.java:748)
2018-02-18 18:20:06,782 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn#1044] - Closed socket connection for client /192.168.1.201:45330 which had sessionid 0x161a94b101f0002
2018-02-18 18:37:29,127 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory#192] - Accepted socket connection from /192.168.1.202:52480
2018-02-18 18:37:29,139 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer#942] - Client attempting to establish new session at /192.168.1.202:52480
2018-02-18 18:37:29,143 [myid:1] - INFO [CommitProcessor:1:ZooKeeperServer#687] - Established session 0x161a94b101f0003 with negotiated timeout 30000 for client /192.168.1.202:52480
2018-02-18 18:37:29,432 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn#1044] - Closed socket connection for client /192.168.1.202:52480 which had sessionid 0x161a94b101f0003
I think it's because zookeeper can't get heartbeat from Kafka nodes. The followings are Zookeeper zoo.cfg:
tickTime=2000
dataDir=/var/zookeeper/
clientPort=2181
initLimit=5
syncLimit=2
server.1=zookeeper1:2888:3888
server.2=zookeeper2:2888:3888
server.3=zookeeper3:2888:3888
and Kafka server.properties customized setting:
broker.id=1
listeners = PLAINTEXT://kafka1:9092
num.partitions=24
delete.topic.enable=true
default.replication.factor=2
log.dirs=/data/kafka/data
zookeeper.connect=zookeeper1:2181,zookeeper2:2181,zookeeper3:2181
log.retention.hours=168
I use the same zookeeper cluster for Hadoop HA without any problem. I think there is something wrong with the Kafka properties listeners and advertised.listeners. I read the Kafka documentation but couldn't understand their meaning.
In the host file of all OSes, hostnames such that zookeeper1 to zookeeper3 and kafka1 to kafka3 are defined and reachable through ping command. I removed the following lines from hosts:
127.0.0.1 localhost
127.0.1.1 hostname
I think it couldn't cause the problem.
Kafka version: 0.11
Zookeeper version: 3.4.10
Can anyone help?
We were facing a similar issue with Kafka. As #Soheil pointed out it was due to a Major GC running.
When a Major GC runs, then Kafka would sometimes not be able to send heartbeat to zookeeper. For us the Major GC was running almost once every 15 sec. On taking a heap dump, we realized it was due to a Metric Memory Leak in Kafka.
I am running a 5 node kafka broker with a 5 node zookeeper ensemble.
On Kafka nodes I keep getting this error
[2016-11-14 19:05:11,345] INFO Opening socket connection to server 10.105.23.188/10.105.23.188:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2016-11-14 19:05:11,345] INFO Socket connection established to 10.105.23.188/10.105.23.188:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2016-11-14 19:05:11,346] INFO Unable to read additional data from server sessionid 0x55861d8ea6f0000, likely server has closed socket, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
[2016-11-14 19:05:11,721] INFO Opening socket connection to server 10.105.25.4/10.105.25.4:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2016-11-14 19:05:11,721] INFO Socket connection established to 10.105.25.4/10.105.25.4:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2016-11-14 19:05:11,722] INFO Unable to read additional data from server sessionid 0x55861d8ea6f0000, likely server has closed socket, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
[2016-11-14 19:05:11,841] INFO Opening socket connection to server 10.105.24.4/10.105.24.4:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2016-11-14 19:05:11,841] WARN Session 0x55861d8ea6f0000 for server null, unexpected error, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
[2016-11-14 19:05:12,319] INFO Opening socket connection to server 10.105.27.23/10.105.27.23:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2016-11-14 19:05:12,319] INFO Socket connection established to 10.105.27.23/10.105.27.23:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2016-11-14 19:05:12,320] INFO Unable to read additional data from server sessionid 0x55861d8ea6f0000, likely server has closed socket, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
[2016-11-14 19:05:12,395] WARN [ReplicaFetcherThread-0-5], Error in fetch kafka.server.ReplicaFetcherThread$FetchRequest#3fbe518f. Possible cause: org.apache.kafka.common.protocol.types.SchemaException: Error reading field 'responses': Error reading array of size 1204063, only 920 bytes available (kafka.server.ReplicaFetcherThread)
On Zookeeper nodes :
I get the following on the leader
2016-11-14 19:14:47,502 [myid:4] - ERROR [LearnerHandler-/10.105.25.4:50099:LearnerHandler#631] - Unexpected exception causing shutdown while sock still open
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:99)
at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:513)
When a consumer tries to read from this cluster it get this error repeatedly
2016-11-14 19:17:36,847 WARN {main} [kafka.clients.NetworkClient$DefaultMetadataUpdater:handleResponse:629] Error while fetching metadata with correlation id 1603 : {=UNKNOWN_TOPIC_OR_PARTITION}
Please help me debug this. The consumers have auto commit enabled. If you need me to provide more configuration please let me know.
Thanks!
I'm running Kafka (0.10.0.0) in a Docker on a Mac (w/docker-machine). I derived my Dockerfile from Spotify's, which means Kafka and Zookeeper run in the same image.
My instance starts cleanly and poking around inside it appears everything is normal/OK.
Docker maps ports 2181 and 9092 to high-ports 32822 and 32820 in this case. From outside my running Kafka Docker I am able to successfully telnet 192.168.99.100 32822 (where 192.168.99.100 is the IP of my docker-machine). From there I can issue a zookeeper command and get expected output.
It all seems so encouraging, but... I then try this code:
val numPartitions = 4
val replicationFactor = 1
val topicConfig = new java.util.Properties
// zookeeper = "192.168.99.100:32822"
val zkClient = ZkUtils(zookeeper, 10000, 10000, false)
try {
AdminUtils.createTopic(zkClient, topic, numPartitions, replicationFactor, topicConfig)
} catch {
case k: kafka.common.TopicExistsException => // do nothing...topic exists
}
zkClient.close()
This produces this error output:
DEBUG ZkConnection - Creating new ZookKeeper instance to connect to 192.168.99.100:32822.
INFO ZkEventThread - Starting ZkClient event thread.
INFO ZooKeeper - Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
INFO ZooKeeper - Client environment:host.name=172.25.42.82
INFO ZooKeeper - Client environment:java.version=1.8.0_60
INFO ZooKeeper - Client environment:java.vendor=Oracle Corporation
INFO ZooKeeper - Client environment:java.home=/Library/Java/JavaVirtualMachines/jdk1.8.0_60.jdk/Contents/Home/jre
INFO ZooKeeper - Client environment:java.class.path=/usr/local/Cellar/sbt/0.13.11/libexec/sbt-launch.jar
INFO ZooKeeper - Client environment:java.library.path=/Users/wmy965/Library/Java/Extensions:/Library/Java/Extensions:/Network/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java:.
INFO ZooKeeper - Client environment:java.io.tmpdir=/var/folders/ph/ccz4n1qs62n0bn8mqdg94gswt1jlwk/T/
INFO ZooKeeper - Client environment:java.compiler=<NA>
INFO ZooKeeper - Client environment:os.name=Mac OS X
INFO ZooKeeper - Client environment:os.arch=x86_64
INFO ZooKeeper - Client environment:os.version=10.11.5
INFO ZooKeeper - Client environment:user.name=wmy965
INFO ZooKeeper - Client environment:user.home=/Users/wmy965
INFO ZooKeeper - Client environment:user.dir=/Users/wmy965/git/LateKafka
INFO ZooKeeper - Initiating client connection, connectString=192.168.99.100:32822 sessionTimeout=10000 watcher=org.I0Itec.zkclient.ZkClient#55397e3
DEBUG ClientCnxn - zookeeper.disableAutoWatchReset is false
DEBUG ZkClient - Awaiting connection to Zookeeper server
INFO ZkClient - Waiting for keeper state SyncConnected
INFO ClientCnxn - Opening socket connection to server 192.168.99.100/192.168.99.100:32822. Will not attempt to authenticate using SASL (unknown error)
WARN ClientCnxn - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
DEBUG ClientCnxnSocketNIO - Ignoring exception during shutdown input
java.nio.channels.ClosedChannelException
at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:780)
at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:399)
at org.apache.zookeeper.ClientCnxnSocketNIO.cleanup(ClientCnxnSocketNIO.java:200)
at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1185)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1110)
DEBUG ClientCnxnSocketNIO - Ignoring exception during shutdown output
java.nio.channels.ClosedChannelException
at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:797)
at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:407)
at org.apache.zookeeper.ClientCnxnSocketNIO.cleanup(ClientCnxnSocketNIO.java:207)
at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1185)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1110)
INFO ClientCnxn - Opening socket connection to server 192.168.99.100/192.168.99.100:32822. Will not attempt to authenticate using SASL (unknown error)
INFO ClientCnxn - Socket connection established to 192.168.99.100/192.168.99.100:32822, initiating session
DEBUG ClientCnxn - Session establishment request sent on 192.168.99.100/192.168.99.100:32822
INFO ClientCnxn - Session establishment complete on server 192.168.99.100/192.168.99.100:32822, sessionid = 0x155225c51720000, negotiated timeout = 10000
DEBUG ZkClient - Received event: WatchedEvent state:SyncConnected type:None path:null
INFO ZkClient - zookeeper state changed (SyncConnected)
DEBUG ZkClient - Leaving process event
DEBUG ZkClient - State is SyncConnected
DEBUG ClientCnxn - Reading reply sessionid:0x155225c51720000, packet:: clientPath:null serverPath:null finished:false header:: 1,8 replyHeader:: 1,1,-101 request:: '/brokers/ids,F response:: v{}
It looks like I can't connect (presumably to zookeeper). Why not?
In new kafka streams, the ip of producer must have been knowing by kafka (docker). Kafka send their uuid (you can show this in /etc/hosts inside kafka docker) and espect response from this.
Summary:
Map uuid kafka docker to docker-machine in /etc/host of mac OS.
To help you, how to change etc/host file in mac:
https://www.tekrevue.com/tip/edit-hosts-file-mac-os-x/
Cleaner would be to set advertised.listeners=host-ip:port since advertised.host.name and advertised.port are deprecated in Kafka server.properties file.
If set host-ip to 0.0.0.0 it will listen requests from anywhere. But it's insecure.
I am running 3 instances of ZooKeeper and the config is this:
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/tmp/zookeeper1
clientPort=2181
maxClientCnxns=1000
server.1=127.0.0.1:2888:3888
server.2=127.0.0.1:2889:3889
server.3=127.0.0.1:2890:3890
I am using the leader election example code given here:
https://git-wip-us.apache.org/repos/asf?p=curator.git;a=tree;f=curator-examples/src/main/java/leader;h=73b547eadb98995c0ccbd06a5b76d0741ffef263;hb=HEAD
The code runs fine with TestingServer but when I change connection string to : "127.0.0.1:2181,127.0.0.1:2182,127.0.0.1:2183", I get the exceptions:
[main-SendThread(127.0.0.1:2183)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server 127.0.0.1/127.0.0.1:2183. Will not attempt to authenticate using SASL (unknown error)
[main-SendThread(127.0.0.1:2183)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established, initiating session, client: /127.0.0.1:56111, server: 127.0.0.1/127.0.0.1:2183
[main-SendThread(127.0.0.1:2183)] INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on server 127.0.0.1/127.0.0.1:2183, sessionid = 0x3521552283c0000, negotiated timeout = 40000
[main-EventThread] INFO org.apache.curator.framework.state.ConnectionStateManager - State change: CONNECTED
[main-SendThread(127.0.0.1:2183)] INFO org.apache.zookeeper.ClientCnxn - Unable to read additional data from server sessionid 0x3521552283c0000, likely server has closed socket, closing socket connection and attempting reconnect
[main-EventThread] INFO org.apache.curator.framework.imps.EnsembleTracker - New config event received: null
[main-EventThread] ERROR org.apache.curator.framework.imps.CuratorFrameworkImpl - Background exception was not retry-able or retry gave up
java.lang.NullPointerException
at java.io.ByteArrayInputStream.<init>(ByteArrayInputStream.java:106)
at org.apache.curator.framework.imps.EnsembleTracker.processConfigData(EnsembleTracker.java:163)
at org.apache.curator.framework.imps.EnsembleTracker.access$200(EnsembleTracker.java:48)
at org.apache.curator.framework.imps.EnsembleTracker$2.processResult(EnsembleTracker.java:134)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.sendToBackgroundCallback(CuratorFrameworkImpl.java:829)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.processBackgroundOperation(CuratorFrameworkImpl.java:611)
at org.apache.curator.framework.imps.WatcherRemovalFacade.processBackgroundOperation(WatcherRemovalFacade.java:151)
at org.apache.curator.framework.imps.GetConfigBuilderImpl$2.processResult(GetConfigBuilderImpl.java:210)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:619)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:528)
[main-EventThread] INFO org.apache.curator.framework.state.ConnectionStateManager - State change: SUSPENDED
What could be the issue?
I am hitting the same issue. I think it might be related to the Zookeeper 3.5.1 ClientCnxn. Even though I return back to curator 2.6.0, I still see the same stack trace. A GET_CONFIG event type is sent without the event data.
My stack trace looks like this:
org.apache.curator.framework.imps.CuratorFrameworkImpl: Background exception was not retry-able or retry gave up
! java.lang.NullPointerException: null
! at java.io.ByteArrayInputStream.(ByteArrayInputStream.java:106)
! at org.apache.curator.framework.imps.EnsembleTracker.processConfigData(EnsembleTracker.java:163)
! at org.apache.curator.framework.imps.EnsembleTracker.access$200(EnsembleTracker.java:48)
! at org.apache.curator.framework.imps.EnsembleTracker$2.processResult(EnsembleTracker.java:134)
! at org.apache.curator.framework.imps.CuratorFrameworkImpl.sendToBackgroundCallback(CuratorFrameworkImpl.java:829)
! at org.apache.curator.framework.imps.CuratorFrameworkImpl.processBackgroundOperation(CuratorFrameworkImpl.java:611)
! at org.apache.curator.framework.imps.WatcherRemovalFacade.processBackgroundOperation(WatcherRemovalFacade.java:151)
! at org.apache.curator.framework.imps.GetConfigBuilderImpl$2.processResult(GetConfigBuilderImpl.java:210)
! at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:619)
! at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:528)
If use Zookeeper 3.5.1, then curator-recipes 3.2.1+ fix this issue.