Zookeeper Refusing session request for client /0:0:0:0:0:0:0:1:59376 as it has seen zxid 0x124 our last zxid is 0x52 client must try another server - apache-zookeeper

Running ZooKeeper with Telegraf, I continue to get the following errors:
[2021-04-05 15:00:58,881] INFO Refusing session request for client /0:0:0:0:0:0:0:1:59376 as it has seen zxid 0x124 our last zxid is 0x52 client must try another server (org.apache.zookeeper.server.ZooKeeperServer)
[2021-04-05 15:01:00,346] INFO Refusing session request for client /0:0:0:0:0:0:0:1:59378 as it has seen zxid 0x124 our last zxid is 0x52 client must try another server (org.apache.zookeeper.server.ZooKeeperServer)
[2021-04-05 15:01:02,178] INFO Refusing session request for client /0:0:0:0:0:0:0:1:59380 as it has seen zxid 0x124 our last zxid is 0x52 client must try another server (org.apache.zookeeper.server.ZooKeeperServer)
[2021-04-05 15:01:03,362] INFO Refusing session request for client /0:0:0:0:0:0:0:1:59382 as it has seen zxid 0x124 our last zxid is 0x52 client must try another server (org.apache.zookeeper.server.ZooKeeperServer)
[2021-04-05 15:01:04,658] INFO Refusing session request for client /127.0.0.1:57084 as it has seen zxid 0x124 our last zxid is 0x52 client must try another server (org.apache.zookeeper.server.ZooKeeperServer)
[2021-04-05 15:01:06,329] INFO Refusing session request for client /127.0.0.1:57086 as it has seen zxid 0x124 our last zxid is 0x52 client must try another server (org.apache.zookeeper.server.ZooKeeperServer)
[2021-04-05 15:01:07,751] INFO Refusing session request for client /0:0:0:0:0:0:0:1:59388 as it has seen zxid 0x124 our last zxid is 0x52 client must try another server (org.apache.zookeeper.server.ZooKeeperServer)
[2021-04-05 15:01:09,174] INFO Refusing session request for client /127.0.0.1:57090 as it has seen zxid 0x124 our last zxid is 0x52 client must try another server (org.apache.zookeeper.server.ZooKeeperServer)
[2021-04-05 15:01:10,645] INFO Refusing session request for client /127.0.0.1:57092 as it has seen zxid 0x124 our last zxid is 0x52 client must try another server (org.apache.zookeeper.server.ZooKeeperServer)
[2021-04-05 15:01:12,682] INFO Refusing session request for client /127.0.0.1:57094 as it has seen zxid 0x124 our last zxid is 0x52 client must try another server (org.apache.zookeeper.server.ZooKeeperServer)
[2021-04-05 15:01:14,216] INFO Refusing session request for client /127.0.0.1:57096 as it has seen zxid 0x124 our last zxid is 0x52 client must try another server (org.apache.zookeeper.server.ZooKeeperServer)

I had the same issue. I ended up solving it by terminating all Kafka clients I had: not only containers but a UI as well. I was using Conduktor.
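Side note: when you see this refusal, it can help to compare the zxid in the message against what the server actually has. A quick sketch using ZooKeeper's four-letter-word interface (this assumes nc is available; on newer ZooKeeper versions srvr must also be allowed via 4lw.commands.whitelist):

echo srvr | nc localhost 2181

The Zxid line in the output should match the "our last zxid" value from the log. Clients reporting a much larger zxid are carrying session state from a previous incarnation of the ensemble, which is why restarting every client, UIs included, clears the error.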

Related

Zookeeper does not start normally (Established session 0x10000025c8a0001 with negotiated timeout 6000) and Kafka fails

I had previously run ZooKeeper and Kafka successfully many times, and I believe my installation and configuration are correct.
The only change I made was to the zookeeper config file:
dataDir=/Users/garynackenson/Downloads/kafka_2.12-2.0.0/data/zookeeper
which I have created the directory for.
Now when I run ZooKeeper, instead of getting the INFO binding to port 0.0.0.0/0.0.0.0:2181 message, I get the error below, and Kafka fails with a port 9092 in use error (I have restarted my machine and checked every way I know that port 9092 is not in use).
The last message from ZooKeeper is below, which does not look right:
INFO Established session 0x10000025c8a0001 with negotiated timeout 6000 for client /127.0.0.1:49977 (org.apache.zookeeper.server.ZooKeeperServer)
When ZooKeeper starts that way, Kafka fails with a 9092 in use error (see below). I restarted and checked that I am not using port 9092.
org.apache.kafka.common.KafkaException: Socket server failed to bind to 0.0.0.0:9092: Address already in use.
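As an aside, one quick way to confirm whether anything is really bound to 9092 (a sketch assuming a Linux/macOS box with lsof installed):

lsof -nP -iTCP:9092 -sTCP:LISTEN

If that prints a process, a previous Kafka instance is likely still running; bin/kafka-server-stop.sh should shut it down cleanly.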
A little while later, I saw that ZooKeeper had a different issue:
INFO Closed socket connection for client /0:0:0:0:0:0:0:1:49986 which had sessionid 0x100000b679d0000 (org.apache.zookeeper.server.NIOServerCnxn)
I ran ZooKeeper again and saw the more 'normal' binding to port 2181, four messages up:
[2018-10-03 18:25:08,064] INFO binding to port 0.0.0.0/0.0.0.0:2181 (org.apache.zookeeper.server.NIOServerCnxnFactory)
[2018-10-03 18:25:09,055] INFO Accepted socket connection from /127.0.0.1:50014 (org.apache.zookeeper.server.NIOServerCnxnFactory)
[2018-10-03 18:25:09,062] INFO Client attempting to renew session 0x10000025c8a0001 at /127.0.0.1:50014 (org.apache.zookeeper.server.ZooKeeperServer)
[2018-10-03 18:25:09,066] INFO Established session 0x10000025c8a0001 with negotiated timeout 6000 for client /127.0.0.1:50014 (org.apache.zookeeper.server.ZooKeeperServer)
But Kafka is still failing every time.
Also, sometimes I get the following message when I start ZooKeeper:
[2018-10-03 18:10:36,097] INFO Got user-level KeeperException when processing sessionid:0x10000025c8a0001 type:delete cxid:0x47 zxid:0x179 txntype:-1 reqpath:n/a Error Path:/admin/preferred_replica_election Error:KeeperErrorCode = NoNode for /admin/preferred_replica_election (org.apache.zookeeper.server.PrepRequestProcessor)

Kafka ZooKeeper connection drops continuously

I have set up a 3-node Kafka cluster and a 3-node ZooKeeper cluster on separate nodes. Using Kafka I can produce and consume messages successfully and run commands like kafka-topics.sh to get topic lists and their details from ZooKeeper, but there are some errors in the Kafka server.log file. The following warning appears continuously:
[2018-02-18 21:50:01,241] WARN Client session timed out, have not heard from server in 320190154ms for sessionid 0x161a94b101f0001 (org.apache.zookeeper.ClientCnxn)
[2018-02-18 21:50:01,242] INFO Client session timed out, have not heard from server in 320190154ms for sessionid 0x161a94b101f0001, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
[2018-02-18 21:50:01,343] INFO zookeeper state changed (Disconnected) (org.I0Itec.zkclient.ZkClient)
[2018-02-18 21:50:01,989] INFO Opening socket connection to server zookeeper3/192.168.1.206:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2018-02-18 21:50:02,008] INFO Socket connection established to zookeeper3/192.168.1.206:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2018-02-18 21:50:02,042] INFO Session establishment complete on server zookeeper3/192.168.1.206:2181, sessionid = 0x161a94b101f0001, negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn)
[2018-02-18 21:50:02,042] INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient)
[2018-02-18 21:59:31,570] INFO [Group Metadata Manager on Broker 102]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
It seems the Kafka sessions in ZooKeeper expire periodically!
The ZooKeeper logs contain the following warnings, too:
2018-02-18 18:20:06,149 [myid:1] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@368] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x161a94b101f0001, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:239)
at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:203)
at java.lang.Thread.run(Thread.java:748)
2018-02-18 18:20:06,151 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1044] - Closed socket connection for client /192.168.1.203:43162 which had sessionid 0x161a94b101f0001
2018-02-18 18:20:06,781 [myid:1] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@368] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x161a94b101f0002, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:239)
at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:203)
at java.lang.Thread.run(Thread.java:748)
2018-02-18 18:20:06,782 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1044] - Closed socket connection for client /192.168.1.201:45330 which had sessionid 0x161a94b101f0002
2018-02-18 18:37:29,127 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@192] - Accepted socket connection from /192.168.1.202:52480
2018-02-18 18:37:29,139 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@942] - Client attempting to establish new session at /192.168.1.202:52480
2018-02-18 18:37:29,143 [myid:1] - INFO [CommitProcessor:1:ZooKeeperServer@687] - Established session 0x161a94b101f0003 with negotiated timeout 30000 for client /192.168.1.202:52480
2018-02-18 18:37:29,432 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1044] - Closed socket connection for client /192.168.1.202:52480 which had sessionid 0x161a94b101f0003
I think it's because ZooKeeper can't get heartbeats from the Kafka nodes. The following is the ZooKeeper zoo.cfg:
tickTime=2000
dataDir=/var/zookeeper/
clientPort=2181
initLimit=5
syncLimit=2
server.1=zookeeper1:2888:3888
server.2=zookeeper2:2888:3888
server.3=zookeeper3:2888:3888
and the customized Kafka server.properties settings:
broker.id=1
listeners = PLAINTEXT://kafka1:9092
num.partitions=24
delete.topic.enable=true
default.replication.factor=2
log.dirs=/data/kafka/data
zookeeper.connect=zookeeper1:2181,zookeeper2:2181,zookeeper3:2181
log.retention.hours=168
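A note on the timeouts involved: the negotiated timeout = 6000 in the Kafka log comes from the broker's zookeeper.session.timeout.ms (6000 ms by default in this Kafka version), and ZooKeeper clamps the requested value to between 2×tickTime and 20×tickTime, i.e. 4000-40000 ms given tickTime=2000. If short stalls are expiring sessions, raising it is a common first step; a sketch of the relevant server.properties lines (values illustrative, not a recommendation):

# Requested ZooKeeper session timeout; the server clamps it to [2*tickTime, 20*tickTime]
zookeeper.session.timeout.ms=30000
# Max time to wait while establishing the ZooKeeper connection
zookeeper.connection.timeout.ms=30000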
I use the same ZooKeeper cluster for Hadoop HA without any problem. I think there is something wrong with the Kafka properties listeners and advertised.listeners; I read the Kafka documentation but couldn't understand their meaning.
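For reference, listeners is the socket address the broker binds locally, while advertised.listeners is the address the broker publishes in ZooKeeper for clients and other brokers to connect to. A sketch for kafka1 (assuming the kafka1 hostname resolves for every client):

# Bind on all interfaces of the host
listeners=PLAINTEXT://0.0.0.0:9092
# Address handed out to clients; must be reachable from their network
advertised.listeners=PLAINTEXT://kafka1:9092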
In the hosts file of all the machines, hostnames zookeeper1 through zookeeper3 and kafka1 through kafka3 are defined and reachable via the ping command. I removed the following lines from the hosts files:
127.0.0.1 localhost
127.0.1.1 hostname
but I don't think that could have caused the problem.
Kafka version: 0.11
Zookeeper version: 3.4.10
Can anyone help?
We were facing a similar issue with Kafka. As @Soheil pointed out, it was due to a major GC running.
When a major GC runs, Kafka would sometimes be unable to send heartbeats to ZooKeeper. For us the major GC was running almost once every 15 seconds. On taking a heap dump, we realized it was due to a metrics memory leak in Kafka.
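If you want to check for the same pattern, the broker's GC log is the first place to look; kafka-server-start.sh enables GC logging by default, and the heap size can be pinned through the KAFKA_HEAP_OPTS environment variable. A sketch (log path and heap sizes are assumptions for a typical tarball install):

# Look for frequent or long collections in the broker's GC log
grep -i "full gc" /opt/kafka/logs/kafkaServer-gc.log

# Pin the heap to a fixed size, then restart the broker
export KAFKA_HEAP_OPTS="-Xms4g -Xmx4g"
bin/kafka-server-start.sh -daemon config/server.properties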

Zookeeper refuses Kafka connection from an old client

I have a cluster configured with Kubernetes on GCE, with one pod for ZooKeeper and another for Kafka. It was working normally until ZooKeeper crashed and restarted, and it started refusing connections from the Kafka pod:
Refusing session request for client /10.4.4.58:52260 as it has seen
zxid 0x1962630
The complete refusal log is here:
2017-08-21 20:05:32,013 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@192] - Accepted socket connection from /10.4.4.58:52260
2017-08-21 20:05:32,013 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@882] - Connection request from old client /10.4.4.58:52260; will be dropped if server is in r-o mode
2017-08-21 20:05:32,013 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@901] - Refusing session request for client /10.4.4.58:52260 as it has seen zxid 0x1962630 our last zxid is 0xab client must try another server
2017-08-21 20:05:32,013 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1008] - Closed socket connection for client /10.4.4.58:52260 (no session established for client)
This is because Kafka maintains a ZooKeeper session that remembers the last zxid it has seen. When the ZooKeeper service goes down and comes back up with fresh state, its zxid starts again from a smaller value, so the ZooKeeper server concludes that Kafka has seen a bigger zxid than it has and refuses the session.
Try restarting Kafka.
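For a plain tarball install that would be something like:

bin/kafka-server-stop.sh
bin/kafka-server-start.sh -daemon config/server.properties

A restarted broker opens a brand-new session, so it no longer presents the stale, higher zxid to the rebuilt ZooKeeper.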
For the record, I had this problem and all my Kafka brokers were off.
But my kafka-manager was still up and listening on the ZooKeepers. Turning it off resolved the issue.
Related to the answer from @GuangshengZuo, the steps are:
Stop any residual ZooKeeper instances: zookeeper-server-stop.bat
Start a fresh ZooKeeper: zookeeper-server-start.bat .\config\zookeeper.properties
This will do it.

INFO Closed socket connection for client /127.0.0.1:48452 which had sessionid 0x15698f5ac360001 (org.apache.zookeeper.server.NIOServerCnxn)

I installed fresh ZooKeeper and Kafka and started them both. Then, when I want to see the list of topics with this command:
bin/kafka-topics.sh --list --zookeeper localhost:2181
it gives me the socket connection closed messages. Here is the console output:
darpanshah@darpan-ubuntu:/opt/Kafka$ bin/kafka-topics.sh --list --zookeeper localhost:2181
[2016-08-17 10:44:44,053] INFO Accepted socket connection from /127.0.0.1:48452 (org.apache.zookeeper.server.NIOServerCnxnFactory)
[2016-08-17 10:44:44,059] INFO Client attempting to establish new session at /127.0.0.1:48452 (org.apache.zookeeper.server.ZooKeeperServer)
[2016-08-17 10:44:44,069] INFO Established session 0x15698f5ac360001 with negotiated timeout 30000 for client /127.0.0.1:48452 (org.apache.zookeeper.server.ZooKeeperServer)
[2016-08-17 10:44:44,095] INFO Processed session termination for sessionid: 0x15698f5ac360001 (org.apache.zookeeper.server.PrepRequestProcessor)
[2016-08-17 10:44:44,105] INFO Closed socket connection for client /127.0.0.1:48452 which had sessionid 0x15698f5ac360001 (org.apache.zookeeper.server.NIOServerCnxn)
darpanshah@darpan-ubuntu:/opt/Kafka$
Thanks in advance. Got stuck.
That is not an error. The topic details are fetched from ZooKeeper, so the client (invoked by kafka-topics.sh) first connects to ZooKeeper, establishes a session, gets the data, and then disconnects at the end.
This is the expected behavior of any client that reads data from ZooKeeper.
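You can reproduce the same accept/session/close sequence with any one-off ZooKeeper read, for example with the shell shipped with Kafka (znode path assumed; /brokers/ids is where brokers register):

bin/zookeeper-shell.sh localhost:2181 ls /brokers/ids

ZooKeeper will log an accepted connection, an established session, and a clean close, exactly as in the output above.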

How to programmatically detect which server in the ensemble a client is connected to?

How to programmatically detect which server in a ZooKeeper ensemble a client is connected to?
I'm using the Apache Curator API and I am listening for state changes in connection by registering ConnectionStateListener. I would like to know which server in the ensemble a client is connected to when the client reconnects if the server it was connected to goes down.
You can see this in the logs produced by Curator. In the example output below, the CuratorFramework client has been given 4 different ZooKeeper instances in the connectionString that it can connect to. As can be seen in the log, it chooses the first:
21:13:45.384 [main] INFO org.apache.curator.framework.imps.CuratorFrameworkImpl - Starting
21:13:45.386 [main] INFO org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=127.0.0.1:2181,127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2184 sessionTimeout=60000 watcher=org.apache.curator.ConnectionState@2876f0c
21:13:45.388 [main-SendThread(127.0.0.1:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
21:13:45.388 [main-SendThread(127.0.0.1:2181)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established to 127.0.0.1/127.0.0.1:2181, initiating session
21:13:45.392 [main-SendThread(127.0.0.1:2181)] INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on server 127.0.0.1/127.0.0.1:2181, sessionid = 0x14aac461eb70004, negotiated timeout = 40000
In case the ZooKeeper server that the client has connected to crashes, you will also see the new server that the client connects to in the logs:
21:23:03.675 [main-SendThread(127.0.0.1:2182)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server 127.0.0.1/127.0.0.1:2182. Will not attempt to authenticate using SASL (unknown error)
21:23:03.677 [main-SendThread(127.0.0.1:2182)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established to 127.0.0.1/127.0.0.1:2182, initiating session
21:23:03.697 [main-SendThread(127.0.0.1:2182)] INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on server 127.0.0.1/127.0.0.1:2182, sessionid = 0x14aac461eb70004, negotiated timeout = 40000
21:23:03.697 [main-EventThread] INFO org.apache.curator.framework.state.ConnectionStateManager - State change: RECONNECTED
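If you need the server address programmatically rather than from the logs, one workaround is to pull the raw ZooKeeper handle out of Curator and inspect its toString(), which includes a remoteserver: field. A minimal sketch (connection string assumed; note that the toString() format is an implementation detail of the ZooKeeper client, not a stable API):

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.state.ConnectionState;
import org.apache.curator.framework.state.ConnectionStateListener;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class ConnectedServerLogger {
    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "127.0.0.1:2181,127.0.0.1:2182,127.0.0.1:2183",
                new ExponentialBackoffRetry(1000, 3));

        client.getConnectionStateListenable().addListener(new ConnectionStateListener() {
            @Override
            public void stateChanged(CuratorFramework c, ConnectionState newState) {
                if (newState == ConnectionState.CONNECTED || newState == ConnectionState.RECONNECTED) {
                    try {
                        // ZooKeeper#toString() contains "... remoteserver:<host:port> ..."
                        // naming the server this session is currently attached to.
                        String desc = c.getZookeeperClient().getZooKeeper().toString();
                        System.out.println(newState + " -> " + desc);
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            }
        });

        client.start();
        client.blockUntilConnected();
        Thread.sleep(Long.MAX_VALUE); // keep the process alive to observe reconnects
    }
}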