Cannot start Zookeeper due to: Exception causing close of session 0x0 due to java.io.EOFException - apache-kafka

I am trying to start up Zookeeper via the CLI with the command:
bin/zookeeper-server-start.sh ../config/zookeeper.properties
And it hums along for a second with what all seems to be correct until it says this:
INFO binding to port 0.0.0.0/0.0.0.0:2181 (org.apache.zookeeper.server.NIOServerCnxnFactory)
and then the below loops indefinitely until I exit:
[2018-08-10 15:07:48,223] INFO Accepted socket connection from /172.31.39.32:46374 (org.apache.zookeeper.server.NIOServerCnxnFactory)
[2018-08-10 15:07:48,228] WARN Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running (org.apache.zookeeper.server.NIOServerCnxn)
[2018-08-10 15:07:48,228] INFO Closed socket connection for client /172.31.39.32:46374 (no session established for client) (org.apache.zookeeper.server.NIOServerCnxn)
This is a single server and I believe a single node test server, so there isn't a quorum or other pieces running. My zookeeper config is basic, it only contains this:
dataDir=/tmp/zookeeper
clientPort=2181
maxClientCnxns=0
The weird thing is, my ZooKeeper had been running fine, and I had made NO changes to the config. I stopped it for a quick restart while trying to fix something else, and now it won't come back up. I've checked, and nothing else is running on port 2181.
I see this question has been asked several times with no answers, any ideas?

This might be happening because of corruption in the ZooKeeper data. You should not set dataDir to anything under /tmp. If your system purges files from /tmp, ZooKeeper may not be able to restore its state on restart. If you check the ZooKeeper logs, you should see some kind of exception there.
Since you mentioned this ZooKeeper instance is for test purposes only, set dataDir to any location outside /tmp and try restarting.
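As a rough illustration of the advice above, the following Python sketch parses a zookeeper.properties-style file and flags a dataDir that lives under /tmp. The function names (parse_properties, data_dir_is_volatile) and the list of "volatile" roots are assumptions made for this example; they are not part of ZooKeeper or Kafka.

```python
from pathlib import PurePosixPath


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def data_dir_is_volatile(props, volatile_roots=("/tmp",)):
    """Return True if dataDir is one of volatile_roots or sits below one.

    volatile_roots is a hypothetical default chosen for this sketch.
    """
    data_dir = PurePosixPath(props.get("dataDir", ""))
    for root in volatile_roots:
        root_path = PurePosixPath(root)
        if data_dir == root_path or root_path in data_dir.parents:
            return True
    return False


# The config quoted in the question.
config_text = "dataDir=/tmp/zookeeper\nclientPort=2181\nmaxClientCnxns=0\n"
if data_dir_is_volatile(parse_properties(config_text)):
    print("dataDir is under /tmp; consider moving it to a persistent location")
```

This only checks the path. It cannot tell whether the data directory is actually corrupt, so the ZooKeeper logs remain the place to confirm that.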

Related

Kafka is failing to start. Getting the below error

ERROR Error while creating ephemeral at /brokers/ids/0, node already exists and owner '72067757872119809' does not match current session '72067836689711106' (kafka.zk.KafkaZkClient$CheckedEphemeral)
2021-05-05 02:19:44.796 [INF] [Kafka] [2021-05-05 02:19:44,786] ERROR [KafkaServer id=0] Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
The error is saying that a broker is already running with id=0, or that ZooKeeper still holds stale state because a broker did not previously shut down cleanly.
In the latter case, you can attempt to use zookeeper-shell to rmr /brokers/ids/0. However, this might have more unintended consequences than restarting ZooKeeper as well as the brokers.
The only solution that works here is to restart the broker and then restart the server.
Restarting ZooKeeper and the broker fixed the issue for me.
If you are using docker-compose, you can restart simply with:
docker-compose restart

Zookeeper error: Exception causing close of session 0x0 due to java.io.IOException: Len error

We have well-configured ZooKeeper and Kafka cluster nodes. A manual test that creates a topic and sends a message on that topic passed successfully. But when I run a test from a piece of test equipment in order to create a topic over the MQTT protocol, I receive:
Exception causing close of session 0x0 due to java.io.IOException: Len error 271056900
[myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn#1008] - Closed socket connection for client /192.18.0.1:15659 (no session established for client).
Can someone give me some hint on how to solve this issue?
Looks like you are exceeding your jute.maxbuffer. Try increasing it.
If you are using docker-compose, this helped me:
environment:
  KAFKA_OPTS: -Djute.maxbuffer=500000000
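Before raising jute.maxbuffer, it may be worth looking at the number in the Len error itself. ZooKeeper reads the first four bytes of a request as a big-endian length, so when something that does not speak the ZooKeeper protocol connects to the client port, those bytes show up as an implausible "length". The sketch below (the helper name len_error_to_bytes is made up for this example) turns the value from the question back into its four bytes.

```python
import struct


def len_error_to_bytes(length):
    """Return the four big-endian bytes that were interpreted as a length."""
    return struct.pack(">i", length)


raw = len_error_to_bytes(271056900)
print(raw.hex(" "))  # prints: 10 28 00 04
```

The bytes 10 28 00 04 match the start of an MQTT CONNECT packet: 0x10 is the CONNECT packet type, 0x28 is a remaining length of 40, and 00 04 is the length prefix of the protocol name "MQTT". This is only an inference from the logged number, but it suggests the test equipment may be sending MQTT traffic straight to port 2181. If so, pointing the MQTT client at an MQTT broker or bridge instead would be the fix, and a larger jute.maxbuffer would not help.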

Confluent schema registry fails on start with NoSuchMethodError

Exception in thread "main" java.lang.NoSuchMethodError: io.confluent.rest.Application.parseListeners(Ljava/util/List;ILjava/util/List;Ljava/lang/String;)Ljava/util/List;
at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.getPortForIdentity(KafkaSchemaRegistry.java:204)
at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.<init>(KafkaSchemaRegistry.java:133)
etc/schema-registry/schema-registry.properties
listeners=http://0.0.0.0:8081
kafkastore.connection.url=localhost:2181
kafkastore.topic=_schemas
debug=false
kafka and zookeeper are already running.
Why do logs like the following keep coming from ZooKeeper?
[2017-10-17 09:57:31,352] INFO Accepted socket connection from /13.**.**.***:39572 (org.apache.zookeeper.server.NIOServerCnxnFactory)
[2017-10-17 09:57:31,352] WARN Exception causing close of session 0x0 due to java.io.EOFException (org.apache.zookeeper.server.NIOServerCnxn)
[2017-10-17 09:57:31,352] INFO Closed socket connection for client /13.58.108.150:39572 (no session established for client) (org.apache.zookeeper.server.NIOServerCnxn)
[2017-10-17 09:57:31,438] INFO Accepted socket connection from /13.**.**.***:39574 (org.apache.zookeeper.server.NIOServerCnxnFactory)
[2017-10-17 09:57:31,438] WARN Exception causing close of session 0x0 due to java.io.EOFException (org.apache.zookeeper.server.NIOServerCnxn)
[2017-10-17 09:57:31,438] INFO Closed socket connection for client /13.**.***.**:39574 (no session established for client) (org.apache.zookeeper.server.NIOServerCnxn)
I was wondering whether this might be the cause of the schema-registry failure.
Any suggestions?
NoSuchMethodError indicates your CLASSPATH is misconfigured.
It's not clear what version you're running or what OS you're using, but Windows is not officially supported. Later versions of Confluent Platform have likely fixed this, and using the Docker images should work as well.
In my situation the problem was caused by the hostname. Check whether the hostname is equal to "localhost".
Problem: "Schema registry fails on start"
Test solution: set the hostname to "localhost"
If this solves your problem, you can configure your hostname permanently:
modify the file /etc/hostname

Zookeeper refuses Kafka connection from an old client

I have a cluster configuration using Kubernetes on GCE, with one pod for ZooKeeper and another for Kafka. It was working normally until ZooKeeper crashed and restarted, and then it started refusing connections from the Kafka pod:
Refusing session request for client /10.4.4.58:52260 as it has seen zxid 0x1962630
The complete refusal log is here:
2017-08-21 20:05:32,013 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory#192] - Accepted socket connection from /10.4.4.58:52260
2017-08-21 20:05:32,013 [myid:] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer#882] - Connection request from old client /10.4.4.58:52260; will be dropped if server is in r-o mode
2017-08-21 20:05:32,013 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer#901] - Refusing session request for client /10.4.4.58:52260 as it has seen zxid 0x1962630 our last zxid is 0xab client must try another server
2017-08-21 20:05:32,013 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn#1008] - Closed socket connection for client /10.4.4.58:52260 (no session established for client)
This happens because Kafka maintains a ZooKeeper session that remembers the last zxid it has seen. When the ZooKeeper service goes down and comes back up, its zxid starts again from a smaller value. The ZooKeeper server then sees that Kafka has already seen a larger zxid, so it refuses the session.
Try restarting Kafka.
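To make the comparison concrete, here is a simplified Python model of the check implied by the log line "as it has seen zxid 0x1962630 our last zxid is 0xab". The function names are invented for this sketch and this is not ZooKeeper's actual code. It only assumes that a zxid is a 64-bit number whose high 32 bits are the epoch and whose low 32 bits are a counter, and that the server refuses a client that has seen a newer zxid than its own.

```python
def split_zxid(zxid):
    """Split a 64-bit zxid into (epoch, counter)."""
    return zxid >> 32, zxid & 0xFFFFFFFF


def should_refuse_session(client_last_zxid, server_last_zxid):
    """Simplified model: refuse a client that is ahead of the server."""
    return client_last_zxid > server_last_zxid


# Values taken from the refusal log in the question.
client_zxid = 0x1962630
server_zxid = 0xAB
print(split_zxid(client_zxid), split_zxid(server_zxid))
print(should_refuse_session(client_zxid, server_zxid))
```

With the values from the log the client is far ahead of the server, which is consistent with ZooKeeper having come back with an empty or older data directory. Restarting the client makes it start a new session that no longer carries the old zxid.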
For the record, I had this problem even though all my Kafka brokers were off.
However, my kafka-manager was still up and connected to ZooKeeper. Turning it off resolved the issue.
Related to the answer from GuangshengZuo. Steps:
Stop any residual ZooKeeper instances: zookeeper-server-stop.bat
Start a fresh ZooKeeper: zookeeper-server-start.bat .\config\zookeeper.properties
That should do it.

testing kafka consumer and producer failed on connection

I have been trying to test a Kafka installation and, following the guide, created a producer and a consumer. When trying to retrieve a message I get the following error:
WARN Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1146)
[2014-03-04 18:01:20,628] INFO Terminate ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)
[2014-03-04 18:01:21,315] INFO Opening socket connection to server kafka-test/192.xxxxxx.110:2182 (org.apache.zookeeper.ClientCnxn)
[2014-03-04 18:01:21,418] INFO Session: 0x0 closed (org.apache.zookeeper.ZooKeeper)
Exception in thread "main" org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 6000
at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:880)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:98)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:84)
at kafka.consumer.ZookeeperConsumerConnector.connectZk(ZookeeperConsumerConnector.scala:151)
at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:112)
at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:123)
at kafka.consumer.Consumer$.create(ConsumerConnector.scala:89)
at kafka.consumer.ConsoleConsumer$.main(ConsoleConsumer.scala:178)
at kafka.consumer.ConsoleConsumer.main(ConsoleConsumer.scala)
[2014-03-04 18:01:21,419] INFO EventThread shut down (org.apache.zookeeper.ClientCnxn)
Kafka
Looks like you're not connecting to Zookeeper correctly. I'm not sure of your setup (multi-machine, VMs, containers) so it's hard to say what's wrong. From the debug output I see the following line hinting at your expected Zookeeper IP:
[2014-03-04 18:01:21,315] INFO Opening socket connection to server kafka-test/192.xxxxxx.110:2182 (org.apache.zookeeper.ClientCnxn)
Kafka looks for Zookeeper at the address specified by the zookeeper.connect configuration property in the $KAFKA_HOME/config/server.properties file. Be sure to edit that before starting Kafka. Also, try giving the actual public IP of your Zookeeper instance, not just 127.0.0.1 as that solves a lot of confusion if you're running in containers. In your case it looks like it would be:
zookeeper.connect=192.xxxxxx.110:2182
Also relevant to the Kafka config if you're running on AWS or operating in a container, don't forget to update the following two configuration properties to make sure clients who connect to Kafka see the correct public IP
advertised.host.name
advertised.port
and Kafka sees the correct internal IP
host.name
port
Zookeeper
Zookeeper has some gotchas when setting it up as well. On your Zookeeper instance, don't forget to edit the server configuration property in the zoo.cfg (usually in /etc/zookeeper/conf) file to point to the correct IP for your Zookeeper instance. In your case probably the following:
server.1=192.xxxxxx.110:2888:3888
Those last two ports (2888 and 3888) are only needed if you're running a ZooKeeper cluster: 2888 is for followers to connect to the leader, and 3888 is for leader election. If you have multiple ZooKeeper servers, be sure to unblock both ports on any firewalls between them.
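For reference, the layout of a server.N entry can be sketched in Python as follows. This handles only the classic host:quorumPort:electionPort form used in this answer. The function name parse_server_entry and the hostname zk1.example.internal are placeholders invented for the example.

```python
def parse_server_entry(line):
    """Parse 'server.<id>=<host>:<quorum_port>:<election_port>'.

    Only the three-field form is handled; any other layout raises ValueError.
    """
    key, sep, value = line.strip().partition("=")
    if not sep or not key.startswith("server."):
        raise ValueError(f"not a server entry: {line!r}")
    server_id = int(key.split(".", 1)[1])
    host, quorum_port, election_port = value.split(":")
    return {
        "id": server_id,
        "host": host,
        "quorum_port": int(quorum_port),      # followers connect to the leader
        "election_port": int(election_port),  # leader election
    }


# Hypothetical hostname, used only for illustration.
print(parse_server_entry("server.1=zk1.example.internal:2888:3888"))
```

A parser like this can help double-check which ports need to be open between ZooKeeper servers. A single standalone instance needs only the client port.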
Check your zookeeper connection with telnet command:
telnet 192.xxxxxx.110 2181
You probably get an error, in which case check that the process is running:
ps -ef | grep "zookeeper.properties"
If it's not running, start it by going into kafka home directory:
bin/zookeeper-server-start.sh config/zookeeper.properties &
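If telnet is not installed, the same reachability check can be approximated with a few lines of Python. This is a minimal sketch: port_is_open is a name chosen for this example, and it only tests whether a TCP connection can be opened. It does not verify that the listener is a healthy ZooKeeper server.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

For example, port_is_open("localhost", 2181) run on the ZooKeeper machine shows whether anything is listening on the default client port. Running the same check from the Kafka machine against the ZooKeeper address helps separate a stopped process from a firewall or addressing problem.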
Something is wrong with your ZooKeeper configuration. Make sure ZooKeeper is up and running. The default port it runs on is 2181.
A bit more info and some code would be useful, I believe.
I hit the same issue, and the problem was the max client connections property in the ZooKeeper config.
If you see something like "maxClientCnxns = 20" in the config file in /etc/zookeeper/conf, comment it out and restart ZooKeeper.
You may also want to check whether all the available connections have already been exhausted. If you are using an API to connect to ZK, make sure you free up the connection after you're done.
I also ran into this problem. After I shut down the firewall on the zk node, it worked.