kafka 0.10.1.1 stops responding sporadically - apache-kafka

We use Kafka 0.10.1.1. It runs fine for a few hours, sometimes a few days, and then all of a sudden it starts throwing the exception below and the brokers lose their connection to each other. The ZooKeeper and Kafka server processes are still running, but they do not accept any connections.
We run Kafka and ZooKeeper on the same nodes, in a 2-node cluster setup. This is just our dev environment.
The exception below was observed in the server log.
[2017-03-23 14:05:52,729] WARN [ReplicaFetcherThread-0-38], Error in fetch kafka.server.ReplicaFetcherThread$FetchRequest#1893e027 (kafka.server.ReplicaFetcherThread)
java.io.IOException: Connection to 2 was disconnected before the response was read
at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1$$anonfun$apply$1.apply(NetworkClientBlockingOps.scala:115)
at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1$$anonfun$apply$1.apply(NetworkClientBlockingOps.scala:112)
at scala.Option.foreach(Option.scala:257)
at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1.apply(NetworkClientBlockingOps.scala:112)
at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1.apply(NetworkClientBlockingOps.scala:108)
at kafka.utils.NetworkClientBlockingOps$.recursivePoll$1(NetworkClientBlockingOps.scala:137)
at kafka.utils.NetworkClientBlockingOps$.kafka$utils$NetworkClientBlockingOps$$pollContinuously$extension(NetworkClientBlockingOps.scala:143)
at kafka.utils.NetworkClientBlockingOps$.blockingSendAndReceive$extension(NetworkClientBlockingOps.scala:108)
at kafka.server.ReplicaFetcherThread.sendRequest(ReplicaFetcherThread.scala:253)
at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:238)
at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:42)
at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:118)
at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:103)
From zookeeper.log:
[2017-03-22 22:42:15,600] INFO Client attempting to establish new session at /10.141.202.141:59930 (org.apache.zookeeper.server.ZooKeeperServer)
[2017-03-22 22:42:15,603] INFO Established session 0x25af82f142c0000 with negotiated timeout 6000 for client /11.121.102.441:59930 (org.apache.zookeeper.server.ZooKeeperServer)
[2017-03-22 22:42:29,322] INFO Got user-level KeeperException when processing sessionid:0x25af82f142c0000 type:create cxid:0x3a8 zxid:0x1e0000001a txntype:-1 reqpath:n/a Error Path:/brokers Error:KeeperErrorCode = NodeExists for /brokers (org.apache.zookeeper.server.PrepRequestProcessor)
[2017-03-22 22:42:29,329] INFO Got user-level KeeperException when processing sessionid:0x25af82f142c0000 type:create cxid:0x3a9 zxid:0x1e0000001b txntype:-1 reqpath:n/a Error Path:/brokers/ids Error:KeeperErrorCode = NodeExists for /brokers/ids (org.apache.zookeeper.server.PrepRequestProcessor)
[2017-03-22 22:42:32,943] INFO Accepted socket connection from /17.150.218.7:58233 (org.apache.zookeeper.server.NIOServerCnxnFactory)
[2017-03-22 22:42:32,944] WARN Connection request from old client /17.150.218.7:58233; will be dropped if server is in r-o mode (org.apache.zookeeper.server.ZooKeeperServer)
[2017-03-22 22:42:32,944] INFO Client attempting to establish new session at /17.150.218.7:58233 (org.apache.zookeeper.server.ZooKeeperServer)
[2017-03-22 22:42:32,947] INFO Established session 0x25af82f142c0001 with negotiated timeout 30000 for client /17.121.102.241:58233 (org.apache.zookeeper.server.ZooKeeperServer)
[2017-03-22 22:42:33,402] INFO Processed session termination for sessionid: 0x25af82f142c0001 (org.apache.zookeeper.server.PrepRequestProcessor)
[2017-03-22 22:42:33,405] INFO Closed socket connection for client /17.150.218.7:58233 which had sessionid 0x25af82f142c0001 (org.apache.zookeeper.server.NIOServerCnxn)
Thanks

Related

kafka.zookeeper.ZooKeeperClientTimeoutException: Timed out waiting for connection while in state: CONNECTING with remote host

I installed ZooKeeper and one Kafka broker on one of my cloud server instances, and they work well there. But when a Kafka broker on my local PC tries to connect to that remote ZooKeeper server, it cannot reach the IP address and port. The firewall is inactive.
The summary is:
one zookeeper server - in cloud instance [146.646.64.66*]
one Kafka broker server - in cloud instance [146.646.64.66*]
two Kafka broker servers - on my local PC [localhost]
I have updated the zookeeper.connect property of the local Kafka broker server's property file as follows:
zookeeper.connect=146.646.64.66*:2181
The following is the error that the CMD shows:
[2021-06-17 19:47:01,443] INFO Initiating client connection, connectString=174.138.31.159:2181 sessionTimeout=18000 watcher=kafka.zookeeper.ZooKeeperClient$ZooKeeperClientWatcher$#6736fa8d (org.apache.zookeeper.ZooKeeper)
[2021-06-17 19:47:01,468] INFO jute.maxbuffer value is 4194304 Bytes (org.apache.zookeeper.ClientCnxnSocket)
[2021-06-17 19:47:01,545] INFO zookeeper.request.timeout value is 0. feature enabled= (org.apache.zookeeper.ClientCnxn)
[2021-06-17 19:47:01,553] INFO [ZooKeeperClient Kafka server] Waiting until connected. (kafka.zookeeper.ZooKeeperClient)
[2021-06-17 19:47:19,557] INFO [ZooKeeperClient Kafka server] Closing. (kafka.zookeeper.ZooKeeperClient)
[2021-06-17 19:47:21,663] INFO Opening socket connection to server 146.646.64.66*/146.646.64.66*:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2021-06-17 19:47:21,801] WARN Client session timed out, have not heard from server in 20251ms for sessionid 0x0 (org.apache.zookeeper.ClientCnxn)
[2021-06-17 19:47:21,929] INFO Session: 0x0 closed (org.apache.zookeeper.ZooKeeper)
[2021-06-17 19:47:21,929] INFO EventThread shut down for session: 0x0 (org.apache.zookeeper.ClientCnxn)
[2021-06-17 19:47:21,934] INFO [ZooKeeperClient Kafka server] Closed. (kafka.zookeeper.ZooKeeperClient)
[2021-06-17 19:47:21,944] ERROR Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
kafka.zookeeper.ZooKeeperClientTimeoutException: Timed out waiting for connection while in state: CONNECTING
at kafka.zookeeper.ZooKeeperClient.$anonfun$waitUntilConnected$3(ZooKeeperClient.scala:271)
at kafka.zookeeper.ZooKeeperClient.waitUntilConnected(ZooKeeperClient.scala:267)
at kafka.zookeeper.ZooKeeperClient.<init>(ZooKeeperClient.scala:125)
at kafka.zk.KafkaZkClient$.apply(KafkaZkClient.scala:1948)
at kafka.server.KafkaServer.createZkClient$1(KafkaServer.scala:431)
at kafka.server.KafkaServer.initZkClient(KafkaServer.scala:456)
at kafka.server.KafkaServer.startup(KafkaServer.scala:191)
at kafka.Kafka$.main(Kafka.scala:109)
at kafka.Kafka.main(Kafka.scala)
[2021-06-17 19:47:21,982] INFO shutting down (kafka.server.KafkaServer)
Please help me solve this problem.
Remove all cached log files, or change the log directory in the server.properties file that you are going to run. The data in the cached log files can be stale because of your server's history.
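Before changing anything else, it can be worth confirming that the connectString the broker logs at startup (174.138.31.159:2181 in the log above) is the address you intended to put in zookeeper.connect. The following is a minimal Python sketch that splits a connect string of the form host1:port1,host2:port2/chroot into its parts so they can be compared by eye. The function name parse_zk_connect is hypothetical, and IPv6 literals are not handled.

```python
def parse_zk_connect(connect):
    """Split 'host1:port1,host2:port2/chroot' into ([(host, port), ...], chroot).

    Illustrative helper only (not part of Kafka or ZooKeeper).
    A missing port falls back to 2181, the usual ZooKeeper client port.
    """
    hosts_part, sep, chroot = connect.partition("/")
    chroot = "/" + chroot if sep else ""
    servers = []
    for entry in hosts_part.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, _, port = entry.rpartition(":")
        if not host:  # no ':' in the entry, so only a host name was given
            host, port = entry, "2181"
        servers.append((host, int(port)))
    return servers, chroot
```

For example, parse_zk_connect("174.138.31.159:2181") yields one server entry and an empty chroot, which you can then compare against the value in server.properties.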

Getting fatal error showing time out exception when trying to run kafka server

[2021-04-08 02:53:18,713] INFO [ZooKeeperClient Kafka server] Closing. (kafka.zookeeper.ZooKeeperClient)
[2021-04-08 02:53:33,182] WARN Client session timed out, have not heard from server in 18000ms for sessionid 0x0 (org.apache.zookeeper.ClientCnxn)
[2021-04-08 02:53:33,288] INFO Session: 0x0 closed (org.apache.zookeeper.ZooKeeper)
[2021-04-08 02:53:33,288] INFO EventThread shut down for session: 0x0 (org.apache.zookeeper.ClientCnxn)
[2021-04-08 02:53:33,290] INFO [ZooKeeperClient Kafka server] Closed. (kafka.zookeeper.ZooKeeperClient)
[2021-04-08 02:53:33,295] ERROR Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
kafka.zookeeper.ZooKeeperClientTimeoutException: Timed out waiting for connection while in state: CONNECTING
at kafka.zookeeper.ZooKeeperClient.waitUntilConnected(ZooKeeperClient.scala:262)
at kafka.zookeeper.ZooKeeperClient.<init>(ZooKeeperClient.scala:119)
at kafka.zk.KafkaZkClient$.apply(KafkaZkClient.scala:1881)
at kafka.server.KafkaServer.createZkClient$1(KafkaServer.scala:441)
at kafka.server.KafkaServer.initZkClient(KafkaServer.scala:466)
at kafka.server.KafkaServer.startup(KafkaServer.scala:233)
at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:44)
at kafka.Kafka$.main(Kafka.scala:82)
at kafka.Kafka.main(Kafka.scala)
[2021-04-08 02:53:33,300] INFO shutting down (kafka.server.KafkaServer)
[2021-04-08 02:53:33,308] INFO App info kafka.server for 0 unregistered (org.apache.kafka.common.utils.AppInfoParser)
[2021-04-08 02:53:33,309] INFO shut down completed (kafka.server.KafkaServer)
[2021-04-08 02:53:33,310] ERROR Exiting Kafka. (kafka.server.KafkaServerStartable)
[2021-04-08 02:53:33,311] INFO shutting down (kafka.server.KafkaServer)
It seems that the ZooKeeper server is not started. Kafka needs ZooKeeper: you need to download and start ZooKeeper before starting the Kafka server.
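One quick way to test this is to check whether anything accepts TCP connections on the ZooKeeper client port before starting the broker. The following is a minimal Python sketch. The function name can_connect and the 3-second default timeout are assumptions for illustration. A successful connection only shows that something is listening on the port, not that ZooKeeper is healthy.

```python
import socket


def can_connect(host, port, timeout_s=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Calling can_connect("localhost", 2181) on the machine that should host ZooKeeper returns False when nothing is listening there, which would be consistent with the timeout in the log above.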

Kafka Zookeeper Random Restarts

We are running a Hyperledger Fabric network with Kafka and ZooKeeper in production, using Docker Swarm on Azure VMs (4 Kafka nodes, 3 ZooKeeper nodes). It was running fine, but two days ago ZooKeeper suddenly restarted, and since then it has kept restarting at intervals of 6-8 hours.
Logs on the Kafka node:
[2020-07-04 07:48:53,492] INFO [ReplicaFetcher replicaId=2, leaderId=1, fetcherId=0] Stopped (kafka.server.ReplicaFetcherThread)
[2020-07-04 07:48:53,492] INFO [ReplicaFetcher replicaId=2, leaderId=1, fetcherId=0] Shutdown completed (kafka.server.ReplicaFetcherThread)
[2020-07-04 07:48:53,499] INFO [ReplicaFetcherManager on broker 2] Removed fetcher for partitions xxxx-xxxxx-xxx-xxxxx.
ZooKeeper leader logs:
2020-07-04 07:46:27,070 [myid:3] - INFO [ProcessThread(sid:3 cport:-1)::PrepRequestProcessor#653] - Got user-level KeeperException when processing sessionid:0x10101beb22c0000 type:create cxid:0x4 zxid:0x2e00000114 txntype:-1 reqpath:n/a Error Path:/brokers/ids Error:KeeperErrorCode = NodeExists for /brokers/ids
2020-07-04 07:48:43,084 [myid:3] - INFO [SessionTracker:ZooKeeperServer#355] - Expiring session 0x2010551ef290000, timeout of 6000ms exceeded
2020-07-04 07:48:43,085 [myid:3] - INFO [ProcessThread(sid:3 cport:-1)::PrepRequestProcessor#487] - Processed session termination for sessionid: 0x2010551ef290000
2020-07-04 07:48:43,091 [myid:3] - INFO [CommitProcessor:3:NIOServerCnxn#1056] - Closed socket connection for client /100.0.20.80:60672 which had sessionid 0x2010551ef290000
2020-07-04 07:48:55,182 [myid:3] - ERROR [LearnerHandler-/100.0.20.80:58940:LearnerHandler#648] - Unexpected exception causing shutdown while sock still open
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:85)
at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:99)
at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:559)
2020-07-04 07:48:55,183 [myid:3] - WARN [LearnerHandler-/100.0.20.80:58940:LearnerHandler#661] - ******* GOODBYE /100.0.20.80:58940 ********
2020-07-04 07:49:57,623 [myid:3] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory#222] - Accepted socket connection from /100.0.20.80:37838
2020-07-04 07:49:57,637 [myid:3] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer#949] - Client attempting to establish new session at /100.0.20.80:37838
2020-07-04 07:49:57,641 [myid:3] - INFO [CommitProcessor:3:ZooKeeperServer#694] - Established session 0x300ed4720900000 with negotiated timeout 12000 for client /100.0.20.80:37838
2020-07-04 07:49:57,670 [myid:3] - INFO [ProcessThread(sid:3 cport:-1)::PrepRequestProcessor#653] - Got user-level KeeperException when processing sessionid:0x300ed4720900000 type:setData cxid:0x1 zxid:0x2e000003b2 txntype:-1 reqpath:n/a Error Path:/brokers/topics/xxxxxxxxxxxx/partitions/0/state Error:KeeperErrorCode = BadVersion for /brokers/topics/xxxxxxxxxxxx/partitions/0/state
My zoo.cfg:
clientPort=2181
dataDir=/data
dataLogDir=/datalog
tickTime=6000
initLimit=10
syncLimit=2
autopurge.snapRetainCount=3
autopurge.purgeInterval=1
server.1=xxx.xxx.com:2888:3888
server.2=xxx.xxx.com:2888:3888
server.3=0.0.0.0:2888:3888
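For reading this configuration, it may help to convert the tick-based settings into milliseconds. ZooKeeper expresses initLimit and syncLimit in ticks and, unless minSessionTimeout and maxSessionTimeout are set, bounds client session timeouts to between 2 and 20 ticks. The Python sketch below does that arithmetic. The function names are hypothetical, and the 6000 ms client request used in the usage note is an assumption, since the requested value does not appear in the logs.

```python
def parse_zoo_cfg(text):
    """Parse key=value lines of a zoo.cfg into a dict, skipping blanks and comments."""
    cfg = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        cfg[key.strip()] = value.strip()
    return cfg


def timing_summary(cfg):
    """Convert tick-based settings to milliseconds."""
    tick = int(cfg["tickTime"])
    return {
        # Defaults when not set explicitly: 2 ticks and 20 ticks.
        "min_session_timeout_ms": int(cfg.get("minSessionTimeout", 2 * tick)),
        "max_session_timeout_ms": int(cfg.get("maxSessionTimeout", 20 * tick)),
        "init_limit_ms": int(cfg["initLimit"]) * tick,
        "sync_limit_ms": int(cfg["syncLimit"]) * tick,
    }


def negotiated_timeout(requested_ms, summary):
    """Clamp a client's requested session timeout into the server's bounds."""
    low = summary["min_session_timeout_ms"]
    high = summary["max_session_timeout_ms"]
    return max(low, min(requested_ms, high))
```

With tickTime=6000, initLimit=10 and syncLimit=2, the summary gives a session timeout range of 12000 to 120000 ms, an initLimit of 60000 ms and a syncLimit of 12000 ms. A client asking for 6000 ms would be raised to 12000 ms, which is consistent with the "negotiated timeout 12000" line in the leader log.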

Kafka consumer can not read from producer

My Kafka consumer cannot read messages from the producer.
$ jps
31700 Kafka
11243 Jps
31517 QuorumPeerMain
Sending messages to the topic with the console producer:
Before edit:
$ bin/kafka-console-producer.sh --broker-list localhost:9092 --topic Hello_Kafka
this is my 1st message
this is my 2nd message
After edit:
$ bin/kafka-console-producer.sh --broker-list localhost:9092 --topic Hello-Kafka
this is my 1st message
this is my 2nd message
Then I open another terminal and try to read the data with the console consumer:
$ bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic Hello-Kafka --from-beginning
After this, I do not see any message on the terminal.
These are some logs which might help with debugging:
[2017-02-20 08:23:15,000] INFO Expiring session 0x15a49f1f2530020, timeout of 30000ms exceeded (org.apache.zookeeper.server.ZooKeeperServer)
[2017-02-20 08:23:15,000] INFO Expiring session 0x15a49f1f2530022, timeout of 30000ms exceeded (org.apache.zookeeper.server.ZooKeeperServer)
[2017-02-20 08:23:15,000] INFO Expiring session 0x15a49f1f2530021, timeout of 30000ms exceeded (org.apache.zookeeper.server.ZooKeeperServer)
[2017-02-20 08:23:15,000] INFO Processed session termination for sessionid: 0x15a49f1f2530020 (org.apache.zookeeper.server.PrepRequestProcessor)
[2017-02-20 08:23:15,000] INFO Processed session termination for sessionid: 0x15a49f1f2530022 (org.apache.zookeeper.server.PrepRequestProcessor)
[2017-02-20 08:23:15,001] INFO Processed session termination for sessionid: 0x15a49f1f2530021 (org.apache.zookeeper.server.PrepRequestProcessor)
[2017-02-20 08:26:48,764] INFO Accepted socket connection from /127.0.0.1:45806 (org.apache.zookeeper.server.NIOServerCnxnFactory)
[2017-02-20 08:26:48,766] INFO Client attempting to establish new session at /127.0.0.1:45806 (org.apache.zookeeper.server.ZooKeeperServer)
[2017-02-20 08:26:48,768] INFO Established session 0x15a49f1f2530024 with negotiated timeout 30000 for client /127.0.0.1:45806 (org.apache.zookeeper.server.ZooKeeperServer)
[2017-02-20 08:26:48,778] INFO Accepted socket connection from /127.0.0.1:45808 (org.apache.zookeeper.server.NIOServerCnxnFactory)
[2017-02-20 08:26:48,778] INFO Client attempting to establish new session at /127.0.0.1:45808 (org.apache.zookeeper.server.ZooKeeperServer)
[2017-02-20 08:26:48,779] INFO Established session 0x15a49f1f2530025 with negotiated timeout 30000 for client /127.0.0.1:45808 (org.apache.zookeeper.server.ZooKeeperServer)
[2017-02-20 08:26:48,783] INFO Accepted socket connection from /127.0.0.1:45810 (org.apache.zookeeper.server.NIOServerCnxnFactory)
[2017-02-20 08:26:48,783] INFO Client attempting to establish new session at /127.0.0.1:45810 (org.apache.zookeeper.server.ZooKeeperServer)
[2017-02-20 08:26:48,784] INFO Established session 0x15a49f1f2530026 with negotiated timeout 30000 for client /127.0.0.1:45810 (org.apache.zookeeper.server.ZooKeeperServer)
[2017-02-20 08:26:48,865] INFO Accepted socket connection from /127.0.0.1:45812 (org.apache.zookeeper.server.NIOServerCnxnFactory)
[2017-02-20 08:26:48,865] INFO Client attempting to establish new session at /127.0.0.1:45812 (org.apache.zookeeper.server.ZooKeeperServer)
[2017-02-20 08:26:48,867] INFO Established session 0x15a49f1f2530027 with negotiated timeout 6000 for client /127.0.0.1:45812 (org.apache.zookeeper.server.ZooKeeperServer)
[2017-02-20 08:26:48,930] INFO Got user-level KeeperException when processing sessionid:0x15a49f1f2530027 type:create cxid:0x2 zxid:0xa9 txntype:-1 reqpath:n/a Error Path:/consumers Error:KeeperErrorCode = NodeExists for /consumers (org.apache.zookeeper.server.PrepRequestProcessor)
[2017-02-20 08:27:50,978] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.GroupMetadataManager)
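The ZooKeeper lines above mix session setup and session expiry, which makes it hard to see which clients are still connected. The following is a minimal Python sketch that tabulates them per session id. The regular expressions are written against the line shapes shown in this post and are an assumption about the log format, and the function name summarize_sessions is hypothetical.

```python
import re

# Patterns modelled on the ZooKeeper server log lines quoted above.
ESTABLISHED = re.compile(
    r"Established session (0x[0-9a-f]+) with negotiated timeout (\d+) "
    r"for client /([\d.]+):(\d+)"
)
EXPIRING = re.compile(r"Expiring session (0x[0-9a-f]+), timeout of (\d+)ms exceeded")


def summarize_sessions(lines):
    """Map session id -> {'timeout_ms', 'client' (if seen), 'expired'}."""
    sessions = {}
    for line in lines:
        m = ESTABLISHED.search(line)
        if m:
            sid, timeout_ms, host, port = m.groups()
            entry = sessions.setdefault(sid, {"expired": False})
            entry["timeout_ms"] = int(timeout_ms)
            entry["client"] = f"{host}:{port}"
            continue
        m = EXPIRING.search(line)
        if m:
            sid, timeout_ms = m.groups()
            entry = sessions.setdefault(sid, {"expired": False})
            # Keep the negotiated value if the setup line was already seen.
            entry.setdefault("timeout_ms", int(timeout_ms))
            entry["expired"] = True
    return sessions
```

Feeding it the lines of the ZooKeeper log (for example via open(path) on your own log file) gives one entry per session, so expired sessions and their timeouts can be compared with the sessions that were established later.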

unable to start kafka server/broker

When starting the Kafka broker I get an error.
Here are the last few lines of the error log:
INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient)
[2016-05-12 01:07:01,759] INFO Log directory '/var/logs/service-bridge-logs' not found, creating it. (kafka.log.LogManager)
[2016-05-12 01:07:01,778] INFO Loading logs. (kafka.log.LogManager)
[2016-05-12 01:07:01,796] INFO Logs loading complete. (kafka.log.LogManager)
[2016-05-12 01:07:01,797] INFO Starting log cleanup with a period of 300000 ms. (kafka.log.LogManager)
[2016-05-12 01:07:01,806] INFO Starting log flusher with a default period of 9223372036854775807 ms. (kafka.log.LogManager)
[2016-05-12 01:07:01,874] INFO Awaiting socket connections on gns3-d.cloudapp.net:9092. (kafka.network.Acceptor)
[2016-05-12 01:07:01,875] INFO [Socket Server on Broker 2], Started (kafka.network.SocketServer)
[2016-05-12 01:07:02,042] INFO Will not load MX4J, mx4j-tools.jar is not in the classpath (kafka.utils.Mx4jLoader$)
[2016-05-12 01:07:02,168] INFO 2 successfully elected as leader (kafka.server.ZookeeperLeaderElector)
[2016-05-12 01:07:02,386] INFO Registered broker 2 at path /brokers/ids/2 with address 10.1.0.4:9092. (kafka.utils.ZkUtils$)
[2016-05-12 01:07:02,416] INFO [Kafka Server 2], started (kafka.server.KafkaServer)
[2016-05-12 01:07:02,529] INFO New leader is 2 (kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
[2016-05-12 01:07:25,798] ERROR Closing socket for /40.122.64.23 because of error (kafka.network.Processor)
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:197)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
at kafka.utils.Utils$.read(Utils.scala:380)
at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
at kafka.network.Processor.read(SocketServer.scala:444)
at kafka.network.Processor.run(SocketServer.scala:340)
at java.lang.Thread.run(Thread.java:745)
On the ZooKeeper side I am also getting a few errors:
INFO Established session 0x154a35b64f40000 with negotiated timeout 6000 for client /10.1.0.4:36673 (org.apache.zookeeper.server.ZooKeeperServer)
[2016-05-12 01:07:02,313] INFO Got user-level KeeperException when processing sessionid:0x154a35b64f40000 type:delete cxid:0x1d zxid:0x52 txntype:-1 reqpath:n/a Error Path:/admin/preferred_replica_election Error:KeeperErrorCode = NoNode for /admin/preferred_replica_election (org.apache.zookeeper.server.PrepRequestProcessor)
[2016-05-12 01:08:33,001] INFO Expiring session 0x154a35b64f40000, timeout of 6000ms exceeded (org.apache.zookeeper.server.ZooKeeperServer)
[2016-05-12 01:08:33,001] INFO Processed session termination for sessionid: 0x154a35b64f40000 (org.apache.zookeeper.server.PrepRequestProcessor)
[2016-05-12 01:08:33,017] INFO Closed socket connection for client /10.1.0.4:36673 which had sessionid 0x154a35b64f40000 (org.apache.zookeeper.server.NIOServerCnxn)
Any ideas?
Thanks in advance.
As user avr pointed out, this is a known bug in Kafka 0.8.2.x.
These are routine informational messages misclassified as ERROR.
As of Kafka v0.9.0.0, the log level has been corrected to WARN.