Zookeeper using high CPU - apache-kafka

I am new to ZooKeeper and have a single-node ZooKeeper setup on a system with 2 cores and 16 GB of RAM. There are about 700 active connections to ZooKeeper, and it is using 2 GB of RAM and 100% CPU (on both cores). I am unable to find out what is causing this high CPU utilization.
My ZooKeeper config is:
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# the directory where the snapshot is stored.
dataDir=/path
# the port at which the clients will connect
clientPort=2181
# disable the per-ip limit on the number of connections since this is a non-production config
maxClientCnxns=0
And the ZooKeeper logs are:
[2016-12-30 08:14:47,057] INFO Client attempting to establish new session at /10.179.63.152:43480 (org.apache.zookeeper.server.ZooKeeperServer)
[2016-12-30 08:14:47,107] INFO Established session 0x1594ae64cde0188 with negotiated timeout 10000 for client /xx.xxx.xx.xxx:43480 (org.apache.zookeeper.server.ZooKeeperServer)
[2016-12-30 08:14:48,099] INFO Got user-level KeeperException when processing sessionid:0x1594ae64cde0188 type:create cxid:0x2 zxid:0xa798c81b txntype:-1 reqpath:n/a Error Path:/consumers Error:KeeperErrorCode = NodeExists for /consumers (org.apache.zookeeper.server.PrepRequestProcessor)
[2016-12-30 08:14:48,100] INFO Got user-level KeeperException when processing sessionid:0x1594ae64cde0188 type:create cxid:0x3 zxid:0xa798c81c txntype:-1 reqpath:n/a Error Path:/consumers/prod Error:KeeperErrorCode = NodeExists for /consumers/prod (org.apache.zookeeper.server.PrepRequestProcessor)
[2016-12-30 08:14:48,101] INFO Got user-level KeeperException when processing sessionid:0x1594ae64cde0188 type:create cxid:0x4 zxid:0xa798c81d txntype:-1 reqpath:n/a Error Path:/consumers/prod/ids Error:KeeperErrorCode = NodeExists for /consumers/prod/ids (org.apache.zookeeper.server.PrepRequestProcessor)
[2016-12-30 08:15:25,273] INFO Accepted socket connection from /10.185.3.226:58898 (org.apache.zookeeper.server.NIOServerCnxnFactory)
[2016-12-30 08:15:25,284] INFO Client attempting to establish new session at /xx.xxx.xx.xxx:58898 (org.apache.zookeeper.server.ZooKeeperServer)
[2016-12-30 08:15:25,347] INFO Established session 0x1594ae64cde0189 with negotiated timeout 10000 for client /xx.xxx.xx.xxx:58898 (org.apache.zookeeper.server.ZooKeeperServer)
[2016-12-30 08:15:26,762] INFO Got user-level KeeperException when processing sessionid:0x1594ae64cde0189 type:create cxid:0x2 zxid:0xa798cb43 txntype:-1 reqpath:n/a Error Path:/consumers Error:KeeperErrorCode = NodeExists for /consumers (org.apache.zookeeper.server.PrepRequestProcessor)
[2016-12-30 08:15:26,762] INFO Got user-level KeeperException when processing sessionid:0x1594ae64cde0189 type:create cxid:0x3 zxid:0xa798cb44 txntype:-1 reqpath:n/a Error Path:/consumers/prod Error:KeeperErrorCode = NodeExists for /consumers/prod (org.apache.zookeeper.server.PrepRequestProcessor)
[2016-12-30 08:15:26,763] INFO Got user-level KeeperException when processing sessionid:0x1594ae64cde0189 type:create cxid:0x4 zxid:0xa798cb45 txntype:-1 reqpath:n/a Error Path:/consumers/prod/ids Error:KeeperErrorCode = NodeExists for /consumers/prod/ids (org.apache.zookeeper.server.PrepRequestProcessor)
[2016-12-30 08:15:56,786] INFO Accepted socket connection from /xx.xxx.xx.xxx:43574 (org.apache.zookeeper.server.NIOServerCnxnFactory)
[2016-12-30 08:15:56,790] INFO Client attempting to establish new session at /xx.xxx.xx.xxx:43574 (org.apache.zookeeper.server.ZooKeeperServer)
[2016-12-30 08:15:56,843] INFO Established session 0x1594ae64cde018a with negotiated timeout 10000 for client /10.185.178.18:43574 (org.apache.zookeeper.server.ZooKeeperServer)
[2016-12-30 08:15:57,743] INFO Got user-level KeeperException when processing sessionid:0x1594ae64cde018a type:create cxid:0x2 zxid:0xa798db9d txntype:-1 reqpath:n/a Error Path:/consumers Error:KeeperErrorCode = NodeExists for /consumers (org.apache.zookeeper.server.PrepRequestProcessor)
[2016-12-30 08:15:57,743] INFO Got user-level KeeperException when processing sessionid:0x1594ae64cde018a type:create cxid:0x3 zxid:0xa798db9e txntype:-1 reqpath:n/a Error Path:/consumers/prod Error:KeeperErrorCode = NodeExists for /consumers/prod (org.apache.zookeeper.server.PrepRequestProcessor)
[2016-12-30 08:15:57,744] INFO Got user-level KeeperException when processing sessionid:0x1594ae64cde018a type:create cxid:0x4 zxid:0xa798db9f txntype:-1 reqpath:n/a Error Path:/consumers/prod/ids Error:KeeperErrorCode = NodeExists for /consumers/prod/ids (org.apache.zookeeper.server.PrepRequestProcessor)
Can someone help me identify the problem?
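One way to narrow this down is to tally the "Got user-level KeeperException" lines, to see how many distinct sessions repeat the same create calls. The following is only a minimal sketch, assuming the log lines have the format shown above; the `summarize` function and the `KEEPER_EXC` pattern are illustrative names, not part of ZooKeeper or Kafka.

```python
import re
from collections import Counter

# Matches the "Got user-level KeeperException" lines in the format shown above.
KEEPER_EXC = re.compile(
    r"sessionid:(?P<session>0x[0-9a-f]+) type:(?P<op>\w+) .*?"
    r"Error Path:(?P<path>\S+) Error:KeeperErrorCode = (?P<code>\w+)"
)


def summarize(lines):
    """Count KeeperException lines by (operation, error code, path)
    and collect the distinct session ids that produced them."""
    counts = Counter()
    sessions = set()
    for line in lines:
        m = KEEPER_EXC.search(line)
        if m:
            counts[(m["op"], m["code"], m["path"])] += 1
            sessions.add(m["session"])
    return counts, sessions


# Two lines copied from the log excerpt above.
sample = [
    "[2016-12-30 08:14:48,099] INFO Got user-level KeeperException when processing sessionid:0x1594ae64cde0188 type:create cxid:0x2 zxid:0xa798c81b txntype:-1 reqpath:n/a Error Path:/consumers Error:KeeperErrorCode = NodeExists for /consumers (org.apache.zookeeper.server.PrepRequestProcessor)",
    "[2016-12-30 08:15:26,762] INFO Got user-level KeeperException when processing sessionid:0x1594ae64cde0189 type:create cxid:0x2 zxid:0xa798cb43 txntype:-1 reqpath:n/a Error Path:/consumers Error:KeeperErrorCode = NodeExists for /consumers (org.apache.zookeeper.server.PrepRequestProcessor)",
]

counts, sessions = summarize(sample)
for key, n in counts.most_common():
    print(n, key)
print("distinct sessions:", len(sessions))
```

Run against the full log file (for example `summarize(open(path))`, where `path` is wherever the log lives), a large number of distinct sessions each issuing the same few creates would suggest many short-lived clients connecting, rather than one busy client.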

Related

Zookeeper + BadVersion for /brokers/topics/topic_name/partitions/6/state

In the ZooKeeper log we see a huge number of lines with the same error:
BadVersion for /brokers/topics/topic_name/partitions/6/state
Example from the ZooKeeper log:
2022-03-04 03:09:23,503 [myid:3] - INFO [ProcessThread(sid:3 cport:-1)::PrepRequestProcessor#643] - Got user-level KeeperException when processing sessionid:0x27f4d8506b21199 type:setData cxid:0x25483f7 zxid:0x280109b155 txntype:-1 reqpath:n/a Error Path:/brokers/topics/my_first_car/partitions/6/state Error:KeeperErrorCode = BadVersion for /brokers/topics/my_first_car/partitions/6/state
Any idea what this error means?
Other similar posts: https://zookeeper.apache.org/doc/r3.2.2/api/org/apache/zookeeper/ZooKeeper.html
Distributed state-machine's zookeeper ensemble fails while processing parallel regions with error KeeperErrorCode = BadVersion

Kafka Zookeeper Random Restarts

We are running a Hyperledger Fabric network with Kafka and ZooKeeper in production using Docker Swarm on Azure VMs (4 Kafka nodes, 3 ZooKeeper nodes). It was running fine, but two days ago ZooKeeper suddenly restarted, and since then it has kept restarting at intervals of 6-8 hours.
Logs on the Kafka node:
[2020-07-04 07:48:53,492] INFO [ReplicaFetcher replicaId=2, leaderId=1, fetcherId=0] Stopped (kafka.server.ReplicaFetcherThread)
[2020-07-04 07:48:53,492] INFO [ReplicaFetcher replicaId=2, leaderId=1, fetcherId=0] Shutdown completed (kafka.server.ReplicaFetcherThread)
[2020-07-04 07:48:53,499] INFO [ReplicaFetcherManager on broker 2] Removed fetcher for partitions xxxx-xxxxx-xxx-xxxxx.
ZooKeeper leader logs:
2020-07-04 07:46:27,070 [myid:3] - INFO [ProcessThread(sid:3 cport:-1)::PrepRequestProcessor#653] - Got user-level KeeperException when processing sessionid:0x10101beb22c0000 type:create cxid:0x4 zxid:0x2e00000114 txntype:-1 reqpath:n/a Error Path:/brokers/ids Error:KeeperErrorCode = NodeExists for /brokers/ids
2020-07-04 07:48:43,084 [myid:3] - INFO [SessionTracker:ZooKeeperServer#355] - Expiring session 0x2010551ef290000, timeout of 6000ms exceeded
2020-07-04 07:48:43,085 [myid:3] - INFO [ProcessThread(sid:3 cport:-1)::PrepRequestProcessor#487] - Processed session termination for sessionid: 0x2010551ef290000
2020-07-04 07:48:43,091 [myid:3] - INFO [CommitProcessor:3:NIOServerCnxn#1056] - Closed socket connection for client /100.0.20.80:60672 which had sessionid 0x2010551ef290000
2020-07-04 07:48:55,182 [myid:3] - ERROR [LearnerHandler-/100.0.20.80:58940:LearnerHandler#648] - Unexpected exception causing shutdown while sock still open
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:85)
at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:99)
at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:559)
2020-07-04 07:48:55,183 [myid:3] - WARN [LearnerHandler-/100.0.20.80:58940:LearnerHandler#661] - ******* GOODBYE /100.0.20.80:58940 ********
2020-07-04 07:49:57,623 [myid:3] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory#222] - Accepted socket connection from /100.0.20.80:37838
2020-07-04 07:49:57,637 [myid:3] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer#949] - Client attempting to establish new session at /100.0.20.80:37838
2020-07-04 07:49:57,641 [myid:3] - INFO [CommitProcessor:3:ZooKeeperServer#694] - Established session 0x300ed4720900000 with negotiated timeout 12000 for client /100.0.20.80:37838
2020-07-04 07:49:57,670 [myid:3] - INFO [ProcessThread(sid:3 cport:-1)::PrepRequestProcessor#653] - Got user-level KeeperException when processing sessionid:0x300ed4720900000 type:setData cxid:0x1 zxid:0x2e000003b2 txntype:-1 reqpath:n/a Error Path:/brokers/topics/xxxxxxxxxxxx/partitions/0/state Error:KeeperErrorCode = BadVersion for /brokers/topics/xxxxxxxxxxxx/partitions/0/state
My zoo.cfg:
clientPort=2181
dataDir=/data
dataLogDir=/datalog
tickTime=6000
initLimit=10
syncLimit=2
autopurge.snapRetainCount=3
autopurge.purgeInterval=1
server.1=xxx.xxx.com:2888:3888
server.2=xxx.xxx.com:2888:3888
server.3=0.0.0.0:2888:3888
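For reading this zoo.cfg, it can help to convert the tick-based settings into milliseconds. The sketch below assumes ZooKeeper's documented defaults (minSessionTimeout = 2 × tickTime, maxSessionTimeout = 20 × tickTime, and initLimit/syncLimit counted in ticks) and that neither session bound is overridden elsewhere. The function names are illustrative, and the 6000 ms client request is a hypothetical value, not taken from the post.

```python
def zk_timeouts(tick_time_ms, init_limit, sync_limit):
    """Derive the millisecond values implied by a zoo.cfg,
    assuming the default min/max session timeout multipliers."""
    return {
        "min_session_timeout_ms": 2 * tick_time_ms,   # default minSessionTimeout
        "max_session_timeout_ms": 20 * tick_time_ms,  # default maxSessionTimeout
        "init_limit_ms": init_limit * tick_time_ms,   # initLimit is in ticks
        "sync_limit_ms": sync_limit * tick_time_ms,   # syncLimit is in ticks
    }


def negotiated_timeout(requested_ms, tick_time_ms):
    """Clamp a client's requested session timeout to the server's bounds."""
    bounds = zk_timeouts(tick_time_ms, 0, 0)
    return max(bounds["min_session_timeout_ms"],
               min(requested_ms, bounds["max_session_timeout_ms"]))


# Values from the zoo.cfg above.
cfg = zk_timeouts(tick_time_ms=6000, init_limit=10, sync_limit=2)
print(cfg)

# Hypothetical client asking for a 6000 ms session timeout.
print(negotiated_timeout(6000, 6000))
```

With tickTime=6000 the smallest session timeout a server with this config would grant is 12000 ms, which is consistent with the "negotiated timeout 12000" line in the leader log, and followers get 12 seconds (syncLimit) to stay in sync with the leader.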

Error when starting Kafka server in Ubuntu linux

I have been trying to set up Kafka on Ubuntu according to the quickstart guide, but I am having an issue with the Kafka server. I can start ZooKeeper without problems; the problem comes when starting the server with the following command:
bin/kafka-server-start.sh config/server.properties
In the terminal of the server I get the following error:
[2019-07-09 21:30:24,997] WARN [Controller id=0, targetBrokerId=0] Error connecting to node 0.0.0.18:9092 (id: 0 rack: null) (org.apache.kafka.clients.NetworkClient)
java.net.SocketException: Invalid argument
at sun.nio.ch.Net.connect0(Native Method)
at sun.nio.ch.Net.connect(Net.java:454)
at sun.nio.ch.Net.connect(Net.java:446)
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:648)
at org.apache.kafka.common.network.Selector.doConnect(Selector.java:278)
at org.apache.kafka.common.network.Selector.connect(Selector.java:256)
at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:920)
at org.apache.kafka.clients.NetworkClient.ready(NetworkClient.java:287)
at org.apache.kafka.clients.NetworkClientUtils.awaitReady(NetworkClientUtils.java:65)
at kafka.controller.RequestSendThread.brokerReady(ControllerChannelManager.scala:282)
at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:236)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82)
[2019-07-09 21:30:25,026] INFO [ExpirationReaper-0-ElectPreferredLeader]: Stopped (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
[2019-07-09 21:30:25,026] INFO [ExpirationReaper-0-ElectPreferredLeader]: Shutdown completed (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
[2019-07-09 21:30:25,029] INFO [ReplicaManager broker=0] Shut down completely (kafka.server.ReplicaManager)
In the Zookeeper terminal I get the following:
INFO Got user-level KeeperException when processing sessionid:0x100004ba9750000 type:create cxid:0x2 zxid:0x3 txntype:-1 reqpath:n/a Error Path:/brokers Error:KeeperErrorCode = NoNode for /brokers (org.apache.zookeeper.server.PrepRequestProcessor)
[2019-07-09 20:46:41,080] INFO Got user-level KeeperException when processing sessionid:0x100004ba9750000 type:create cxid:0x6 zxid:0x7 txntype:-1 reqpath:n/a Error Path:/config Error:KeeperErrorCode = NoNode for /config (org.apache.zookeeper.server.PrepRequestProcessor)
[2019-07-09 20:46:41,087] INFO Got user-level KeeperException when processing sessionid:0x100004ba9750000 type:create cxid:0x9 zxid:0xa txntype:-1 reqpath:n/a Error Path:/admin Error:KeeperErrorCode = NoNode for /admin (org.apache.zookeeper.server.PrepRequestProcessor)
[2019-07-09 20:46:41,233] INFO Got user-level KeeperException when processing sessionid:0x100004ba9750000 type:create cxid:0x15 zxid:0x15 txntype:-1 reqpath:n/a Error Path:/cluster Error:KeeperErrorCode = NoNode for /cluster (org.apache.zookeeper.server.PrepRequestProcessor)
[2019-07-09 20:46:41,803] INFO Got user-level KeeperException when processing sessionid:0x100004ba9750000 type:multi cxid:0x32 zxid:0x1c txntype:-1 reqpath:n/a aborting remaining multi ops. Error Path:/admin/preferred_replica_election Error:KeeperErrorCode = NoNode for /admin/preferred_replica_election (org.apache.zookeeper.server.PrepRequestProcessor)
[2019-07-09 21:29:00,344] WARN Unable to read additional data from client sessionid 0x100004ba9750000, likely client has closed socket (org.apache.zookeeper.server.NIOServerCnxn)
[2019-07-09 21:29:00,348] INFO Closed socket connection for client /127.0.0.1:39738 which had sessionid 0x100004ba9750000 (org.apache.zookeeper.server.NIOServerCnxn)
What could be the issue?
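One thing worth checking: 0.0.0.18 is not a usable address. A possible explanation, which is only an assumption because the post shows neither the machine's host name nor server.properties, is that a purely numeric host name such as `18` was interpreted as a 32-bit integer address. The sketch below only illustrates that integer-to-address mapping; `numeric_host_as_ipv4` is an illustrative helper, not a Kafka or Java API.

```python
import ipaddress


def numeric_host_as_ipv4(host):
    """Show which dotted IPv4 address a purely numeric host name maps to
    when it is read as a single 32-bit integer."""
    return str(ipaddress.IPv4Address(int(host)))


# Hypothetical host name "18"; the real host name is not given in the post.
print(numeric_host_as_ipv4("18"))
```

If the host name does turn out to be numeric, giving the machine a non-numeric name or setting an explicit address in the broker's listener configuration would be the things to try.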

kafka 0.10.1.1 stops responding sporadically

We use Kafka 0.10.1.1. It runs fine for a few hours, sometimes a few days, and then all of a sudden it starts throwing the exception below and the brokers lose their connection to each other. The ZooKeeper and Kafka server processes are still running, but they do not accept any connections.
We are running Kafka and ZooKeeper on the same nodes, and it's a 2-node cluster. This is just our dev environment.
The exception below was observed in the server log.
[2017-03-23 14:05:52,729] WARN [ReplicaFetcherThread-0-38], Error in fetch kafka.server.ReplicaFetcherThread$FetchRequest#1893e027 (kafka.server.ReplicaFetcherThread)
java.io.IOException: Connection to 2 was disconnected before the response was read
at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1$$anonfun$apply$1.apply(NetworkClientBlockingOps.scala:115)
at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1$$anonfun$apply$1.apply(NetworkClientBlockingOps.scala:112)
at scala.Option.foreach(Option.scala:257)
at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1.apply(NetworkClientBlockingOps.scala:112)
at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1.apply(NetworkClientBlockingOps.scala:108)
at kafka.utils.NetworkClientBlockingOps$.recursivePoll$1(NetworkClientBlockingOps.scala:137)
at kafka.utils.NetworkClientBlockingOps$.kafka$utils$NetworkClientBlockingOps$$pollContinuously$extension(NetworkClientBlockingOps.scala:143)
at kafka.utils.NetworkClientBlockingOps$.blockingSendAndReceive$extension(NetworkClientBlockingOps.scala:108)
at kafka.server.ReplicaFetcherThread.sendRequest(ReplicaFetcherThread.scala:253)
at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:238)
at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:42)
at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:118)
at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:103)
From Zookeeper.log,
[2017-03-22 22:42:15,600] INFO Client attempting to establish new session at /10.141.202.141:59930 (org.apache.zookeeper.server.ZooKeeperServer)
[2017-03-22 22:42:15,603] INFO Established session 0x25af82f142c0000 with negotiated timeout 6000 for client /11.121.102.441:59930 (org.apache.zookeeper.server.ZooKeeperServer)
[2017-03-22 22:42:29,322] INFO Got user-level KeeperException when processing sessionid:0x25af82f142c0000 type:create cxid:0x3a8 zxid:0x1e0000001a txntype:-1 reqpath:n/a Error Path:/brokers Error:KeeperErrorCode = NodeExists for /brokers (org.apache.zookeeper.server.PrepRequestProcessor)
[2017-03-22 22:42:29,329] INFO Got user-level KeeperException when processing sessionid:0x25af82f142c0000 type:create cxid:0x3a9 zxid:0x1e0000001b txntype:-1 reqpath:n/a Error Path:/brokers/ids Error:KeeperErrorCode = NodeExists for /brokers/ids (org.apache.zookeeper.server.PrepRequestProcessor)
[2017-03-22 22:42:32,943] INFO Accepted socket connection from /17.150.218.7:58233 (org.apache.zookeeper.server.NIOServerCnxnFactory)
[2017-03-22 22:42:32,944] WARN Connection request from old client /17.150.218.7:58233; will be dropped if server is in r-o mode (org.apache.zookeeper.server.ZooKeeperServer)
[2017-03-22 22:42:32,944] INFO Client attempting to establish new session at /17.150.218.7:58233 (org.apache.zookeeper.server.ZooKeeperServer)
[2017-03-22 22:42:32,947] INFO Established session 0x25af82f142c0001 with negotiated timeout 30000 for client /17.121.102.241:58233 (org.apache.zookeeper.server.ZooKeeperServer)
[2017-03-22 22:42:33,402] INFO Processed session termination for sessionid: 0x25af82f142c0001 (org.apache.zookeeper.server.PrepRequestProcessor)
[2017-03-22 22:42:33,405] INFO Closed socket connection for client /17.150.218.7:58233 which had sessionid 0x25af82f142c0001 (org.apache.zookeeper.server.NIOServerCnxn)
Thanks

unable to start kafka server/broker

When starting the Kafka broker I am getting an error.
Here are the last few lines of the error log:
INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient)
[2016-05-12 01:07:01,759] INFO Log directory '/var/logs/service-bridge-logs' not found, creating it. (kafka.log.LogManager)
[2016-05-12 01:07:01,778] INFO Loading logs. (kafka.log.LogManager)
[2016-05-12 01:07:01,796] INFO Logs loading complete. (kafka.log.LogManager)
[2016-05-12 01:07:01,797] INFO Starting log cleanup with a period of 300000 ms. (kafka.log.LogManager)
[2016-05-12 01:07:01,806] INFO Starting log flusher with a default period of 9223372036854775807 ms. (kafka.log.LogManager)
[2016-05-12 01:07:01,874] INFO Awaiting socket connections on gns3-d.cloudapp.net:9092. (kafka.network.Acceptor)
[2016-05-12 01:07:01,875] INFO [Socket Server on Broker 2], Started (kafka.network.SocketServer)
[2016-05-12 01:07:02,042] INFO Will not load MX4J, mx4j-tools.jar is not in the classpath (kafka.utils.Mx4jLoader$)
[2016-05-12 01:07:02,168] INFO 2 successfully elected as leader (kafka.server.ZookeeperLeaderElector)
[2016-05-12 01:07:02,386] INFO Registered broker 2 at path /brokers/ids/2 with address 10.1.0.4:9092. (kafka.utils.ZkUtils$)
[2016-05-12 01:07:02,416] INFO [Kafka Server 2], started (kafka.server.KafkaServer)
[2016-05-12 01:07:02,529] INFO New leader is 2 (kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
[2016-05-12 01:07:25,798] ERROR Closing socket for /40.122.64.23 because of error (kafka.network.Processor)
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
at sun.nio.ch.IOUtil.read(IOUtil.java:197)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
at kafka.utils.Utils$.read(Utils.scala:380)
at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
at kafka.network.Processor.read(SocketServer.scala:444)
at kafka.network.Processor.run(SocketServer.scala:340)
at java.lang.Thread.run(Thread.java:745)
On the ZooKeeper side I am also getting a few errors:
INFO Established session 0x154a35b64f40000 with negotiated timeout 6000 for client /10.1.0.4:36673 (org.apache.zookeeper.server.ZooKeeperServer)
[2016-05-12 01:07:02,313] INFO Got user-level KeeperException when processing sessionid:0x154a35b64f40000 type:delete cxid:0x1d zxid:0x52 txntype:-1 reqpath:n/a Error Path:/admin/preferred_replica_election Error:KeeperErrorCode = NoNode for /admin/preferred_replica_election (org.apache.zookeeper.server.PrepRequestProcessor)
[2016-05-12 01:08:33,001] INFO Expiring session 0x154a35b64f40000, timeout of 6000ms exceeded (org.apache.zookeeper.server.ZooKeeperServer)
[2016-05-12 01:08:33,001] INFO Processed session termination for sessionid: 0x154a35b64f40000 (org.apache.zookeeper.server.PrepRequestProcessor)
[2016-05-12 01:08:33,017] INFO Closed socket connection for client /10.1.0.4:36673 which had sessionid 0x154a35b64f40000 (org.apache.zookeeper.server.NIOServerCnxn)
Any ideas?
Thanks in advance.
As user avr pointed out, this is a known bug in Kafka 0.8.2.x.
The "Closing socket ... because of error" entries are routine informational messages misclassified as ERROR.
As of Kafka v0.9.0.0, the log level has been corrected to WARN.