I would appreciate your help on this.
I am building an Apache Kafka consumer to subscribe to an already running Kafka broker. My problem is that when my producer pushes messages to the server, my consumer does not receive them, and I see the following info printed in my logs:
13/08/30 18:00:58 INFO producer.SyncProducer: Connected to xx.xx.xx.xx:6667:false for producing
13/08/30 18:00:58 INFO producer.SyncProducer: Disconnecting from xx.xx.xx.xx:6667:false
13/08/30 18:00:58 INFO consumer.ConsumerFetcherManager: [ConsumerFetcherManager-1377910855898] Stopping leader finder thread
13/08/30 18:00:58 INFO consumer.ConsumerFetcherManager: [ConsumerFetcherManager-1377910855898] Stopping all fetchers
13/08/30 18:00:58 INFO consumer.ConsumerFetcherManager: [ConsumerFetcherManager-1377910855898] All connections stopped
I am not sure if I am missing any important configuration here. However, I can see some messages coming from my server in Wireshark, but they are not getting consumed by my consumer.
My code is an exact replica of the sample Consumer Group Example:
https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example
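In essence it boils down to the high-level consumer from that page; a condensed sketch looks roughly like this (the ZooKeeper address, topic name, and group id below are placeholders):

import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class SimpleHighLevelConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "xx.xx.xx.xx:2181"); // placeholder ZooKeeper address
        props.put("group.id", "test-group");                // placeholder consumer group
        props.put("zookeeper.session.timeout.ms", "400");
        props.put("zookeeper.sync.time.ms", "200");
        props.put("auto.commit.interval.ms", "1000");

        ConsumerConnector consumer =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

        // request a single stream for the topic
        Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                consumer.createMessageStreams(Collections.singletonMap("my-topic", 1));

        ConsumerIterator<byte[], byte[]> it = streams.get("my-topic").get(0).iterator();
        while (it.hasNext()) {
            System.out.println(new String(it.next().message()));
        }
    }
}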
UPDATE:
[2013-09-03 00:57:30,146] INFO Starting ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)
[2013-09-03 00:57:30,146] INFO Opening socket connection to server /xx.xx.xx.xx:2181 (org.apache.zookeeper.ClientCnxn)
[2013-09-03 00:57:30,235] INFO Connected to xx.xx.xx:6667 for producing (kafka.producer.SyncProducer)
[2013-09-03 00:57:30,299] INFO Socket connection established to 10.224.62.212/10.224.62.212:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2013-09-03 00:57:30,399] INFO Disconnecting from xx.xx.xx.net:6667 (kafka.producer.SyncProducer)
[2013-09-03 00:57:30,400] INFO [ConsumerFetcherManager-1378195030845] Stopping leader finder thread (kafka.consumer.ConsumerFetcherManager)
[2013-09-03 00:57:30,400] INFO [ConsumerFetcherManager-1378195030845] Stopping all fetchers (kafka.consumer.ConsumerFetcherManager)
[2013-09-03 00:57:30,400] INFO [ConsumerFetcherManager-1378195030845] All connections stopped (kafka.consumer.ConsumerFetcherManager)
[2013-09-03 00:57:30,400] INFO [console-consumer-49997_xx.xx.xx-1378195030443-cce6fc51], Cleared all relevant queues for this fetcher (kafka.consumer.ZookeeperConsumerConnector)
[2013-09-03 00:57:30,400] INFO [console-consumer-49997_xx.xx.xx.-1378195030443-cce6fc51], Cleared the data chunks in all the consumer message iterators (kafka.consumer.ZookeeperConsumerConnector)
[2013-09-03 00:57:30,400] INFO [console-consumer-49997_xx.xx.xx.xx-1378195030443-cce6fc51], Committing all offsets after clearing the fetcher queues (kafka.consumer.ZookeeperConsumerConnector)
[2013-09-03 00:57:30,401] ERROR [console-consumer-49997_xx.xx.xx.xx-1378195030443-cce6fc51], zk client is null. Cannot commit offsets (kafka.consumer.ZookeeperConsumerConnector)
[2013-09-03 00:57:30,401] INFO [console-consumer-49997_xx.xx.xx.xx-1378195030443-cce6fc51], Releasing partition ownership (kafka.consumer.ZookeeperConsumerConnector)
[2013-09-03 00:57:30,401] INFO [console-consumer-49997_xx.xx.xx.xx-1378195030443-cce6fc51], exception during rebalance (kafka.consumer.ZookeeperConsumerConnector)
java.util.NoSuchElementException: None.get
at scala.None$.get(Option.scala:185)
at scala.None$.get(Option.scala:183)
at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anonfun$kafka$consumer$ZookeeperConsumerConnector$ZKRebalancerListener$$rebalance$2.apply(ZookeeperConsumerConnector.scala:434)
at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anonfun$kafka$consumer$ZookeeperConsumerConnector$ZKRebalancerListener$$rebalance$2.apply(ZookeeperConsumerConnector.scala:429)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:80)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:80)
at scala.collection.Iterator$class.foreach(Iterator.scala:631)
at scala.collection.mutable.HashTable$$anon$1.foreach(HashTable.scala:161)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:194)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:80)
at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.kafka$consumer$ZookeeperConsumerConnector$ZKRebalancerListener$$rebalance(ZookeeperConsumerConnector.scala:429)
at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anonfun$syncedRebalance$1.apply$mcVI$sp(ZookeeperConsumerConnector.scala:374)
at scala.collection.immutable.Range$ByOne$class.foreach$mVc$sp(Range.scala:282)
at scala.collection.immutable.Range$$anon$2.foreach$mVc$sp(Range.scala:265)
at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(ZookeeperConsumerConnector.scala:369)
at kafka.consumer.ZookeeperConsumerConnector.kafka$consumer$ZookeeperConsumerConnector$$reinitializeConsumer(ZookeeperConsumerConnector.scala:681)
at kafka.consumer.ZookeeperConsumerConnector$WildcardStreamsHandler.<init>(ZookeeperConsumerConnector.scala:715)
at kafka.consumer.ZookeeperConsumerConnector.createMessageStreamsByFilter(ZookeeperConsumerConnector.scala:140)
at kafka.consumer.ConsoleConsumer$.main(ConsoleConsumer.scala:196)
at kafka.consumer.ConsoleConsumer.main(ConsoleConsumer.scala)
Can you please provide a sample of your producer code?
Do you have the latest 0.8 version checked out? It appears there has been a known issue with a ConsumerFetcher deadlock, which has been patched and fixed in the current version.
You can also try using the console consumer script shipped with Kafka to consume messages, to make sure your producer side is working fine.
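For example, with the 0.8 console consumer (adjust the ZooKeeper address and topic name to yours):
bin/kafka-console-consumer.sh --zookeeper xx.xx.xx.xx:2181 --topic your-topic --from-beginning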
If possible, post some more logs and a code snippet; that should help with debugging further.
(It seems I need more reputation to make a comment, so I had to answer instead.)
Related
I have an Apache Kafka Streams application. I notice that it sometimes shuts down when a rebalance occurs, with no real reason for the shutdown. It doesn't even throw an exception.
Here are some logs from one such occurrence:
[2022-03-08 17:13:37,024] INFO [Consumer clientId=svc-stream-collector-StreamThread-1-consumer, groupId=svc-stream-collector] Adding newly assigned partitions: (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2022-03-08 17:13:37,024] ERROR stream-thread [svc-stream-collector-StreamThread-1] A Kafka Streams client in this Kafka Streams application is requesting to shutdown the application (org.apache.kafka.streams.processor.internals.StreamThread)
[2022-03-08 17:13:37,030] INFO stream-client [svc-stream-collector] State transition from REBALANCING to PENDING_ERROR (org.apache.kafka.streams.KafkaStreams)
old state:REBALANCING new state:PENDING_ERROR
[2022-03-08 17:13:37,031] INFO [Consumer clientId=svc-stream-collector-StreamThread-1-consumer, groupId=svc-stream-collector] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
[2022-03-08 17:13:37,032] INFO stream-thread [svc-stream-collector-StreamThread-1] Informed to shut down (org.apache.kafka.streams.processor.internals.StreamThread)
[2022-03-08 17:13:37,032] INFO stream-thread [svc-stream-collector-StreamThread-1] State transition from PARTITIONS_REVOKED to PENDING_SHUTDOWN (org.apache.kafka.streams.processor.internals.StreamThread)
[2022-03-08 17:13:37,067] INFO stream-thread [svc-stream-collector-StreamThread-1] Thread state is already PENDING_SHUTDOWN, skipping the run once call after poll request (org.apache.kafka.streams.processor.internals.StreamThread)
[2022-03-08 17:13:37,067] WARN stream-thread [svc-stream-collector-StreamThread-1] Detected that shutdown was requested. All clients in this app will now begin to shutdown (org.apache.kafka.streams.processor.internals.StreamThread)
I suspect it's because there are no newly assigned partitions in the log line below:
[2022-03-08 17:13:37,024] INFO [Consumer clientId=svc-stream-collector-StreamThread-1-consumer, groupId=svc-stream-collector] Adding newly assigned partitions: (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
However, I'm not exactly sure why this error occurs. Any help would be appreciated.
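From what I've read so far, the "requesting to shutdown the application" line seems to be logged when some member of the same Streams application asks for SHUTDOWN_APPLICATION (for example from a StreamsUncaughtExceptionHandler), so the real exception may be logged on a different instance or thread. In case it matters, a handler that logs the cause before choosing a response would look roughly like this (assuming Kafka Streams 2.8+; topology, props and names are placeholders):

import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.errors.StreamsUncaughtExceptionHandler.StreamThreadExceptionResponse;

// 'topology' and 'props' are assumed to be built elsewhere
KafkaStreams streams = new KafkaStreams(topology, props);
streams.setUncaughtExceptionHandler(exception -> {
    // surface the real cause before the application reacts to it
    exception.printStackTrace();
    return StreamThreadExceptionResponse.REPLACE_THREAD; // or SHUTDOWN_CLIENT / SHUTDOWN_APPLICATION
});
streams.start();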
I am a beginner and I have to use Kafka for data transfer into/from Hadoop FS (or any other application, not just through the put or copyFromLocal commands). Kafka needs ZooKeeper as well; I enabled ZooKeeper audit logging, but I still get errors.
When I want to start Kafka:
JMX_PORT=8004 bin/kafka-server-start.sh config/server.properties
I get the error:
[2022-02-16 13:56:45,939] INFO shutting down (kafka.server.KafkaServer)
[2022-02-16 13:56:46,114] INFO App info kafka.server for 0 unregistered (org.apache.kafka.common.utils.AppInfoParser)
[2022-02-16 13:56:46,133] INFO shut down completed (kafka.server.KafkaServer)
[2022-02-16 13:56:46,133] ERROR Exiting Kafka. (kafka.Kafka$)
[2022-02-16 13:56:46,165] INFO shutting down (kafka.server.KafkaServer)
And when I want to start Zookeeper using the command:
bin/zookeeper-server-start.sh config/zookeeper.properties
I get the following (and it gets stuck on it):
[2022-02-16 14:03:13,954] INFO zookeeper.request_throttler.shutdownTimeout = 10000 (org.apache.zookeeper.server.RequestThrottler)
[2022-02-16 14:03:13,955] INFO PrepRequestProcessor (sid:0) started, reconfigEnabled=false (org.apache.zookeeper.server.PrepRequestProcessor)
[2022-02-16 14:03:14,136] INFO Using checkIntervalMs=60000 maxPerMinute=10000 maxNeverUsedIntervalMs=0 (org.apache.zookeeper.server.ContainerManager)
[2022-02-16 14:03:14,138] INFO ZooKeeper audit is disabled. (org.apache.zookeeper.audit.ZKAuditProvider)
Does anyone know how to work this out? I enabled audit logging, but the problem stays the same.
ZooKeeper isn't "stuck"; it's running in the foreground, waiting for connections.
Open a new terminal and start Kafka there, keeping ZooKeeper running in the first one.
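Using the same commands from the question, the sequence would be:
# terminal 1: start ZooKeeper first and leave it running in the foreground
bin/zookeeper-server-start.sh config/zookeeper.properties
# terminal 2: then start the Kafka broker
JMX_PORT=8004 bin/kafka-server-start.sh config/server.properties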
Alternatively, you could use Docker Compose / Kubernetes if you think your host / local JVM is causing issues.
Apache Kafka is failing to start. It shows in its logs "Failed to get cluster id from Zookeeper. This can happen if /cluster/id is deleted from Zookeeper."
How can I check the "/cluster/id"?
Previously, when Kafka failed to start, it was because I had updated Java on my server, so I needed to point $JAVA_HOME to the new path and restart Kafka, and it would work again just fine. But this case is different: I checked $JAVA_HOME and it's correct. So I wanted to know more details about this issue. I read one of the log files in /opt/kafka/logs and found this:
[2019-08-21 10:18:40,472] ERROR Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
org.apache.kafka.common.KafkaException: Failed to get cluster id from Zookeeper. This can happen if /cluster/id is deleted from Zookeeper.
at kafka.zk.KafkaZkClient.$anonfun$createOrGetClusterId$1(KafkaZkClient.scala:1498)
at scala.Option.getOrElse(Option.scala:138)
at kafka.zk.KafkaZkClient.createOrGetClusterId(KafkaZkClient.scala:1498)
at kafka.server.KafkaServer.$anonfun$getOrGenerateClusterId$1(KafkaServer.scala:390)
at scala.Option.getOrElse(Option.scala:138)
at kafka.server.KafkaServer.getOrGenerateClusterId(KafkaServer.scala:390)
at kafka.server.KafkaServer.startup(KafkaServer.scala:208)
at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:38)
at kafka.Kafka$.main(Kafka.scala:75)
at kafka.Kafka.main(Kafka.scala)
[2019-08-21 10:18:40,476] INFO shutting down (kafka.server.KafkaServer)
[2019-08-21 10:18:40,488] INFO [ZooKeeperClient] Closing. (kafka.zookeeper.ZooKeeperClient)
[2019-08-21 10:18:40,500] INFO Session: 0x100494a8714000a closed (org.apache.zookeeper.ZooKeeper)
[2019-08-21 10:18:40,505] INFO EventThread shut down for session: 0x100494a8714000a (org.apache.zookeeper.ClientCnxn)
[2019-08-21 10:18:40,507] INFO [ZooKeeperClient] Closed. (kafka.zookeeper.ZooKeeperClient)
[2019-08-21 10:18:40,517] INFO shut down completed (kafka.server.KafkaServer)
[2019-08-21 10:18:40,517] ERROR Exiting Kafka. (kafka.server.KafkaServerStartable)
[2019-08-21 10:18:40,554] INFO shutting down (kafka.server.KafkaServer)
When I read the error above,
Failed to get cluster id from Zookeeper. This can happen if /cluster/id is deleted from Zookeeper.
I thought that the ZooKeeper id in /tmp/zookeeper/myid had been deleted by mistake, but the file is still there with the corresponding server number written in it, and ZooKeeper is working fine. My Kafka version is 2.2.0.
I searched online for the error "Failed to get cluster id from Zookeeper. This can happen if /cluster/id is deleted from Zookeeper." but honestly I did not find anything helpful.
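For reference, /cluster/id is a znode stored inside ZooKeeper itself, not a file on disk like the myid file, so it can presumably be inspected with the zookeeper-shell script that ships with Kafka (the ZooKeeper address below is a placeholder):
bin/zookeeper-shell.sh localhost:2181 get /cluster/id
# if present, it should print something like {"version":"1","id":"<cluster id>"}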
I previously had a 2-broker Kafka 0.10.1 cluster running correctly on my development servers with ZooKeeper 3.3.6.
I recently tried upgrading the broker version to the latest Kafka 2.3.0, but it didn't start.
Not much has changed in the configuration.
Can anybody point me to what could possibly be going wrong? Why are the brokers not starting?
Changed server.properties on broker server 1
broker.id=1
log.dirs=/mnt/kafka_2.11-2.3.0/logs
zookeeper.connect=local1:2181,local2:2181
listeners=PLAINTEXT://local1:9092
advertised.listeners=PLAINTEXT://local1:9092
Changed server.properties on broker server 2
broker.id=2
log.dirs=/mnt/kafka_2.11-2.3.0/logs
zookeeper.connect=local1:2181,local2:2181
listeners=PLAINTEXT://local2:9092
advertised.listeners=PLAINTEXT://local2:9092
NOTE:
1. ZooKeeper is running on both servers.
2. Kafka directories, namely /brokers, /brokers/ids, /consumers etc., are getting created.
3. Nothing is getting registered under /brokers/ids. ZooKeeper CLI get /brokers/ids returns
[]
4. The command lsof -i tcp:9092 returns
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 18290 cass 118u IPv6 52133 0t0 TCP local2:9092 (LISTEN)
5. logs/server.log has no errors logged.
6. No more logs are getting appended to server.log.
Server logs
[2019-07-01 10:56:14,534] INFO Starting log flusher with a default period of 9223372036854775807 ms. (kafka.log.LogManager)
[2019-07-01 10:56:14,801] INFO Awaiting socket connections on local2:9092. (kafka.network.Acceptor)
[2019-07-01 10:56:14,829] INFO [SocketServer brokerId=1] Created data-plane acceptor and processors for endpoint : EndPoint(local2,9092,ListenerName(PLAINTEXT),PLAINTEXT) (kafka.network.SocketServer)
[2019-07-01 10:56:14,830] INFO [SocketServer brokerId=1] Started 1 acceptor threads for data-plane (kafka.network.SocketServer)
[2019-07-01 10:56:14,850] INFO [ExpirationReaper-1-Produce]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
[2019-07-01 10:56:14,851] INFO [ExpirationReaper-1-Fetch]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
[2019-07-01 10:56:14,851] INFO [ExpirationReaper-1-DeleteRecords]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
[2019-07-01 10:56:14,852] INFO [ExpirationReaper-1-ElectPreferredLeader]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper)
[2019-07-01 10:56:14,860] INFO [LogDirFailureHandler]: Starting (kafka.server.ReplicaManager$LogDirFailureHandler)
[2019-07-01 10:56:14,892] INFO Creating /brokers/ids/1 (is it secure? false) (kafka.zk.KafkaZkClient)
From the docs regarding ZooKeeper:
Stable version
The current stable branch is 3.4 and the latest release of that branch is 3.4.9.
Upgrading the ZooKeeper version to the latest 3.5.5 helped, and the Kafka broker started correctly.
It would have been great if the docs had stated the incompatibility with the previous ZooKeeper version.
PS: Answer added to help someone stuck with a similar issue because of the ZooKeeper version.
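To verify that the brokers actually registered after the upgrade, the /brokers/ids path can be listed from the ZooKeeper shell bundled with Kafka (hostname as in the question):
bin/zookeeper-shell.sh local1:2181 ls /brokers/ids
# once both brokers are up this should print something like [1, 2]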
I have set up a 3-node Kafka cluster and a 3-node ZooKeeper cluster on separate nodes. Using Kafka I can produce and consume messages successfully and run commands like kafka-topics.sh to get topic lists and their information from ZooKeeper, but there are some errors in the Kafka server.log file. The following warning appears continuously:
[2018-02-18 21:50:01,241] WARN Client session timed out, have not heard from server in 320190154ms for sessionid 0x161a94b101f0001 (org.apache.zookeeper.ClientCnxn)
[2018-02-18 21:50:01,242] INFO Client session timed out, have not heard from server in 320190154ms for sessionid 0x161a94b101f0001, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
[2018-02-18 21:50:01,343] INFO zookeeper state changed (Disconnected) (org.I0Itec.zkclient.ZkClient)
[2018-02-18 21:50:01,989] INFO Opening socket connection to server zookeeper3/192.168.1.206:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2018-02-18 21:50:02,008] INFO Socket connection established to zookeeper3/192.168.1.206:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2018-02-18 21:50:02,042] INFO Session establishment complete on server zookeeper3/192.168.1.206:2181, sessionid = 0x161a94b101f0001, negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn)
[2018-02-18 21:50:02,042] INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient)
[2018-02-18 21:59:31,570] INFO [Group Metadata Manager on Broker 102]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
It seems the Kafka sessions in ZooKeeper expire periodically!
The ZooKeeper logs contain the following warnings, too:
2018-02-18 18:20:06,149 [myid:1] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn#368] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x161a94b101f0001, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:239)
at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:203)
at java.lang.Thread.run(Thread.java:748)
2018-02-18 18:20:06,151 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn#1044] - Closed socket connection for client /192.168.1.203:43162 which had sessionid 0x161a94b101f0001
2018-02-18 18:20:06,781 [myid:1] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn#368] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x161a94b101f0002, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:239)
at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:203)
at java.lang.Thread.run(Thread.java:748)
2018-02-18 18:20:06,782 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn#1044] - Closed socket connection for client /192.168.1.201:45330 which had sessionid 0x161a94b101f0002
2018-02-18 18:37:29,127 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory#192] - Accepted socket connection from /192.168.1.202:52480
2018-02-18 18:37:29,139 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer#942] - Client attempting to establish new session at /192.168.1.202:52480
2018-02-18 18:37:29,143 [myid:1] - INFO [CommitProcessor:1:ZooKeeperServer#687] - Established session 0x161a94b101f0003 with negotiated timeout 30000 for client /192.168.1.202:52480
2018-02-18 18:37:29,432 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn#1044] - Closed socket connection for client /192.168.1.202:52480 which had sessionid 0x161a94b101f0003
I think it's because ZooKeeper can't get heartbeats from the Kafka nodes. The following is the ZooKeeper zoo.cfg:
tickTime=2000
dataDir=/var/zookeeper/
clientPort=2181
initLimit=5
syncLimit=2
server.1=zookeeper1:2888:3888
server.2=zookeeper2:2888:3888
server.3=zookeeper3:2888:3888
and these are the customized Kafka server.properties settings:
broker.id=1
listeners = PLAINTEXT://kafka1:9092
num.partitions=24
delete.topic.enable=true
default.replication.factor=2
log.dirs=/data/kafka/data
zookeeper.connect=zookeeper1:2181,zookeeper2:2181,zookeeper3:2181
log.retention.hours=168
I use the same ZooKeeper cluster for Hadoop HA without any problem. I think there is something wrong with the Kafka properties listeners and advertised.listeners; I read the Kafka documentation but couldn't fully understand their meaning.
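As far as I can tell from the docs, listeners is the address the broker binds to, while advertised.listeners is the address it publishes to ZooKeeper for clients and other brokers to use, so I would expect something like the following to be valid (kafka1 is just my own hostname), but I may be misreading it:
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://kafka1:9092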
In the hosts file of all machines, hostnames zookeeper1 to zookeeper3 and kafka1 to kafka3 are defined and reachable via ping. I removed the following lines from hosts:
127.0.0.1 localhost
127.0.1.1 hostname
I don't think that could be causing the problem.
Kafka version: 0.11
Zookeeper version: 3.4.10
Can anyone help?
We were facing a similar issue with Kafka. As @Soheil pointed out, it was due to a major GC running.
When a major GC runs, Kafka can sometimes fail to send its heartbeat to ZooKeeper in time. For us the major GC was running almost once every 15 seconds. On taking a heap dump, we realized it was due to a metrics memory leak in Kafka.
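A possible stopgap while tracking down such a leak (a common mitigation, not a guaranteed fix) is to give the broker more headroom before ZooKeeper expires its session, for example in server.properties:
# allow longer GC pauses before the ZooKeeper session is considered dead (default is 6000 ms in this Kafka version)
zookeeper.session.timeout.ms=18000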