Apache Storm Kafka spout only reading from half of a topic's partitions

A problem developed on our production Storm cluster that we cannot figure out or work around.
At some point the Kafka spout appears to have stopped reading from half of the topic's partitions. There are 40 partitions, and it is only reading from 20 of them. We cannot find any change we made to either the Storm cluster or Kafka around the time this started happening.
We changed the consumer group id and set the spout config's startOffsetTime to OffsetRequest.LatestTime to try to get it reading fresh data from all partitions. It still only connects to the same 20 partitions. We've looked at the node /<topic>/<consumer_group> inside Storm's ZooKeeper and see only 20 partitions there.
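For reference, the spout is wired up roughly like this (a minimal sketch of the storm-kafka configuration described above; the ZooKeeper connect string, zkRoot, and consumer group id are placeholders, the topic name is the one from the logs below):
import org.apache.storm.kafka.BrokerHosts;
import org.apache.storm.kafka.KafkaSpout;
import org.apache.storm.kafka.SpoutConfig;
import org.apache.storm.kafka.ZkHosts;

public class SpoutSetup {
    public static KafkaSpout buildKafkaSpout() {
        // Kafka's ZooKeeper quorum -- placeholder connect string
        BrokerHosts hosts = new ZkHosts("zk1:2181,zk2:2181,zk3:2181");
        // topic from the logs below; zkRoot and the (new) consumer group id are illustrative
        SpoutConfig spoutConfig = new SpoutConfig(hosts, "transcription-results", "/kafka-spout", "new-consumer-group");
        // start from the latest offsets rather than any previously stored state
        spoutConfig.startOffsetTime = kafka.api.OffsetRequest.LatestTime();
        return new KafkaSpout(spoutConfig);
    }
}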
We have verified that messages are being published to all 40 partitions.
Kafka version is 0.9.0.1, Storm version is 1.1.0.
Any tips on how to debug or where to look would be greatly appreciated. Did I mention that this is happening in production? Did I mention it started a week ago, and we just noticed this morning? :(
Additional info: we found some errors in the Kafka state change log (partition 9 is one of the affected partitions and the timestamp in the log looks to be about the time that the problem started)
kafka.common.NoReplicaOnlineException: No replica for partition
[transcription-results,9] is alive. Live brokers are: [Set()], Assigned replicas are: [List(1, 4, 0)]
[2018-03-14 03:11:40,863] TRACE Controller 0 epoch 44 changed state of replica 1 for partition [transcription-results,9] from OnlineReplica to OfflineReplica (state.change.logger)
[2018-03-14 03:11:41,141] TRACE Controller 0 epoch 44 sending become-follower LeaderAndIsr request (Leader:-1,ISR:0,4,LeaderEpoch:442,ControllerEpoch:44) to broker 4 for partition [transcription-results,9] (state.change.logger)
[2018-03-14 03:11:41,145] TRACE Controller 0 epoch 44 sending become-follower LeaderAndIsr request (Leader:-1,ISR:0,4,LeaderEpoch:442,ControllerEpoch:44) to broker 0 for partition [transcription-results,9] (state.change.logger)
[2018-03-14 03:11:41,208] TRACE Controller 0 epoch 44 changed state of replica 4 for partition [transcription-results,9] from OnlineReplica to OnlineReplica (state.change.logger)
[2018-03-14 03:11:41,218] TRACE Controller 0 epoch 44 changed state of replica 1 for partition [transcription-results,9] from OfflineReplica to OnlineReplica (state.change.logger)
[2018-03-14 03:11:41,226] TRACE Controller 0 epoch 44 sending become-follower LeaderAndIsr request (Leader:-1,ISR:0,4,LeaderEpoch:442,ControllerEpoch:44) to broker 4 for partition [transcription-results,9] (state.change.logger)
[2018-03-14 03:11:41,230] TRACE Controller 0 epoch 44 sending become-follower LeaderAndIsr request (Leader:-1,ISR:0,4,LeaderEpoch:442,ControllerEpoch:44) to broker 1 for partition [transcription-results,9] (state.change.logger)
[2018-03-14 03:11:41,450] TRACE Broker 0 received LeaderAndIsr request (LeaderAndIsrInfo:Leader:-1,ISR:0,4,LeaderEpoch:442,ControllerEpoch:44),ReplicationFactor:3),AllReplicas:1,4,0) correlation id 158 from controller 0 epoch 44 for partition [transcription-results,9] (state.change.logger)
[2018-03-14 03:11:41,454] TRACE Broker 0 handling LeaderAndIsr request correlationId 158 from controller 0 epoch 44 starting the become-follower transition for partition [transcription-results,9] (state.change.logger)
[2018-03-14 03:11:41,455] ERROR Broker 0 received LeaderAndIsrRequest with correlation id 158 from controller 0 epoch 44 for partition [transcription-results,9] but cannot become follower since the new leader -1 is unavailable. (state.change.logger)
//... removed some TRACE statements here
[2018-03-14 03:11:41,908] WARN Broker 0 ignoring LeaderAndIsr request from controller 1 with correlation id 1 epoch 47 for partition [transcription-results,9] since its associated leader epoch 441 is old. Current leader epoch is 441 (state.change.logger)
[2018-03-14 03:11:41,982] TRACE Broker 0 cached leader info (LeaderAndIsrInfo:Leader:1,ISR:0,1,4,LeaderEpoch:441,ControllerEpoch:44),ReplicationFactor:3),AllReplicas:1,4,0) for partition [transcription-results,9] in response to UpdateMetadata request sent by controller 1 epoch 47 with correlation id 2 (state.change.logger)
[2018-03-22 14:43:36,098] TRACE Broker 0 received LeaderAndIsr request (LeaderAndIsrInfo:Leader:-1,ISR:,LeaderEpoch:444,ControllerEpoch:47),ReplicationFactor:3),AllReplicas:1,4,0) correlation id 679 from controller 1 epoch 47 for partition [transcription-results,9] (state.change.logger)
Possibly caused by this bug: https://issues.apache.org/jira/browse/KAFKA-3963
But how can we recover from it?

I'd start by looking in Kafka's ZooKeeper under /brokers/topics to verify that all partitions are listed there. That's where storm-kafka reads the partitions from.
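If poking around with zkCli is awkward, the same check can be scripted; here is a minimal sketch with the plain ZooKeeper Java client (the connect string is a placeholder; the topic name is taken from the logs above). A healthy 40-partition topic should list partition ids 0..39 under this path, which should match what storm-kafka's partition discovery sees:
import org.apache.zookeeper.ZooKeeper;
import java.util.List;

public class PartitionCheck {
    public static void main(String[] args) throws Exception {
        // Kafka's ZooKeeper ensemble (placeholder connect string), not Storm's ZooKeeper
        ZooKeeper zk = new ZooKeeper("kafka-zk:2181", 10000, event -> { });
        // Partition ids currently registered for the topic
        List<String> partitions = zk.getChildren("/brokers/topics/transcription-results/partitions", false);
        System.out.println("Partitions registered in ZooKeeper: " + partitions.size());
        partitions.stream().sorted().forEach(System.out::println);
        zk.close();
    }
}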

Related

Kafka: How to deal with a corrupted __consumer_offsets topic?

I'm running a 3-node Kafka/ZooKeeper cluster (with Logstash as a consumer). This error is spammed in the logs on all 3 nodes:
org.apache.kafka.common.errors.CorruptRecordException: Record size 0 is less than the minimum record overhead (14)
[2022-07-06 13:42:59,538] ERROR [ReplicaFetcher replicaId=23, leaderId=24, fetcherId=0] Found invalid messages during fetch for partition __consumer_offsets-3 offset 169799163 (kafka.server.ReplicaFetcherThread)
org.apache.kafka.common.errors.CorruptRecordException: Record size 0 is less than the minimum record overhead (14)
[2022-07-06 13:42:59,538] ERROR [ReplicaFetcher replicaId=23, leaderId=24, fetcherId=0] Found invalid messages during fetch for partition __consumer_offsets-20 offset 124408988 (kafka.server.ReplicaFetcherThread)
org.apache.kafka.common.errors.CorruptRecordException: Record size 0 is less than the minimum record overhead (14)
[2022-07-06 13:43:00,694] ERROR [ReplicaFetcher replicaId=23, leaderId=24, fetcherId=0] Found invalid messages during fetch for partition __consumer_offsets-9 offset 171772074 (kafka.server.ReplicaFetcherThread)
These 3 offsets are always the same and never change. It seems like partitions 3, 9, and 20 are stuck indefinitely because of bad offsets.
Additional info: __consumer_offsets is set to 2 replicas and 50 partitions in total.
Any idea how I can fix this? Restarting Kafka/the cluster does not do anything.
Thanks in advance!

MySQL table record not being consumed by Kafka

I just started learning Kafka and am running Kafka 2.13-2.8.0 on Windows Server 2012 R2. I started ZooKeeper using the following:
zookeeper-server-start.bat ../../config/zookeeper.properties
I started kafka using the following:
kafka-server-start.bat ../../config/server.properties
I started a connector with the following:
connect-standalone.bat ../../config/connect-standalone.properties ../../config/mysql.properties
The content of my mysql.properties file is as follows:
name=test-source-mysql-jdbc-autoincrement
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=1
connection.url=jdbc:mysql://127.0.0.1:3306/DBName?user=username&password=userpassword
mode=incrementing
incrementing.column.name=id
topic.prefix=test-mysql-jdbc-
I started a consumer with and without a partition option:
kafka-console-consumer.bat -topic test-mysql-jdbc-groups -bootstrap-server localhost:9092 -from-beginning [-partition 0]
All seemingly started without issues, but when I add a record to my MySQL table called groups, I do not see it in my consumer. I checked all the various logs. The only error messages I saw were in state-change.log, and they looked like the following:
ERROR [Broker id=0] Ignoring StopReplica request (delete=true) from controller 0 with correlation id 5 epoch 1 for partition mytopic-2 as the local replica for the partition is in an offline log directory (state.change.logger)
ERROR [Broker id=0] Ignoring StopReplica request (delete=true) from controller 0 with correlation id 5 epoch 1 for partition mytopic-1 as the local replica for the partition is in an offline log directory (state.change.logger)
ERROR [Broker id=0] Ignoring StopReplica request (delete=true) from controller 0 with correlation id 5 epoch 1 for partition mytopic-0 as the local replica for the partition is in an offline log directory (state.change.logger)
ERROR [Broker id=0] Received LeaderAndIsrRequest with correlation id 1 from controller 0 epoch 2 for partition mytopic-0 (last update controller epoch 1) but cannot become follower since the new leader -1 is unavailable. (state.change.logger)
ERROR [Broker id=0] Received LeaderAndIsrRequest with correlation id 1 from controller 0 epoch 2 for partition mytopic-1 (last update controller epoch 1) but cannot become follower since the new leader -1 is unavailable. (state.change.logger)
ERROR [Broker id=0] Received LeaderAndIsrRequest with correlation id 1 from controller 0 epoch 2 for partition mytopic-2 (last update controller epoch 1) but cannot become follower since the new leader -1 is unavailable. (state.change.logger)
I also noticed this message in ZooKeeper:
INFO Expiring session timeout of exceeded (org.apache.zookeeper.server.ZooKeeperServer)
Could anyone please give me pointers as to what I could be doing wrong? Thanks
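One quick sanity check is whether the connector's output topic actually exists and every partition has a live leader. Here is a minimal sketch with the Java AdminClient (the bootstrap address and topic name are taken from the commands above); a missing or leaderless partition here would line up with the "new leader -1 is unavailable" errors in state-change.log:
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.TopicDescription;
import java.util.Collections;
import java.util.Properties;

public class TopicCheck {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            TopicDescription desc = admin
                .describeTopics(Collections.singletonList("test-mysql-jdbc-groups"))
                .values().get("test-mysql-jdbc-groups").get();
            // Print the leader of each partition; an empty/none leader indicates an offline partition
            desc.partitions().forEach(p ->
                System.out.println("partition " + p.partition() + " leader=" + p.leader()));
        }
    }
}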

Missing messages on Kafka's compacted topic

I have a topic that is compacted:
/opt/kafka/bin/kafka-topics.sh --zookeeper localhost --describe --topic myTopic
Topic:myTopic PartitionCount:1 ReplicationFactor:1 Configs:cleanup.policy=compact
There are no messages on it:
/opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic myTopic --from-beginning --property print.key=true
^CProcessed a total of 0 messages
Both the earliest and latest offsets on the only partition are 12, though.
/opt/kafka/bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic myTopic --time -2
myTopic:0:12
/opt/kafka/bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic myTopic --time -1
myTopic:0:12
I wonder what could have happened to these 12 messages. The count is correct; I was expecting them to be there, but for some reason they're gone.
As far as I understand, even if all 12 messages had the same key, I should have seen at least one - that's how compaction works.
The topic in question was created as compacted. The only weird thing that might have happened during that time is that the Kafka instance lost its ZooKeeper data completely. Is it possible that this also caused the data loss?
To rephrase the last question: can something bad happen to the physical data on Kafka if I remove all the Kafka-related znodes from ZooKeeper?
In addition, here are some logs from Kafka startup.
[2019-04-30 12:02:16,510] WARN [Log partition=myTopic-0, dir=/var/lib/kafka] Found a corrupted index file corresponding to log file /var/lib/kafka/myTopic-0/00000000000000000000.log due to Corrupt index found, index file (/var/lib/kafka/myTopic-0/00000000000000000000.index) has non-zero size but the last offset is 0 which is no greater than the base offset 0.}, recovering segment and rebuilding index files... (kafka.log.Log)
[2019-04-30 12:02:16,524] INFO [Log partition=myTopic-0, dir=/var/lib/kafka] Completed load of log with 1 segments, log start offset 0 and log end offset 12 in 16 ms (kafka.log.Log)
[2019-04-30 12:35:34,530] INFO Got user-level KeeperException when processing sessionid:0x16a6e1ea2000001 type:setData cxid:0x1406 zxid:0xd11 txntype:-1 reqpath:n/a Error Path:/config/topics/myTopic Error:KeeperErrorCode = NoNode for /config/topics/myTopic (org.apache.zookeeper.server.PrepRequestProcessor)
[2019-04-30 12:35:34,535] INFO Topic creation Map(myTopic-0 -> ArrayBuffer(0)) (kafka.zk.AdminZkClient)
[2019-04-30 12:35:34,547] INFO [ReplicaFetcherManager on broker 0] Removed fetcher for partitions myTopic-0
(kafka.server.ReplicaFetcherManager)
[2019-04-30 12:35:34,580] INFO [Partition myTopic-0 broker=0] No checkpointed highwatermark is found for partition myTopic-0 (kafka.cluster.Partition)
[2019-04-30 12:35:34,580] INFO Replica loaded for partition myTopic-0 with initial high watermark 0 (kafka.cluster.Replica)
[2019-04-30 12:35:34,580] INFO [Partition myTopic-0 broker=0] myTopic-0 starts at Leader Epoch 0 from offset 12. Previous Leader Epoch was: -1 (kafka.cluster.Partition)
And the messages were indeed removed:
[2019-04-30 12:39:24,199] INFO [Log partition=myTopic-0, dir=/var/lib/kafka] Found deletable segments with base offsets [0] due to retention time 10800000ms breach (kafka.log.Log)
[2019-04-30 12:39:24,201] INFO [Log partition=myTopic-0, dir=/var/lib/kafka] Rolled new log segment at offset 12 in 2 ms. (kafka.log.Log)
NoNode for /config/topics/myTopic
Kafka no longer knows this topic exists or that it should be compacted, which seems evident from the log cleanup messages:
due to retention time 10800000ms breach
So yes, ZooKeeper is very important. But so is gracefully shutting down a broker with kafka-server-stop; otherwise, forcibly killing the process or the host machine can leave you with corrupted partition segments.
I'm not entirely sure what conditions would lead to this:
the last offset is 0 which is no greater than the base offset 0
But assuming you had a full cluster and the topic had a replication factor higher than 1, you could hope that at least one replica was still healthy.
The way to recover a broker with a corrupted index/partition would be to stop the Kafka process, delete the corrupted partition folder from disk, restart Kafka on that machine, and then let it replicate back from a healthy instance.
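Note that this only helps going forward; the 12 deleted messages are not recoverable from this broker. Once the topic is known to ZooKeeper again, you would also need to put the compaction setting back, since the broker fell back to the default delete policy. A minimal sketch with the Java AdminClient, assuming a kafka-clients version that has incrementalAlterConfigs (2.3 or newer) and a broker on localhost:9092:
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;
import java.util.Collection;
import java.util.Collections;
import java.util.Map;
import java.util.Properties;

public class RestoreCompaction {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            // Re-apply cleanup.policy=compact on the topic whose config was lost with ZooKeeper
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "myTopic");
            AlterConfigOp setCompact = new AlterConfigOp(
                new ConfigEntry("cleanup.policy", "compact"), AlterConfigOp.OpType.SET);
            Map<ConfigResource, Collection<AlterConfigOp>> updates =
                Collections.singletonMap(topic, Collections.singletonList(setCompact));
            admin.incrementalAlterConfigs(updates).all().get();
        }
    }
}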

Kafka on Kubernetes - Clients are unable to retrieve metadata

I have a Kafka cluster running on Kubernetes along with ZooKeeper on Kubernetes. As outlined in this answer, I have configured the internal broker port as well as the advertised external port for the clients:
listener.security.protocol.map=PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
inter.broker.listener.name=PLAINTEXT
listeners=PLAINTEXT://:29092,PLAINTEXT_HOST://0.0.0.0:9093
advertised.listeners=PLAINTEXT://:29092,PLAINTEXT_HOST://{EXTERNAL-IP-ADDRESS}:9093
zookeeper.connect=zk-cs.analytics.svc:2181
I expect the inter-broker communication to happen on 29092. External clients should be able to connect on port 9093.
I have one external IP for the entire Kubernetes service, which means that this is the only external IP that should be exposed from the Kafka brokers. As far as I understand, the Kubernetes load balancer will route any request to this IP to one of my brokers.
I have validated that my Kafka brokers registered correctly in ZooKeeper:
get /brokers/ids/0
{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT","PLAINTEXT_HOST":"PLAINTEXT"},"endpoints":["PLAINTEXT://kafka-0.kafka-hs.analytics.svc.cluster.local:29092","PLAINTEXT_HOST://{EXTERNAL-IP-ADDRESS}"],"jmx_port":-1,"host":"kafka-0.kafka-hs.analytics.svc.cluster.local","timestamp":"1525689391350","port":29092,"version":4}
cZxid = 0x90000029f
ctime = Mon May 07 12:36:31 CEST 2018
mZxid = 0x90000029f
mtime = Mon May 07 12:36:31 CEST 2018
pZxid = 0x90000029f
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x1632acfab520009
dataLength = 344
numChildren = 0
Creating a topic looks good in the logs to me; the logs are below.
Primary:
[2018-05-07 10:41:12,760] DEBUG [TopicChangeListener on Controller 0]: Topic change listener fired for path /brokers/topics with children test-topic (kafka.controller.PartitionStateMachine$TopicChangeListener)
[2018-05-07 10:41:12,767] INFO [TopicChangeListener on Controller 0]: New topics: [Set(test-topic)], deleted topics: [Set()], new partition replica assignment [Map([test-topic,0] -> List(0, 1))] (kafka.controller.PartitionStateMachine$TopicChangeListener)
[2018-05-07 10:41:12,768] INFO [Controller 0]: New topic creation callback for [test-topic,0] (kafka.controller.KafkaController)
[2018-05-07 10:41:12,770] INFO [Controller 0]: New partition creation callback for [test-topic,0] (kafka.controller.KafkaController)
[2018-05-07 10:41:12,771] INFO [Partition state machine on Controller 0]: Invoking state change to NewPartition for partitions [test-topic,0] (kafka.controller.PartitionStateMachine)
[2018-05-07 10:41:12,772] TRACE Controller 0 epoch 12 changed partition [test-topic,0] state from NonExistentPartition to NewPartition with assigned replicas 0,1 (state.change.logger)
[2018-05-07 10:41:12,774] INFO [Replica state machine on controller 0]: Invoking state change to NewReplica for replicas [Topic=test-topic,Partition=0,Replica=0],[Topic=test-topic,Partition=0,Replica=1] (kafka.controller.ReplicaStateMachine)
[2018-05-07 10:41:12,778] TRACE Controller 0 epoch 12 changed state of replica 0 for partition [test-topic,0] from NonExistentReplica to NewReplica (state.change.logger)
[2018-05-07 10:41:12,779] TRACE Controller 0 epoch 12 changed state of replica 1 for partition [test-topic,0] from NonExistentReplica to NewReplica (state.change.logger)
[2018-05-07 10:41:12,779] INFO [Partition state machine on Controller 0]: Invoking state change to OnlinePartition for partitions [test-topic,0] (kafka.controller.PartitionStateMachine)
[2018-05-07 10:41:12,780] DEBUG [Partition state machine on Controller 0]: Live assigned replicas for partition [test-topic,0] are: [List(0, 1)] (kafka.controller.PartitionStateMachine)
[2018-05-07 10:41:12,782] DEBUG [Partition state machine on Controller 0]: Initializing leader and isr for partition [test-topic,0] to (Leader:0,ISR:0,1,LeaderEpoch:0,ControllerEpoch:12) (kafka.controller.PartitionStateMachine)
[2018-05-07 10:41:12,805] TRACE Controller 0 epoch 12 changed partition [test-topic,0] from NewPartition to OnlinePartition with leader 0 (state.change.logger)
[2018-05-07 10:41:12,806] TRACE Controller 0 epoch 12 sending become-follower LeaderAndIsr request (Leader:0,ISR:0,1,LeaderEpoch:0,ControllerEpoch:12) to broker 1 for partition [test-topic,0] (state.change.logger)
[2018-05-07 10:41:12,809] TRACE Controller 0 epoch 12 sending become-leader LeaderAndIsr request (Leader:0,ISR:0,1,LeaderEpoch:0,ControllerEpoch:12) to broker 0 for partition [test-topic,0] (state.change.logger)
[2018-05-07 10:41:12,810] TRACE Controller 0 epoch 12 sending UpdateMetadata request (Leader:0,ISR:0,1,LeaderEpoch:0,ControllerEpoch:12) to brokers Set(0, 1, 2, 3, 4) for partition test-topic-0 (state.change.logger)
[2018-05-07 10:41:12,811] INFO [Replica state machine on controller 0]: Invoking state change to OnlineReplica for replicas [Topic=test-topic,Partition=0,Replica=0],[Topic=test-topic,Partition=0,Replica=1] (kafka.controller.ReplicaStateMachine)
[2018-05-07 10:41:12,812] TRACE Controller 0 epoch 12 changed state of replica 0 for partition [test-topic,0] from NewReplica to OnlineReplica (state.change.logger)
[2018-05-07 10:41:12,813] TRACE Controller 0 epoch 12 changed state of replica 1 for partition [test-topic,0] from NewReplica to OnlineReplica (state.change.logger)
[2018-05-07 10:41:12,813] TRACE Broker 0 received LeaderAndIsr request PartitionState(controllerEpoch=12, leader=0, leaderEpoch=0, isr=[0, 1], zkVersion=0, replicas=[0, 1]) correlation id 5 from controller 0 epoch 12 for partition [test-topic,0] (state.change.logger)
[2018-05-07 10:41:12,813] TRACE Broker 0 received LeaderAndIsr request PartitionState(controllerEpoch=12, leader=0, leaderEpoch=0, isr=[0, 1], zkVersion=0, replicas=[0, 1]) correlation id 4 from controller 0 epoch 12 for partition [test-topic,0] (state.change.logger)
[2018-05-07 10:41:12,816] TRACE Broker 0 cached leader info (LeaderAndIsrInfo:(Leader:0,ISR:0,1,LeaderEpoch:0,ControllerEpoch:12),ReplicationFactor:2),AllReplicas:0,1) for partition test-topic-0 in response to UpdateMetadata request sent by controller 0 epoch 12 with correlation id 2 (state.change.logger)
[2018-05-07 10:41:12,817] TRACE Controller 0 epoch 12 received response {error_code=0} for a request sent to broker kafka-2.kafka-hs.analytics.svc.cluster.local:29092 (id: 2 rack: null) (state.change.logger)
[2018-05-07 10:41:12,823] TRACE Controller 0 epoch 12 received response {error_code=0} for a request sent to broker kafka-3.kafka-hs.analytics.svc.cluster.local:29092 (id: 3 rack: null) (state.change.logger)
[2018-05-07 10:41:12,823] TRACE Controller 0 epoch 12 received response {error_code=0} for a request sent to broker kafka-4.kafka-hs.analytics.svc.cluster.local:29092 (id: 4 rack: null) (state.change.logger)
[2018-05-07 10:41:12,827] TRACE Broker 0 handling LeaderAndIsr request correlationId 4 from controller 0 epoch 12 starting the become-leader transition for partition test-topic-0 (state.change.logger)
[2018-05-07 10:41:12,828] INFO [ReplicaFetcherManager on broker 0] Removed fetcher for partitions test-topic-0 (kafka.server.ReplicaFetcherManager)
[2018-05-07 10:41:12,852] INFO Completed load of log test-topic-0 with 1 log segments and log end offset 0 in 17 ms (kafka.log.Log)
[2018-05-07 10:41:12,853] INFO Created log for partition [test-topic,0] in /tmp/kafka-logs with properties {compression.type -> producer, message.format.version -> 0.10.2-IV0, file.delete.delay.ms -> 60000, max.message.bytes -> 1000012, min.compaction.lag.ms -> 0, message.timestamp.type -> CreateTime, min.insync.replicas -> 1, segment.jitter.ms -> 0, preallocate -> false, min.cleanable.dirty.ratio -> 0.5, index.interval.bytes -> 4096, unclean.leader.election.enable -> true, retention.bytes -> -1, delete.retention.ms -> 86400000, cleanup.policy -> [delete], flush.ms -> 9223372036854775807, segment.ms -> 604800000, segment.bytes -> 1073741824, retention.ms -> 604800000, message.timestamp.difference.max.ms -> 9223372036854775807, segment.index.bytes -> 10485760, flush.messages -> 9223372036854775807}. (kafka.log.LogManager)
[2018-05-07 10:41:12,853] INFO Partition [test-topic,0] on broker 0: No checkpointed highwatermark is found for partition test-topic-0 (kafka.cluster.Partition)
[2018-05-07 10:41:12,861] TRACE Broker 0 stopped fetchers as part of become-leader request from controller 0 epoch 12 with correlation id 4 for partition test-topic-0 (state.change.logger)
[2018-05-07 10:41:12,861] TRACE Broker 0 completed LeaderAndIsr request correlationId 4 from controller 0 epoch 12 for the become-leader transition for partition test-topic-0 (state.change.logger)
[2018-05-07 10:41:12,864] WARN Broker 0 ignoring LeaderAndIsr request from controller 0 with correlation id 5 epoch 12 for partition [test-topic,0] since its associated leader epoch 0 is not higher than the current leader epoch 0 (state.change.logger)
[2018-05-07 10:41:12,865] TRACE Controller 0 epoch 12 received response {error_code=0,partitions=[{topic=test-topic,partition=0,error_code=11}]} for a request sent to broker kafka-0.kafka-hs.analytics.svc.cluster.local:29092 (id: 0 rack: null) (state.change.logger)
[2018-05-07 10:41:12,865] TRACE Controller 0 epoch 12 received response {error_code=0,partitions=[{topic=test-topic,partition=0,error_code=0}]} for a request sent to broker kafka-1.kafka-hs.analytics.svc.cluster.local:29092 (id: 1 rack: null) (state.change.logger)
[2018-05-07 10:41:12,867] TRACE Broker 0 cached leader info (LeaderAndIsrInfo:(Leader:0,ISR:0,1,LeaderEpoch:0,ControllerEpoch:12),ReplicationFactor:2),AllReplicas:0,1) for partition test-topic-0 in response to UpdateMetadata request sent by controller 0 epoch 12 with correlation id 6 (state.change.logger)
[2018-05-07 10:41:12,867] TRACE Controller 0 epoch 12 received response {error_code=0} for a request sent to broker kafka-0.kafka-hs.analytics.svc.cluster.local:29092 (id: 0 rack: null) (state.change.logger)
[2018-05-07 10:41:12,867] TRACE Broker 0 cached leader info (LeaderAndIsrInfo:(Leader:0,ISR:0,1,LeaderEpoch:0,ControllerEpoch:12),ReplicationFactor:2),AllReplicas:0,1) for partition test-topic-0 in response to UpdateMetadata request sent by controller 0 epoch 12 with correlation id 5 (state.change.logger)
[2018-05-07 10:41:12,868] TRACE Controller 0 epoch 12 received response {error_code=0} for a request sent to broker kafka-1.kafka-hs.analytics.svc.cluster.local:29092 (id: 1 rack: null) (state.change.logger)
[2018-05-07 10:41:26,213] INFO Partition [test-topic,0] on broker 0: Shrinking ISR for partition [test-topic,0] from 0,1 to 0 (kafka.cluster.Partition)
[2018-05-07 10:41:28,721] DEBUG [IsrChangeNotificationListener on Controller 0]: ISR change notification listener fired (kafka.controller.IsrChangeNotificationListener)
[2018-05-07 10:41:28,735] DEBUG [IsrChangeNotificationListener on Controller 0]: Sending MetadataRequest to Brokers:ArrayBuffer(0, 1, 2, 3, 4) for TopicAndPartitions:Set([test-topic,0], [__consumer_offsets,30], [__consumer_offsets,6]) (kafka.controller.IsrChangeNotificationListener)
[2018-05-07 10:41:28,735] INFO Leader not yet assigned for partition [__consumer_offsets,30]. Skip sending UpdateMetadataRequest. (kafka.controller.ControllerBrokerRequestBatch)
[2018-05-07 10:41:28,735] INFO Leader not yet assigned for partition [__consumer_offsets,6]. Skip sending UpdateMetadataRequest. (kafka.controller.ControllerBrokerRequestBatch)
[2018-05-07 10:41:28,735] TRACE Controller 0 epoch 12 sending UpdateMetadata request (Leader:0,ISR:0,LeaderEpoch:0,ControllerEpoch:12) to brokers Set(0, 1, 2, 3, 4) for partition test-topic-0 (state.change.logger)
[2018-05-07 10:41:28,739] TRACE Broker 0 cached leader info (LeaderAndIsrInfo:(Leader:0,ISR:0,LeaderEpoch:0,ControllerEpoch:12),ReplicationFactor:2),AllReplicas:0,1) for partition test-topic-0 in response to UpdateMetadata request sent by controller 0 epoch 12 with correlation id 6 (state.change.logger)
[2018-05-07 10:41:28,739] TRACE Broker 0 cached leader info (LeaderAndIsrInfo:(Leader:0,ISR:0,LeaderEpoch:0,ControllerEpoch:12),ReplicationFactor:2),AllReplicas:0,1) for partition test-topic-0 in response to UpdateMetadata request sent by controller 0 epoch 12 with correlation id 3 (state.change.logger)
[2018-05-07 10:41:28,739] TRACE Controller 0 epoch 12 received response {error_code=0} for a request sent to broker kafka-1.kafka-hs.analytics.svc.cluster.local:29092 (id: 1 rack: null) (state.change.logger)
[2018-05-07 10:41:28,740] TRACE Broker 0 cached leader info (LeaderAndIsrInfo:(Leader:0,ISR:0,LeaderEpoch:0,ControllerEpoch:12),ReplicationFactor:2),AllReplicas:0,1) for partition test-topic-0 in response to UpdateMetadata request sent by controller 0 epoch 12 with correlation id 7 (state.change.logger)
[2018-05-07 10:41:28,740] TRACE Controller 0 epoch 12 received response {error_code=0} for a request sent to broker kafka-2.kafka-hs.analytics.svc.cluster.local:29092 (id: 2 rack: null) (state.change.logger)
[2018-05-07 10:41:28,740] TRACE Controller 0 epoch 12 received response {error_code=0} for a request sent to broker kafka-3.kafka-hs.analytics.svc.cluster.local:29092 (id: 3 rack: null) (state.change.logger)
[2018-05-07 10:41:28,740] TRACE Controller 0 epoch 12 received response {error_code=0} for a request sent to broker kafka-0.kafka-hs.analytics.svc.cluster.local:29092 (id: 0 rack: null) (state.change.logger)
[2018-05-07 10:41:28,741] TRACE Controller 0 epoch 12 received response {error_code=0} for a request sent to broker kafka-4.kafka-hs.analytics.svc.cluster.local:29092 (id: 4 rack: null) (state.change.logger)
[2018-05-07 10:41:28,746] DEBUG [IsrChangeNotificationListener on Controller 0]: ISR change notification listener fired (kafka.controller.IsrChangeNotificationListener)
[2018-05-07 10:41:36,297] TRACE [Controller 0]: checking need to trigger partition rebalance (kafka.controller.KafkaController)
[2018-05-07 10:41:36,298] DEBUG [Controller 0]: preferred replicas by broker Map(0 -> Map([test-topic,0] -> List(0, 1))) (kafka.controller.KafkaController)
[2018-05-07 10:41:36,302] DEBUG [Controller 0]: topics not in preferred replica Map() (kafka.controller.KafkaController)
[2018-05-07 10:41:36,303] TRACE [Controller 0]: leader imbalance ratio for broker 0 is 0.000000 (kafka.controller.KafkaController)
Replica #1:
[2018-05-07 10:41:12,822] TRACE Broker 1 cached leader info (LeaderAndIsrInfo:(Leader:0,ISR:0,1,LeaderEpoch:0,ControllerEpoch:12),ReplicationFactor:2),AllReplicas:0,1) for partition test-topic-0 in response to UpdateMetadata request sent by controller 0 epoch 12 with correlation id 3 (state.change.logger)
[2018-05-07 10:41:28,739] TRACE Broker 1 cached leader info (LeaderAndIsrInfo:(Leader:0,ISR:0,LeaderEpoch:0,ControllerEpoch:12),ReplicationFactor:2),AllReplicas:0,1) for partition test-topic-0 in response to UpdateMetadata request sent by controller 0 epoch 12 with correlation id 4 (state.change.logger)
Replica #2:
[2018-05-07 10:41:12,823] TRACE Broker 2 cached leader info (LeaderAndIsrInfo:(Leader:0,ISR:0,1,LeaderEpoch:0,ControllerEpoch:12),ReplicationFactor:2),AllReplicas:0,1) for partition test-topic-0 in response to UpdateMetadata request sent by controller 0 epoch 12 with correlation id 1 (state.change.logger)
[2018-05-07 10:41:28,740] TRACE Broker 2 cached leader info (LeaderAndIsrInfo:(Leader:0,ISR:0,LeaderEpoch:0,ControllerEpoch:12),ReplicationFactor:2),AllReplicas:0,1) for partition test-topic-0 in response to UpdateMetadata request sent by controller 0 epoch 12 with correlation id 2 (state.change.logger)
However, whenever I connect with the console producer to the cluster, I get the following error:
.\kafka-console-producer.bat --broker-list {EXTERNAL-IP-ADDRESS}:9093 --topic test-topic --property parse.key=true --property key.separator=:
>testKey:23487239847237894asduhzdfhzusfhhsdf
[2018-05-07 12:42:58,395] WARN [Producer clientId=console-producer] Error while fetching metadata with correlation id 2 : {test-topic=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
[2018-05-07 12:42:58,512] WARN [Producer clientId=console-producer] Error while fetching metadata with correlation id 3 : {test-topic=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
[2018-05-07 12:42:58,641] WARN [Producer clientId=console-producer] Error while fetching metadata with correlation id 4 : {test-topic=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
[2018-05-07 12:42:58,765] WARN [Producer clientId=console-producer] Error while fetching metadata with correlation id 5 : {test-topic=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
[2018-05-07 12:42:58,886] WARN [Producer clientId=console-producer] Error while fetching metadata with correlation id 6 : {test-topic=LEADER_NOT_AVAILABLE} (org.apache.kafka.clients.NetworkClient)
Is it a problem that a Kubernetes Service usually exposes one external IP address and all of my Kafka brokers are advertising this IP? Are there solutions to this?
I expect the inter-broker communication to happen on 29092.
Yes, they use 29092 for internal communication.
External clients should be able to connect on port 9093. I have one external IP for the entire Kubernetes service, which means that this is the only external IP that should be exposed from the Kafka brokers. As far as I understand, the Kubernetes load balancer will route any request to this IP to one of my brokers.
Yes, Kubernetes will route all traffic from that service to one of your brokers and that is a problem.
Internally, you use a headless Service to discover the addresses of your Kafka brokers, so they are available by DNS names like kafka-[NUM_OF_THE_REPLICA].[SERVICE_NAME], and that works without any problems.
For access from outside the cluster, you need to expose each replica on a different address or port. But you have only one Service, which load-balances requests across the brokers.
To fix it, you should create a separate Service for each replica and use their external addresses as the EXTERNAL-IP-ADDRESS values in your configuration.
Here is an example from an issue in the GitHub repo that provides a Kafka cluster configuration for Kubernetes:
---
apiVersion: v1
kind: Service
metadata:
  name: kafka-es-0
spec:
  ports:
  - port: 9092
    name: kafka-port
    protocol: TCP
  selector:
    pod-name: kafka-0
  type: LoadBalancer
---
apiVersion: v1
kind: Service
metadata:
  name: kafka-es-1
spec:
  ports:
  - port: 9092
    name: kafka-port
    protocol: TCP
  selector:
    pod-name: kafka-1
  type: LoadBalancer

Kafka broker constantly ISR shrinking and expanding?

We have a cluster of 4 nodes in production. We observed that one of the nodes ran into a situation where it constantly shrank and expanded the ISR for more than 1 hour and was unable to recover until the broker was bounced.
[2017-02-21 14:52:16,518] INFO Partition [skynet-large-stage,5] on broker 0: Shrinking ISR for partition [skynet-large-stage,5] from 2,0 to 0 (kafka.cluster.Partition)
[2017-02-21 14:52:16,543] INFO Partition [skynet-large-stage,37] on broker 0: Shrinking ISR for partition [skynet-large-stage,37] from 1,0 to 0 (kafka.cluster.Partition)
[2017-02-21 14:52:16,544] INFO Partition [skynet-large-stage,13] on broker 0: Shrinking ISR for partition [skynet-large-stage,13] from 1,0 to 0 (kafka.cluster.Partition)
[2017-02-21 14:52:16,545] INFO Partition [__consumer_offsets,46] on broker 0: Shrinking ISR for partition [__consumer_offsets,46] from 3,2,0 to 3,0 (kafka.cluster.Partition)
.
.
I'd like to know what would cause this issue and why the broken broker was not kicked out of ISR.
Kafka version is 0.10.1.0
There was a bug (KAFKA-4477) that got fixed, but in general I've seen this same problem when Kafka brokers time out talking to a ZooKeeper node (the default timeout is 6000 ms) due to some transient network blip, at which point they get kicked out of the cluster, partition leadership changes, clients have to rebalance, etc. For high-volume clusters, it's a pain.
Simply increasing this timeout has helped me several times before:
zookeeper.session.timeout.ms
The default value according to the official docs is 6000 ms. I found that simply increasing it to 15000 ms made the cluster rock solid.
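For reference, this is a broker-side setting, so it goes into each broker's server.properties (shown here with the 15000 ms value mentioned above):
# server.properties on each broker -- session timeout towards ZooKeeper
zookeeper.session.timeout.ms=15000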
Documentation for Kafka 0.11.0: https://kafka.apache.org/0110/documentation.html