Consumer group can't rebalance - apache-kafka

I am learning Kafka and I'm having some problems with the example code I've found here: https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example
Each time I run the code, it throws this exception:
Exception in thread "main" kafka.common.ConsumerRebalanceFailedException: ttt_NB644-1475151991986-76dfa03f can't rebalance after 4 retries
at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(ZookeeperConsumerConnector.scala:670)
at kafka.consumer.ZookeeperConsumerConnector.kafka$consumer$ZookeeperConsumerConnector$$reinitializeConsumer(ZookeeperConsumerConnector.scala:977)
at kafka.consumer.ZookeeperConsumerConnector.consume(ZookeeperConsumerConnector.scala:264)
at kafka.javaapi.consumer.ZookeeperConsumerConnector.createMessageStreams(ZookeeperConsumerConnector.scala:85)
at kafka.javaapi.consumer.ZookeeperConsumerConnector.createMessageStreams(ZookeeperConsumerConnector.scala:97)
at com.glowbyte.kafka.consumertest.ConsumerGroupExample.run(ConsumerGroupExample.java:44)
at com.glowbyte.kafka.consumertest.ConsumerGroupExample.main(ConsumerGroupExample.java:78)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
The Kafka version I'm using is 0.10, the latest one.
There is only one topic with one broker and two partitions, and I'm trying to run the code with 2 threads.
Meanwhile, another, simpler piece of code runs successfully in the same environment, also with 2 threads. So I'd like to understand what's causing the described exception. Thanks.
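For reference, here is a minimal sketch of the configuration that feeds the old high-level consumer used by that example, assuming the property names from the wiki page plus the two rebalance settings (values are illustrative, not a known fix). rebalance.max.retries defaults to 4, which matches the "4 retries" in the exception, and the usual advice is that rebalance.backoff.ms * rebalance.max.retries should exceed zookeeper.session.timeout.ms:

import java.util.Properties;

import kafka.consumer.ConsumerConfig;

public class ConsumerGroupConfigSketch {

    // Sketch only: zookeeperConnect and groupId are placeholders.
    public static ConsumerConfig createConsumerConfig(String zookeeperConnect, String groupId) {
        Properties props = new Properties();
        props.put("zookeeper.connect", zookeeperConnect);
        props.put("group.id", groupId);
        props.put("zookeeper.session.timeout.ms", "400");
        props.put("zookeeper.sync.time.ms", "200");
        props.put("auto.commit.interval.ms", "1000");
        // The retry loop that throws ConsumerRebalanceFailedException is governed by
        // these two settings (illustrative values):
        props.put("rebalance.max.retries", "10");
        props.put("rebalance.backoff.ms", "2000");
        return new ConsumerConfig(props);
    }
}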

Related

Kafka CommitFailedException occurs when a breakpoint is used on a method annotated with @KafkaListener

I do not understand why I get the CommitFailedException when I use a breakpoint on a method annotated with @KafkaListener.
I know for sure that I am not exceeding the consumer's metadata.max.age.ms = 300000, and furthermore I am using max.poll.records = 1.
It seems like the heartbeat thread is timing out, but my understanding is that the heartbeat thread is independent from the poll thread.
I see that the following condition is true in the AbstractCoordinator class, so the markCoordinatorUnknown() method is executed:
else if (AbstractCoordinator.this.heartbeat.sessionTimeoutExpired(now)) {
    AbstractCoordinator.this.markCoordinatorUnknown();
}
I am using Spring Boot 2.3.5, which comes with kafka-clients 2.5.1, and IntelliJ 2019.
Apologies if I have not provided detailed information about the entire setup, but my goal is to see if other developers experience the same issue.
In production this issue does not happen since (of course) the application is not running in debug mode :-)
Following is the error in the logs:
2021-03-03 12:26:13.830 ERROR 664 --- [ntainer#0-0-C-1] essageListenerContainer$ListenerConsumer : Consumer exception
java.lang.IllegalStateException: This error handler cannot process 'org.apache.kafka.clients.consumer.CommitFailedException's; no record information is available
at org.springframework.kafka.listener.SeekUtils.seekOrRecover(SeekUtils.java:151) ~[spring-kafka-2.5.7.RELEASE.jar:2.5.7.RELEASE]
at org.springframework.kafka.listener.SeekToCurrentErrorHandler.handle(SeekToCurrentErrorHandler.java:113) ~[spring-kafka-2.5.7.RELEASE.jar:2.5.7.RELEASE]
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.handleConsumerException(KafkaMessageListenerContainer.java:1368) ~[spring-kafka-2.5.7.RELEASE.jar:2.5.7.RELEASE]
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.run(KafkaMessageListenerContainer.java:1070) ~[spring-kafka-2.5.7.RELEASE.jar:2.5.7.RELEASE]
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[na:na]
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[na:na]
at java.base/java.lang.Thread.run(Thread.java:834) ~[na:na]
Caused by: org.apache.kafka.clients.consumer.CommitFailedException: Offset commit cannot be completed since the consumer is not part of an active group for auto partition assignment; it is likely that the consumer was kicked out of the group.
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.sendOffsetCommitRequest(ConsumerCoordinator.java:1116) ~[kafka-clients-2.5.1.jar:na]
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.commitOffsetsSync(ConsumerCoordinator.java:983) ~[kafka-clients-2.5.1.jar:na]
at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1510) ~[kafka-clients-2.5.1.jar:na]
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.doCommitSync(KafkaMessageListenerContainer.java:2324) ~[spring-kafka-2.5.7.RELEASE.jar:2.5.7.RELEASE]
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.commitSync(KafkaMessageListenerContainer.java:2319) ~[spring-kafka-2.5.7.RELEASE.jar:2.5.7.RELEASE]
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.commitIfNecessary(KafkaMessageListenerContainer.java:2305) ~[spring-kafka-2.5.7.RELEASE.jar:2.5.7.RELEASE]
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.processCommits(KafkaMessageListenerContainer.java:2119) ~[spring-kafka-2.5.7.RELEASE.jar:2.5.7.RELEASE]
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.pollAndInvoke(KafkaMessageListenerContainer.java:1104) ~[spring-kafka-2.5.7.RELEASE.jar:2.5.7.RELEASE]
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.run(KafkaMessageListenerContainer.java:1038) ~[spring-kafka-2.5.7.RELEASE.jar:2.5.7.RELEASE]
... 3 common frames omitted
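Not an answer, but a sketch of two things commonly tried when debugging this locally: configure the IntelliJ breakpoint to suspend only the current thread (so the heartbeat thread keeps running), and/or loosen the consumer timeouts while debugging. The property constants below are from kafka-clients 2.5.x; the values are illustrative assumptions, not recommendations:

import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerConfig;

public class DebugConsumerProps {

    // Sketch only: raise the timeouts that can expire while the JVM is paused
    // at a breakpoint. Values are illustrative.
    public static Map<String, Object> debugFriendlyProps() {
        Map<String, Object> props = new HashMap<>();
        props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, 120000);    // default 10000
        props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, 30000);  // default 3000
        props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, 600000);  // default 300000
        return props;
    }
}

Note that session.timeout.ms is also bounded by the broker's group.max.session.timeout.ms, so a larger value may be rejected depending on the broker configuration.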

Flink checkpoint and Kafka producer exactly-once

When I create a Kafka producer with exactly-once semantics and also enable checkpointing, it leads to the following problem:
java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1260)
at org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:1155)
at org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:1132)
at org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:1111)
at org.apache.flink.streaming.connectors.kafka.internal.FlinkKafkaInternalProducer.close(FlinkKafkaInternalProducer.java:150)
at org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer.abortTransactions(FlinkKafkaProducer.java:1093)
at org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer.initializeState(FlinkKafkaProducer.java:1031)
at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.tryRestoreFunction(StreamingFunctionUtils.java:178)
at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.restoreFunctionState(StreamingFunctionUtils.java:160)
at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.initializeState(AbstractUdfStreamOperator.java:96)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:281)
at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:881)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:395)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:705)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:530)
at java.lang.Thread.run(Thread.java:748)
How can I solve it?
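For context, here is a minimal sketch of how such a sink is typically built with the universal connector (constructor overloads vary slightly between connector versions, and the topic name and broker address below are placeholders). One producer setting closely tied to EXACTLY_ONCE and checkpointing is transaction.timeout.ms, which must not exceed the broker's transaction.max.timeout.ms:

import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;
import org.apache.flink.streaming.util.serialization.KeyedSerializationSchemaWrapper;

public class ExactlyOnceSinkSketch {

    // Sketch only: "output-topic" and "localhost:9092" are placeholders.
    public static FlinkKafkaProducer<String> createSink() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        // Flink defaults this to 1 hour for EXACTLY_ONCE; it must not exceed the
        // broker's transaction.max.timeout.ms (15 minutes by default), otherwise
        // the producer cannot initialize transactions.
        props.put("transaction.timeout.ms", "900000");
        return new FlinkKafkaProducer<>(
                "output-topic",
                new KeyedSerializationSchemaWrapper<>(new SimpleStringSchema()),
                props,
                FlinkKafkaProducer.Semantic.EXACTLY_ONCE);
    }
}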

Interrupted while joining ioThread / Error during disposal of stream operator in Flink application

I have a Flink-based streaming application which uses Apache Kafka sources and sinks. For some days I have been getting exceptions at random times during development, and I have no clue where they're coming from.
I am running the app within IntelliJ using the mainRunner class, and I am feeding it messages via Kafka. Sometimes the first message triggers the errors, sometimes it happens only after a few messages.
This is how it looks:
16:31:01.935 ERROR o.a.k.c.producer.KafkaProducer - Interrupted while joining ioThread
java.lang.InterruptedException: null
at java.lang.Object.wait(Native Method) ~[na:1.8.0_51]
at java.lang.Thread.join(Thread.java:1253) [na:1.8.0_51]
at org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:1031) [kafka-clients-0.11.0.2.jar:na]
at org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:1010) [kafka-clients-0.11.0.2.jar:na]
at org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:989) [kafka-clients-0.11.0.2.jar:na]
at org.apache.flink.streaming.connectors.kafka.internal.FlinkKafkaProducer.close(FlinkKafkaProducer.java:168) [flink-connector-kafka-0.11_2.11-1.6.1.jar:1.6.1]
at org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer011.close(FlinkKafkaProducer011.java:662) [flink-connector-kafka-0.11_2.11-1.6.1.jar:1.6.1]
at org.apache.flink.api.common.functions.util.FunctionUtils.closeFunction(FunctionUtils.java:43) [flink-core-1.6.1.jar:1.6.1]
at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.dispose(AbstractUdfStreamOperator.java:117) [flink-streaming-java_2.11-1.6.1.jar:1.6.1]
at org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:477) [flink-streaming-java_2.11-1.6.1.jar:1.6.1]
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:378) [flink-streaming-java_2.11-1.6.1.jar:1.6.1]
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) [flink-runtime_2.11-1.6.1.jar:1.6.1]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_51]
16:31:01.936 ERROR o.a.f.s.runtime.tasks.StreamTask - Error during disposal of stream operator.
org.apache.kafka.common.KafkaException: Failed to close kafka producer
at org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:1062) ~[kafka-clients-0.11.0.2.jar:na]
at org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:1010) ~[kafka-clients-0.11.0.2.jar:na]
at org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:989) ~[kafka-clients-0.11.0.2.jar:na]
at org.apache.flink.streaming.connectors.kafka.internal.FlinkKafkaProducer.close(FlinkKafkaProducer.java:168) ~[flink-connector-kafka-0.11_2.11-1.6.1.jar:1.6.1]
at org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer011.close(FlinkKafkaProducer011.java:662) ~[flink-connector-kafka-0.11_2.11-1.6.1.jar:1.6.1]
at org.apache.flink.api.common.functions.util.FunctionUtils.closeFunction(FunctionUtils.java:43) ~[flink-core-1.6.1.jar:1.6.1]
at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.dispose(AbstractUdfStreamOperator.java:117) ~[flink-streaming-java_2.11-1.6.1.jar:1.6.1]
at org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:477) [flink-streaming-java_2.11-1.6.1.jar:1.6.1]
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:378) [flink-streaming-java_2.11-1.6.1.jar:1.6.1]
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) [flink-runtime_2.11-1.6.1.jar:1.6.1]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_51]
Caused by: java.lang.InterruptedException: null
at java.lang.Object.wait(Native Method) ~[na:1.8.0_51]
at java.lang.Thread.join(Thread.java:1253) [na:1.8.0_51]
at org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:1031) ~[kafka-clients-0.11.0.2.jar:na]
... 10 common frames omitted
16:31:01.938 ERROR o.a.k.c.producer.KafkaProducer - Interrupted while joining ioThread
I get around 10-20 of those, and then it seems like Flink recovers the app; it becomes usable again and I can successfully process messages.
What could possibly cause this? Or how can I analyze further to track this down?
I am using Flink 1.6.1 with Scala 2.11 on a Mac, with IntelliJ version 2018.3.2.
I was able to resolve it. It turned out that one of my stream operators (a map function) was throwing an exception because of an invalid array index.
It was not possible to see this in the logs; only when I tore the application down step by step into smaller pieces did I finally get this exception in the logs. After fixing the obvious bug in the array access, the above-mentioned exceptions (java.lang.InterruptedException and org.apache.kafka.common.KafkaException) went away.
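For anyone hitting similar noise, here is a small sketch of the kind of defensive logging inside the operator that would have surfaced the real failure earlier; the class name, types, and logger setup are illustrative, not from the original application:

import org.apache.flink.api.common.functions.MapFunction;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Sketch only: wraps the operator body so the original exception is logged
// before the task tears down and the Kafka producer close errors drown it out.
public class LoggingMapFunction implements MapFunction<String, String> {

    private static final Logger LOG = LoggerFactory.getLogger(LoggingMapFunction.class);

    @Override
    public String map(String value) throws Exception {
        try {
            // ... the actual transformation goes here, e.g. the array access that was failing
            return value;
        } catch (Exception e) {
            LOG.error("Map function failed for value: {}", value, e);
            throw e; // rethrow so Flink's normal failure handling still applies
        }
    }
}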

subscribe() method throws an error while trying to access Kafka 0.10 from a Kafka 0.9 client

This is our development environment:
1) Kafka cluster - version 0.10
2) Spark cluster - 1.6, which has 0.9 Kafka jars
We are trying to produce() and consume() in Spark cluster mode (via spark-submit).
While running the spark-submit job, Spark picks up the 0.9 version of the Kafka client. The following are our observations:
1) Producer – works fine (the 0.9 and 0.10 producer APIs are compatible)
2) Streaming Kafka consumer using KafkaUtils – works fine (the 0.9 and 0.10 APIs also seem compatible here)
3) Consumer using the subscribe() API – errors out with the following message. Can someone help us understand why it is failing?
16/10/24 02:31:08 ERROR yarn.ApplicationMaster: User class threw exception: java.lang.NoSuchMethodError: org.apache.kafka.clients.consumer.KafkaConsumer.subscribe(Ljava/util/Collection;)V
java.lang.NoSuchMethodError: org.apache.kafka.clients.consumer.KafkaConsumer.subscribe(Ljava/util/Collection;)V
at com.common.kafka.init(Kafkafunction.java:150)
at com.client.Client.main(Client.java:100)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:542)
16/10/24 02:31:08 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: java.lang.NoSuchMethodError: org.apache.kafka.clients.consumer.KafkaConsumer.subscribe(Ljava/util/Collection;)V)
Updating everything to 0.10 solves the problem. The two versions are definitely not compatible for this method signature: org.apache.kafka.clients.consumer.KafkaConsumer.subscribe(Ljava/util/Collection;)V
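To make the incompatibility concrete, here is a hedged illustration (broker address, group id, and topic are placeholders): the subscribe() call below compiles against the 0.10 client, where the method takes a Collection, but at runtime against a 0.9 client jar only the List overload exists, which produces exactly the NoSuchMethodError above:

import java.util.Arrays;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class SubscribeSketch {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        props.put("group.id", "test-group");              // placeholder
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        // Compiled against 0.10 this resolves to subscribe(Collection<String>);
        // the 0.9 client only has subscribe(List<String>), hence the NoSuchMethodError.
        consumer.subscribe(Arrays.asList("my-topic"));
        ConsumerRecords<String, String> records = consumer.poll(1000);
        System.out.println("Fetched " + records.count() + " records");
        consumer.close();
    }
}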

Apache Kafka cluster using MapR Spark Streaming not working

We are facing an issue while connecting to an Apache Kafka cluster using MapR Spark Streaming (1.6.1). The setup details are as follows:
• MapR cluster with Spark 1.6.1 (3 node cluster)
• Apache Kafka cluster v0.8.1.1 (5 node cluster)
We are using the ‘spark-streaming-kafka’ library from MapR, v1.6.1-mapr-1605. We also tried running in local mode with Apache Spark (not MapR Spark), and that works fine.
Below is the stack trace of the error:
Exception in thread "main" org.apache.kafka.common.config.ConfigException: No bootstrap urls given in bootstrap.servers
at org.apache.kafka.clients.ClientUtils.parseAndValidateAddresses(ClientUtils.java:57)
at org.apache.kafka.clients.consumer.KafkaConsumer.initializeConsumer(KafkaConsumer.java:606)
at org.apache.kafka.clients.consumer.KafkaConsumer.partitionsFor(KafkaConsumer.java:1563)
at org.apache.spark.streaming.kafka.v09.KafkaCluster$$anonfun$getPartitions$1$$anonfun$1.apply(KafkaCluster.scala:54)
at org.apache.spark.streaming.kafka.v09.KafkaCluster$$anonfun$getPartitions$1$$anonfun$1.apply(KafkaCluster.scala:54)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
at scala.collection.immutable.Set$Set1.foreach(Set.scala:74)
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251)
at scala.collection.AbstractTraversable.flatMap(Traversable.scala:105)
at org.apache.spark.streaming.kafka.v09.KafkaCluster$$anonfun$getPartitions$1.apply(KafkaCluster.scala:53)
at org.apache.spark.streaming.kafka.v09.KafkaCluster$$anonfun$getPartitions$1.apply(KafkaCluster.scala:52)
at org.apache.spark.streaming.kafka.v09.KafkaCluster.withConsumer(KafkaCluster.scala:164)
at org.apache.spark.streaming.kafka.v09.KafkaCluster.getPartitions(KafkaCluster.scala:52)
at org.apache.spark.streaming.kafka.v09.KafkaUtils$.getFromOffsets(KafkaUtils.scala:421)
at org.apache.spark.streaming.kafka.v09.KafkaUtils$.createDirectStream(KafkaUtils.scala:292)
at org.apache.spark.streaming.kafka.v09.KafkaUtils$.createDirectStream(KafkaUtils.scala:397)
at org.apache.spark.streaming.kafka.v09.KafkaUtils.createDirectStream(KafkaUtils.scala)
at com.cisco.it.log.KafkaDirectStreamin2.main(KafkaDirectStreamin2.java:111)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:742)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
PS: we are passing “metadata.broker.list” while creating the connection.
My understanding is that the Spark Streaming application is not able to connect to ZooKeeper and not able to get the bootstrap URL. Or it could be an issue of not having the correct versions of the MapR and Kafka jars; we took the jar from the MapR side, but it is still not working.
We are able to test with Apache Spark successfully, but not able to get it working on MapR.
Any help is appreciated.
In your stacktrace there are references to org.apache.spark.streaming.kafka.v09 which probably means that it's an implementation using the new consumer API which became available with Kafka 0.9 and won't work with Kafka 0.8.1.1. You should probably try one of the libraries from MapR's spark-streaming-kafka_2.10 instead.
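As a hedged illustration of the configuration difference (broker addresses are placeholders, and the surrounding MapR/Spark connector API is not shown): the v09 code path in the stack trace builds a new (0.9+) KafkaConsumer, which requires bootstrap.servers, while metadata.broker.list is only read by the old 0.8-style direct stream:

import java.util.HashMap;
import java.util.Map;

public class KafkaParamsSketch {

    // Sketch only: broker addresses are placeholders.
    public static Map<String, String> kafkaParams() {
        Map<String, String> kafkaParams = new HashMap<>();
        // Read by the 0.8-style direct stream:
        kafkaParams.put("metadata.broker.list", "broker1:9092,broker2:9092");
        // Required by the new-consumer (0.9+) based v09 connector, which is the
        // code path failing in the stack trace above:
        kafkaParams.put("bootstrap.servers", "broker1:9092,broker2:9092");
        return kafkaParams;
    }
}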