Failed to construct kafka consumer - apache-kafka

There are quite a few answers on this topic but nothing was working.
I am trying to execute the following streams processor.
object simplestream extends App {
val builder: KStreamBuilder = new KStreamBuilder
val streamingConfig = { //ToDo - Move these to config
val settings = new Properties
settings.put(StreamsConfig.APPLICATION_ID_CONFIG, "example11")
settings.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")
// Specify default (de)serializers for record keys and for record values.
settings.put(StreamsConfig.KEY_SERDE_CLASS_CONFIG, Serdes.String.getClass.getName)
settings.put(StreamsConfig.VALUE_SERDE_CLASS_CONFIG, Serdes.ByteArray.getClass.getName)
settings
}
val users = builder.stream("tt2")
users.print()
val stream: KafkaStreams = new KafkaStreams(builder, streamingConfig)
stream.start()
}
}
Dependencies:
//kafka
"org.apache.kafka" % "kafka-streams" % "0.10.2.0",
"org.apache.kafka" % "kafka-clients" % "0.10.2.0"
And the error:
[error] (run-main-1) org.apache.kafka.common.KafkaException: Failed to construct kafka consumer
org.apache.kafka.common.KafkaException: Failed to construct kafka consumer
at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:717)
at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:566)
at org.apache.kafka.streams.processor.internals.DefaultKafkaClientSupplier.getConsumer(DefaultKafkaClientSupplier.java:38)
at org.apache.kafka.streams.processor.internals.StreamThread.<init>(StreamThread.java:323)
at org.apache.kafka.streams.KafkaStreams.<init>(KafkaStreams.java:349)
at org.apache.kafka.streams.KafkaStreams.<init>(KafkaStreams.java:272)
at kafka.simplestream$.runStream(simplestream.scala:36)
at kafka.simplestream$.delayedEndpoint$kafka$simplestream$1(simplestream.scala:40)
at kafka.simplestream$delayedInit$body.apply(simplestream.scala:12)
at scala.Function0.apply$mcV$sp(Function0.scala:34)
at scala.Function0.apply$mcV$sp$(Function0.scala:34)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App.$anonfun$main$1$adapted(App.scala:76)
at scala.collection.immutable.List.foreach(List.scala:378)
at scala.App.main(App.scala:76)
at scala.App.main$(App.scala:74)
at kafka.simplestream$.main(simplestream.scala:12)
at kafka.simplestream.main(simplestream.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
Caused by: java.lang.NoSuchMethodError: org.apache.kafka.clients.Metadata.update(Lorg/apache/kafka/common/Cluster;J)V
I've tried different client versions, no luck. I am using kafka 0.10.2.0 version. I also get below error in zookeeper.
[2017-08-18 13:08:10,260] INFO Got user-level KeeperException when processing sessionid:0x15df53e101e0001 type:delete cxid:0x29 zxid:0x4d txntype:-1 reqpath:n/a Error Path:/admin/preferred_replica_election Error:KeeperErrorCode = NoNode for /admin/preferred_replica_election (org.apache.zookeeper.server.PrepRequestProcessor)
[2017-08-18 13:08:10,364] INFO Got user-level KeeperException when processing sessionid:0x15df53e101e0001 type:create cxid:0x35 zxid:0x4e txntype:-1 reqpath:n/a Error Path:/brokers Error:KeeperErrorCode = NodeExists for /brokers (org.apache.zookeeper.server.PrepRequestProcessor)
[2017-08-18 13:08:10,364] INFO Got user-level KeeperException when processing sessionid:0x15df53e101e0001 type:create cxid:0x36 zxid:0x4f txntype:-1 reqpath:n/a Error Path:/brokers/ids Error:KeeperErrorCode = NodeExists for /brokers/ids (org.apache.zookeeper.server.PrepRequestProcessor)
Not sure what is exactly causing it. I am able to consumer/produce just fine though.

java.lang.NoSuchMethodError - This error happens when multiple versions of the client jar is available in your classpath. Check your classpath once.
The KeeperException thrown by the zookeeper is not an issue, it's just creating the nodes / folders which doesn't exists in the Zookeeper.

Related

java.lang.NoSuchMethodError: org.apache.kafka.common.protocol.Readable.readArray([B)V

We started getting below exception while we upgraded spring-kafka to 2.8.9 and kafka-clients to 3.0.1. Please suggest.
laris-default-group-id] Request joining group due to: consumer pro-actively leaving the group
2022-11-04--16-41-17-047 [T: U: D: Tx:/ URI: M:] [org.springframework.kafka.KafkaListenerEndpointContainer#9-0-C-1] ERROR org.springframework.kafka.listener.KafkaMessageListenerContainer - Stopping container due to an Error
java.lang.NoSuchMethodError: org.apache.kafka.common.protocol.Readable.readArray([B)V
at org.apache.kafka.common.message.SyncGroupResponseData.read(SyncGroupResponseData.java:173)
at org.apache.kafka.common.message.SyncGroupResponseData.<init>(SyncGroupResponseData.java:102)
at org.apache.kafka.common.requests.SyncGroupResponse.parse(SyncGroupResponse.java:61)
at org.apache.kafka.common.requests.AbstractResponse.parseResponse(AbstractResponse.java:135)
at org.apache.kafka.common.requests.AbstractResponse.parseResponse(AbstractResponse.java:109)
at org.apache.kafka.clients.NetworkClient.parseResponse(NetworkClient.java:720)
at org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:865)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:560)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:265)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:236)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:215)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:427)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:366)
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:511)
at org.apache.kafka.clients.consumer.KafkaConsumer.updateAssignmentMetadataIfNeeded(KafkaConsumer.java:1262)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1233)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1166)
at brave.kafka.clients.TracingConsumer.poll(TracingConsumer.java:93)
at brave.kafka.clients.TracingConsumer.poll(TracingConsumer.java:87)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.pollConsumer(KafkaMessageListenerContainer.java:1529)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.doPoll(KafkaMessageListenerContainer.java:1519)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.pollAndInvoke(KafkaMessageListenerContainer.java:1343)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.run(KafkaMessageListenerContainer.java:1255)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)```

Kafka java.io.EOFException - NetworkReceive.readFromReadableChannel

I'm trying to connect to IBM Message Hub from Apache Spark 2.2.1 Structured Streaming.
The connection code is quite basic:
import org.apache.spark.sql.functions._
import org.apache.spark.sql.SparkSession
val spark = SparkSession.builder.appName("StreamingRetailTransactions").getOrCreate()
import spark.implicits._
val df = spark.readStream.
format("kafka").
option("kafka.bootstrap.servers", "kafka03-prod02.messagehub.services.eu-gb.bluemix.net:9093,kafka04-prod02.messagehub.services.eu-gb.bluemix.net:9093,kafka01-prod02.messagehub.services.eu-gb.bluemix.net:9093,kafka02-prod02.messagehub.services.eu-gb.bluemix.net:9093,kafka05-prod02.messagehub.services.eu-gb.bluemix.net:9093").
option("subscribe", "transactions_load").
option("security.protocol", "SASL_SSL").
option("sasl.mechanism", "PLAIN").
option("sasl.jaas.config", "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"*****\" password=\"*****\";").
option("ssl.protocol", "TLSv1.2").
option("ssl.enabled.protocols", "TLSv1.2").
option("ssl.endpoint.identification.algorithm", "HTTPS").
option("auto.offset.reset","earliest").
option("group.id", System.currentTimeMillis).
load()
val query = df.writeStream.format("console").start()
I'm starting the spark shell with:
~/spark-2.2.1-bin-hadoop2.7$ ./bin/spark-shell --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.2.0
However, I'm getting a disconnected error:
scala> 18/01/09 08:34:17 WARN NetworkClient: Bootstrap broker kafka02-prod02.messagehub.services.eu-gb.bluemix.net:9093 disconnected
18/01/09 08:34:17 WARN NetworkClient: Bootstrap broker kafka03-prod02.messagehub.services.eu-gb.bluemix.net:9093 disconnected
18/01/09 08:34:17 WARN NetworkClient: Bootstrap broker kafka01-prod02.messagehub.services.eu-gb.bluemix.net:9093 disconnected
<<repeats forever>>
I've increased debug output with sc.setLogLevel("DEBUG") and I get:
<<log output ommitted for brevity>>
18/01/09 08:38:28 DEBUG SessionState: SessionState user: null
18/01/09 08:38:28 DEBUG SessionState: HDFS root scratch dir: /tmp/hive with schema null, permission: rwx-wx-wx
18/01/09 08:38:28 INFO SessionState: Created local directory: /tmp/2b4557ce-dd17-46d1-9ab0-f9a36fd750f9_resources
18/01/09 08:38:28 INFO SessionState: Created HDFS directory: /tmp/hive/snowch/2b4557ce-dd17-46d1-9ab0-f9a36fd750f9
18/01/09 08:38:28 INFO SessionState: Created local directory: /tmp/snowch/2b4557ce-dd17-46d1-9ab0-f9a36fd750f9
18/01/09 08:38:28 INFO SessionState: Created HDFS directory: /tmp/hive/snowch/2b4557ce-dd17-46d1-9ab0-f9a36fd750f9/_tmp_space.db
18/01/09 08:38:28 INFO HiveClientImpl: Warehouse location for Hive client (version 1.2.1) is file:/home/snowch/spark-2.2.1-bin-hadoop2.7/spark-warehouse
18/01/09 08:38:28 DEBUG SessionState: Session is using authorization class class org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider
18/01/09 08:38:28 DEBUG StateStoreCoordinatorRef: Retrieved existing StateStoreCoordinator endpoint
18/01/09 08:38:28 DEBUG StreamExecution: Starting Trigger Calculation
18/01/09 08:38:28 INFO StreamExecution: Starting new streaming query.
18/01/09 08:38:28 DEBUG UserGroupInformation: PrivilegedAction as:snowch (auth:SIMPLE) from:org.apache.hadoop.fs.FileContext.getAbstractFileSystem(FileContext.java:331)
18/01/09 08:38:28 DEBUG KafkaSource$$anon$1: Unable to find batch /tmp/temporary-fb3e6e1a-fbbe-4098-991c-0b29f63ecade/sources/0/0
18/01/09 08:38:28 DEBUG AbstractCoordinator: Sending coordinator request for group spark-kafka-source-9a14bb54-8f1b-47db-8497-19c083128496--998588290-driver-0 to broker kafka05-prod02.messagehub.services.eu-gb.bluemix.net:9093 (id: -5 rack: null)
18/01/09 08:38:28 DEBUG NetworkClient: Initiating connection to node -5 at kafka05-prod02.messagehub.services.eu-gb.bluemix.net:9093.
18/01/09 08:38:28 DEBUG NetworkClient: Initialize connection to node -3 for sending metadata request
18/01/09 08:38:28 DEBUG NetworkClient: Initiating connection to node -3 at kafka01-prod02.messagehub.services.eu-gb.bluemix.net:9093.
18/01/09 08:38:28 DEBUG Metrics: Added sensor with name node--5.bytes-sent
18/01/09 08:38:28 DEBUG Metrics: Added sensor with name node--5.bytes-received
18/01/09 08:38:28 DEBUG Metrics: Added sensor with name node--5.latency
18/01/09 08:38:28 DEBUG NetworkClient: Completed connection to node -5
18/01/09 08:38:28 DEBUG Metrics: Added sensor with name node--3.bytes-sent
18/01/09 08:38:28 DEBUG Metrics: Added sensor with name node--3.bytes-received
18/01/09 08:38:28 DEBUG Metrics: Added sensor with name node--3.latency
18/01/09 08:38:28 DEBUG NetworkClient: Completed connection to node -3
18/01/09 08:38:28 DEBUG NetworkClient: Sending metadata request {topics=[transactions_load]} to node -5
18/01/09 08:38:28 DEBUG Selector: Connection with kafka05-prod02.messagehub.services.eu-gb.bluemix.net/159.8.179.153 disconnected
java.io.EOFException
at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:83)
at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:71)
at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:154)
at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:135)
at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:323)
at org.apache.kafka.common.network.Selector.poll(Selector.java:283)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:260)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:360)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:224)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:192)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:163)
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:179)
at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:974)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:938)
at org.apache.spark.sql.kafka010.KafkaOffsetReader$$anonfun$fetchLatestOffsets$1$$anonfun$apply$9.apply(KafkaOffsetReader.scala:174)
at org.apache.spark.sql.kafka010.KafkaOffsetReader$$anonfun$fetchLatestOffsets$1$$anonfun$apply$9.apply(KafkaOffsetReader.scala:172)
at org.apache.spark.sql.kafka010.KafkaOffsetReader$$anonfun$org$apache$spark$sql$kafka010$KafkaOffsetReader$$withRetriesWithoutInterrupt$1.apply$mcV$sp(KafkaOffsetReader.scala:263)
at org.apache.spark.sql.kafka010.KafkaOffsetReader$$anonfun$org$apache$spark$sql$kafka010$KafkaOffsetReader$$withRetriesWithoutInterrupt$1.apply(KafkaOffsetReader.scala:262)
at org.apache.spark.sql.kafka010.KafkaOffsetReader$$anonfun$org$apache$spark$sql$kafka010$KafkaOffsetReader$$withRetriesWithoutInterrupt$1.apply(KafkaOffsetReader.scala:262)
at org.apache.spark.util.UninterruptibleThread.runUninterruptibly(UninterruptibleThread.scala:85)
at org.apache.spark.sql.kafka010.KafkaOffsetReader.org$apache$spark$sql$kafka010$KafkaOffsetReader$$withRetriesWithoutInterrupt(KafkaOffsetReader.scala:261)
at org.apache.spark.sql.kafka010.KafkaOffsetReader$$anonfun$fetchLatestOffsets$1.apply(KafkaOffsetReader.scala:172)
at org.apache.spark.sql.kafka010.KafkaOffsetReader$$anonfun$fetchLatestOffsets$1.apply(KafkaOffsetReader.scala:172)
at org.apache.spark.sql.kafka010.KafkaOffsetReader.runUninterruptibly(KafkaOffsetReader.scala:230)
at org.apache.spark.sql.kafka010.KafkaOffsetReader.fetchLatestOffsets(KafkaOffsetReader.scala:171)
at org.apache.spark.sql.kafka010.KafkaSource$$anonfun$initialPartitionOffsets$1.apply(KafkaSource.scala:132)
at org.apache.spark.sql.kafka010.KafkaSource$$anonfun$initialPartitionOffsets$1.apply(KafkaSource.scala:129)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.kafka010.KafkaSource.initialPartitionOffsets$lzycompute(KafkaSource.scala:129)
at org.apache.spark.sql.kafka010.KafkaSource.initialPartitionOffsets(KafkaSource.scala:97)
at org.apache.spark.sql.kafka010.KafkaSource.getOffset(KafkaSource.scala:163)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$10$$anonfun$apply$6.apply(StreamExecution.scala:521)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$10$$anonfun$apply$6.apply(StreamExecution.scala:521)
at org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:279)
at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$10.apply(StreamExecution.scala:520)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$10.apply(StreamExecution.scala:518)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.AbstractTraversable.map(Traversable.scala:104)
at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$constructNextBatch(StreamExecution.scala:518)
at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$populateStartOffsets(StreamExecution.scala:492)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1$$anonfun$apply$mcZ$sp$1.apply$mcV$sp(StreamExecution.scala:297)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1$$anonfun$apply$mcZ$sp$1.apply(StreamExecution.scala:294)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1$$anonfun$apply$mcZ$sp$1.apply(StreamExecution.scala:294)
at org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:279)
at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1.apply$mcZ$sp(StreamExecution.scala:294)
at org.apache.spark.sql.execution.streaming.ProcessingTimeExecutor.execute(TriggerExecutor.scala:56)
at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches(StreamExecution.scala:290)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:206)
18/01/09 08:38:28 DEBUG NetworkClient: Node -5 disconnected.
18/01/09 08:38:28 WARN NetworkClient: Bootstrap broker kafka05-prod02.messagehub.services.eu-gb.bluemix.net:9093 disconnected
18/01/09 08:38:28 DEBUG ConsumerNetworkClient: Cancelled GROUP_COORDINATOR request ClientRequest(expectResponse=true, callback=org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler#4eacac4e, request=RequestSend(header={api_key=10,api_version=0,correlation_id=0,client_id=consumer-1}, body={group_id=spark-kafka-source-9a14bb54-8f1b-47db-8497-19c083128496--998588290-driver-0}), createdTimeMs=1515487108479, sendTimeMs=1515487108598) with correlation id 0 due to node -5 being disconnected
18/01/09 08:38:28 DEBUG NetworkClient: Sending metadata request {topics=[transactions_load]} to node -3
18/01/09 08:38:28 DEBUG Selector: Connection with kafka01-prod02.messagehub.services.eu-gb.bluemix.net/159.8.179.149 disconnected
java.io.EOFException
at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:83)
at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:71)
<<repeated>>
I have seen the following similar questions:
Kafka Error in I/O java.io.EOFException: null - however, this question is for a much older version of Kafka
I understand that some of the output is just 'noise', however, my spark streaming application does not appear to be receiving any data. I have connected with a console client and I am able to see data.
Update 1 - I've tried configuring JaaS, but still getting the same error. The issue may be that the JaaS code needs to run on each worker node, but isn't getting run on them.
sc.setLogLevel("DEBUG")
def jaasClientConfig(username: String, password: String): Unit = {
import javax.security.auth.login.AppConfigurationEntry
import javax.security.auth.login.Configuration
import javax.security.auth.login.LoginException
import scala.collection.JavaConversions._
System.setProperty("java.security.auth.login.config", "")
Configuration.setConfiguration(new Configuration() {
def getAppConfigurationEntry(name: String): Array[AppConfigurationEntry] = {
val idMap = Map(
"serviceName" -> "kafka",
"username" -> username,
"password" -> password
)
val ace = new AppConfigurationEntry(
"org.apache.kafka.common.security.plain.PlainLoginModule",
AppConfigurationEntry.LoginModuleControlFlag.REQUIRED,
idMap
)
return Array(ace)
}
})
}
def startSparkStreaming(): Unit = {
import org.apache.spark.sql.functions._
import org.apache.spark.sql.SparkSession
val spark = SparkSession.builder.appName("StreamingRetailTransactions").getOrCreate()
import spark.implicits._
val df = spark.readStream.
format("kafka").
option("kafka.bootstrap.servers", "kafka03-prod02.messagehub.services.eu-gb.bluemix.net:9093,kafka04-prod02.messagehub.services.eu-gb.bluemix.net:9093,kafka01-prod02.messagehub.services.eu-gb.bluemix.net:9093,kafka02-prod02.messagehub.services.eu-gb.bluemix.net:9093,kafka05-prod02.messagehub.services.eu-gb.bluemix.net:9093").
option("subscribe", "transactions_load").
option("security.protocol", "SASL_SSL").
option("sasl.mechanism", "PLAIN").
option("ssl.protocol", "TLSv1.2").
option("ssl.enabled.protocols", "TLSv1.2").
option("ssl.endpoint.identification.algorithm", "HTTPS").
option("auto.offset.reset","earliest").
option("group.id", System.currentTimeMillis).
load()
val query = df.writeStream.format("console").start()
}
jaasClientConfig("****","****")
startSparkStreaming()
Update 2
I've also tried with a jaas.conf:
~/spark-2.2.1-bin-hadoop2.7$ ./bin/spark-shell --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.2.0 --conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.conf=jaas.conf" --files "jaas.conf"
and ...
KafkaClient {
org.apache.kafka.common.security.plain.PlainLoginModule required
username="*****"
password="*****";
}
Still the same problem ...
First I needed to run spark-shell with --conf that points the executor and driver to my jaas.conf:
./bin/spark-shell --master local[1] \
--jars external/kafka-0-10-sql/target/spark-sql-kafka-0-10_2.11-2.2.2-SNAPSHOT.jar,external/kafka-0-10-assembly/target/spark-streaming-kafka-0-10-assembly_2.11-2.2.2-SNAPSHOT.jar \
--conf "spark.driver.extraJavaOptions=-Djava.security.auth.login.config=/path/to/jaas.conf" \
--conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=/path/to/jaas.conf" \
--num-executors 1 --executor-cores 1
Next I had to add some kafka options:
val df = spark.readStream.
format("kafka").
option("kafka.bootstrap.servers", "kafka03-prod02.messagehub.services.eu-gb.bluemix.net:9093,kafka04-prod02.messagehub.services.eu-gb.bluemix.net:9093,kafka01-prod02.messagehub.services.eu-gb.bluemix.net:9093,kafka02-prod02.messagehub.services.eu-gb.bluemix.net:9093,kafka05-prod02.messagehub.services.eu-gb.bluemix.net:9093").
option("subscribe", "transactions_load").
option("kafka.security.protocol", "SASL_SSL").
option("kafka.sasl.mechanism", "PLAIN").
option("kafka.ssl.protocol", "TLSv1.2").
option("kafka.ssl.enabled.protocols", "TLSv1.2").
load()
Note that the kafka options need to be prefixed with kafka., for example:
security.protocol => kafka.security.protocol
These changes solved the connectivity issue for me.
This appears to be a failed authentication. At the current MH version of Kafka the server just closes the connection when authentication fails
the "sasl.jaas.config" setting is used in the Kafka client from version 0.10.2
the Kafka client used by Spark 2.2.1 is 0.10.0 so authentication fails as suspected.
you can use the java.security.auth.login.config system property to specify a jaas file
Alternatively you can programmatically set the credentials for the client with a snippet like this
public static void jaasClientConfig(final String username, final String password) throws Exception {
System.setProperty("java.security.auth.login.config", "");
Configuration.setConfiguration(new Configuration() {
public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
HashMap<String, String> idMap = new HashMap<>();
idMap.put("serviceName", "kafka"); // Seems to be optional
idMap.put("username", username);
idMap.put("password", password);
AppConfigurationEntry ace = new AppConfigurationEntry("org.apache.kafka.common.security.plain.PlainLoginModule",
AppConfigurationEntry.LoginModuleControlFlag.REQUIRED, idMap);
AppConfigurationEntry[] entry = { ace };
return entry;
}
});
}

KafkaSpout tuple replay throws null pointer exception

I am using storm 1.0.1 and Kafka 0.10.0.0 with storm-kafka-client 1.0.3.
please find the code config I have below.
kafkaConsumerProps.put(KafkaSpoutConfig.Consumer.KEY_DESERIALIZER, "org.apache.kafka.common.serialization.ByteArrayDeserializer");
kafkaConsumerProps.put(KafkaSpoutConfig.Consumer.VALUE_DESERIALIZER, "org.apache.kafka.common.serialization.ByteArrayDeserializer");
KafkaSpoutStreams kafkaSpoutStreams = new KafkaSpoutStreamsNamedTopics.Builder(new Fields(fieldNames), topics)
.build();
KafkaSpoutRetryService retryService = new KafkaSpoutRetryExponentialBackoff(TimeInterval.microSeconds(500),
TimeInterval.milliSeconds(2), Integer.MAX_VALUE, TimeInterval.seconds(10));
KafkaSpoutTuplesBuilder tuplesBuilder = new KafkaSpoutTuplesBuilderNamedTopics.Builder(new TestTupleBuilder(topics))
.build();
KafkaSpoutConfig kafkaSpoutConfig = new KafkaSpoutConfig.Builder<String, String>(kafkaConsumerProps, kafkaSpoutStreams, tuplesBuilder, retryService)
.setOffsetCommitPeriodMs(10_000)
.setFirstPollOffsetStrategy(LATEST)
.setMaxRetries(5)
.setMaxUncommittedOffsets(250)
.build();
When I fail the tuple its not getting replayed. Spout throws below error.
Please let me know why it's throwing nullpointer exception.
53501 [Thread-359-test-spout-executor[295 295]] ERROR o.a.s.util - Async loop died!
java.lang.NullPointerException
at org.apache.storm.kafka.spout.KafkaSpout.doSeekRetriableTopicPartitions(KafkaSpout.java:260) ~[storm-kafka-client-1.0.3.jar:1.0.3]
at org.apache.storm.kafka.spout.KafkaSpout.pollKafkaBroker(KafkaSpout.java:248) ~[storm-kafka-client-1.0.3.jar:1.0.3]
at org.apache.storm.kafka.spout.KafkaSpout.nextTuple(KafkaSpout.java:203) ~[storm-kafka-client-1.0.3.jar:1.0.3]
at org.apache.storm.daemon.executor$fn__7885$fn__7900$fn__7931.invoke(executor.clj:645) ~[storm-core-1.0.1.jar:1.0.1]
at org.apache.storm.util$async_loop$fn__625.invoke(util.clj:484) [storm-core-1.0.1.jar:1.0.1]
at clojure.lang.AFn.run(AFn.java:22) [clojure-1.8.0.jar:?]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_102]
53501 [Thread-359-test-spout-executor[295 295]] ERROR o.a.s.d.executor -
java.lang.NullPointerException
at org.apache.storm.kafka.spout.KafkaSpout.doSeekRetriableTopicPartitions(KafkaSpout.java:260) ~[storm-kafka-client-1.0.3.jar:1.0.3]
at org.apache.storm.kafka.spout.KafkaSpout.pollKafkaBroker(KafkaSpout.java:248) ~[storm-kafka-client-1.0.3.jar:1.0.3]
at org.apache.storm.kafka.spout.KafkaSpout.nextTuple(KafkaSpout.java:203) ~[storm-kafka-client-1.0.3.jar:1.0.3]
at org.apache.storm.daemon.executor$fn__7885$fn__7900$fn__7931.invoke(executor.clj:645) ~[storm-core-1.0.1.jar:1.0.1]
at org.apache.storm.util$async_loop$fn__625.invoke(util.clj:484) [storm-core-1.0.1.jar:1.0.1]
at clojure.lang.AFn.run(AFn.java:22) [clojure-1.8.0.jar:?]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_102]
53527 [Thread-359-test-spout-executor[295 295]] ERROR o.a.s.util - Halting process: ("Worker died")
java.lang.RuntimeException: ("Worker died")
at org.apache.storm.util$exit_process_BANG_.doInvoke(util.clj:341) [storm-core-1.0.1.jar:1.0.1]
at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.8.0.jar:?]
at org.apache.storm.daemon.worker$fn__8554$fn__8555.invoke(worker.clj:761) [storm-core-1.0.1.jar:1.0.1]
at org.apache.storm.daemon.executor$mk_executor_data$fn__7773$fn__7774.invoke(executor.clj:271) [storm-core-1.0.1.jar:1.0.1]
at org.apache.storm.util$async_loop$fn__625.invoke(util.clj:494) [storm-core-1.0.1.jar:1.0.1]
at clojure.lang.AFn.run(AFn.java:22) [clojure-1.8.0.jar:?]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_102]
Please find the complete spout configs below
{key.deserializer=org.apache.kafka.common.serialization.ByteArrayDeserializer, value.deserializer=org.apache.kafka.common.serialization.ByteArrayDeserializer, group.id=test-group, ssl.keystore.location=C:/test.jks, bootstrap.servers=localhost:1000, auto.commit.interval.ms=1000, security.protocol=SSL, enable.auto.commit=true, ssl.truststore.location=C:/test1.jks, ssl.keystore.password=pass123, ssl.key.password=pass123, ssl.truststore.password=pass123, session.timeout.ms=30000, auto.offset.reset=latest}
Storm 1.0.1 consists of storm-kafka-client in beta quality. We have fixed few issues and more stable version is available in Storm 1.1 release and can be used against Kafka 0.10 onwards.
In your topology you can make dependency against storm-kafka-client version 1.1 and kafka-clients dependency with appropriate version. You don't need to upgrade storm cluster itself.
I had enable.auto.commit=true making the value as false resolved the issue for me.

Apache Ignite Kafka connection issues

I'm trying to do stream processing and CEP on a Kafka message stream. For this I picked Apache Ignite to realise a prototype first. However I cannot connect to the queue:
Use
kafka_2.11-0.10.1.0
apache-ignite-fabric-1.8.0-bin
bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
Kafka works properly, I tested it with a consumer.
Then I start ignite, then I run following in a spring boot commandline app.
KafkaStreamer<String, String, String> kafkaStreamer = new KafkaStreamer<>();
Ignition.setClientMode(true);
Ignite ignite = Ignition.start();
Properties settings = new Properties();
// Set a few key parameters
settings.put("bootstrap.servers", "localhost:9092");
settings.put("group.id", "test");
settings.put("zookeeper.connect", "localhost:2181");
settings.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
settings.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
settings.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
settings.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
// Create an instance of StreamsConfig from the Properties instance
kafka.consumer.ConsumerConfig config = new ConsumerConfig(settings);
IgniteCache<String, String> cache = ignite.getOrCreateCache("myCache");
try (IgniteDataStreamer<String, String> stmr = ignite.dataStreamer("myCache")) {
// allow overwriting cache data
stmr.allowOverwrite(true);
kafkaStreamer.setIgnite(ignite);
kafkaStreamer.setStreamer(stmr);
// set the topic
kafkaStreamer.setTopic("test");
// set the number of threads to process Kafka streams
kafkaStreamer.setThreads(1);
// set Kafka consumer configurations
kafkaStreamer.setConsumerConfig(config);
// set decoders
StringDecoder keyDecoder = new StringDecoder(null);
StringDecoder valueDecoder = new StringDecoder(null);
kafkaStreamer.setKeyDecoder(keyDecoder);
kafkaStreamer.setValueDecoder(valueDecoder);
kafkaStreamer.start();
} finally {
kafkaStreamer.stop();
}
When the application starts I get
2017-02-23 10:25:23.409 WARN 1388 --- [ main] kafka.utils.VerifiableProperties : Property bootstrap.servers is not valid
2017-02-23 10:25:23.410 INFO 1388 --- [ main] kafka.utils.VerifiableProperties : Property group.id is overridden to test
2017-02-23 10:25:23.410 WARN 1388 --- [ main] kafka.utils.VerifiableProperties : Property key.deserializer is not valid
2017-02-23 10:25:23.411 WARN 1388 --- [ main] kafka.utils.VerifiableProperties : Property key.serializer is not valid
2017-02-23 10:25:23.411 WARN 1388 --- [ main] kafka.utils.VerifiableProperties : Property value.deserializer is not valid
2017-02-23 10:25:23.411 WARN 1388 --- [ main] kafka.utils.VerifiableProperties : Property value.serializer is not valid
2017-02-23 10:25:23.411 INFO 1388 --- [ main] kafka.utils.VerifiableProperties : Property zookeeper.connect is overridden to localhost:2181
Then
2017-02-23 10:25:24.057 WARN 1388 --- [r-finder-thread] kafka.client.ClientUtils$ : Fetching topic metadata with correlation id 0 for topics [Set(test)] from broker [BrokerEndPoint(0,user.local,9092)] failed
java.nio.channels.ClosedChannelException: null
at kafka.network.BlockingChannel.send(BlockingChannel.scala:110) ~[kafka_2.11-0.10.0.1.jar:na]
at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:80) ~[kafka_2.11-0.10.0.1.jar:na]
at kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:79) ~[kafka_2.11-0.10.0.1.jar:na]
at kafka.producer.SyncProducer.send(SyncProducer.scala:124) ~[kafka_2.11-0.10.0.1.jar:na]
at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:59) [kafka_2.11-0.10.0.1.jar:na]
at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:94) [kafka_2.11-0.10.0.1.jar:na]
at kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(ConsumerFetcherManager.scala:66) [kafka_2.11-0.10.0.1.jar:na]
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63) [kafka_2.11-0.10.0.1.jar:na]
And reading from the queue doesn't work.
Does anyone have an idea how to fix this?
Edit: If I comment the contents of the finally block then following error comes
[2m2017-02-27 16:42:27.780[0;39m [31mERROR[0;39m [35m29946[0;39m [2m---[0;39m [2m[pool-3-thread-1][0;39m [36m [0;39m [2m:[0;39m Message is ignored due to an error [msg=MessageAndMetadata(test,0,Message(magic = 1, attributes = 0, CreateTime = -1, crc = 2558126716, key = java.nio.HeapByteBuffer[pos=0 lim=1 cap=79], payload = java.nio.HeapByteBuffer[pos=0 lim=74 cap=74]),15941704,kafka.serializer.StringDecoder#74a96647,kafka.serializer.StringDecoder#42849d34,-1,CreateTime)]
java.lang.IllegalStateException: Data streamer has been closed.
at org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl.enterBusy(DataStreamerImpl.java:401) ~[ignite-core-1.8.0.jar:1.8.0]
at org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl.addDataInternal(DataStreamerImpl.java:613) ~[ignite-core-1.8.0.jar:1.8.0]
at org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl.addData(DataStreamerImpl.java:667) ~[ignite-core-1.8.0.jar:1.8.0]
at org.apache.ignite.stream.kafka.KafkaStreamer$1.run(KafkaStreamer.java:180) ~[ignite-kafka-1.8.0.jar:1.8.0]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_111]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_111]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_111]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_111]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]
Thanks!
I think this happens because KafkaStreamer is getting closed right after it's started (kafkaStreamer.stop() call in finally block). kafkaStreamer.start() is not synchronous, it just spins out threads to consume from Kafka and exits.

Kafka consumer java code not working

I have Just started with Kafka, I am able to produce and consume data through command prompt and also produce data through Java code even from remote server.
But I am trying this simple consumer Java code it is not working.
public class Simpleconsumer {
private final ConsumerConnector consumer;
private final String topic;
public Simpleconsumer(String topic) {
Properties props = new Properties();
props.put("zookeeper.connect", "127.0.0.1:2181");
props.put("group.id", "topic1");
props.put("auto.offset.reset", "smallest");
consumer = Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
this.topic = topic;
}
public void testConsumer() {
try{
Map<String, Integer> topicCount = new HashMap<String, Integer>();
topicCount.put(topic, 1);
Map<String, List<KafkaStream<byte[], byte[]>>> consumerStreams = consumer.createMessageStreams(topicCount);
List<KafkaStream<byte[], byte[]>> streams = consumerStreams.get(topic);
System.out.println("start.......");
for (final KafkaStream stream : streams) {
ConsumerIterator<byte[], byte[]> it = stream.iterator();
System.out.println("iterate.......");
while (it.hasNext()) {
System.out.println("Message from Single Topic: " + new String(it.next().message()));
}
}
System.out.println("end.......");
if (consumer != null) {
consumer.shutdown();
}
}
catch(Exception e)
{
System.out.println(e);
}
}
public static void main(String[] args) {
// String topic = args[0];
Simpleconsumer simpleHLConsumer = new Simpleconsumer("topic1");
simpleHLConsumer.testConsumer();
}
}
Output is :-
log4j:WARN No appenders could be found for logger (kafka.utils.VerifiableProperties).
log4j:WARN Please initialize the log4j system properly.
start.......
iterate.......
There is no error, the programme doesn't terminate or give any output!!
Log of zookeeper
2016-02-18 17:31:31,790 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory#197] - Accepted socket connection from /127.0.0.1:33338
2016-02-18 17:31:31,793 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer#868] - Client attempting to establish new session at /127.0.0.1:33338
2016-02-18 17:31:31,821 [myid:] - INFO [SyncThread:0:ZooKeeperServer#617] - Established session 0x152f4265b0b0009 with negotiated timeout 6000 for client /127.0.0.1:33338
2016-02-18 17:31:31,891 [myid:] - INFO [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor#645] - Got user-level KeeperException when processing sessionid:0x152f4265b0b0009 type:create cxid:0x1 zxid:0x718 txntype:-1 reqpath:n/a Error Path:/consumers Error:KeeperErrorCode = NodeExists for /consumers
2016-02-18 17:31:31,892 [myid:] - INFO [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor#645] - Got user-level KeeperException when processing sessionid:0x152f4265b0b0009 type:create cxid:0x2 zxid:0x719 txntype:-1 reqpath:n/a Error Path:/consumers/artinew Error:KeeperErrorCode = NodeExists for /consumers/artinew
2016-02-18 17:31:31,892 [myid:] - INFO [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor#645] - Got user-level KeeperException when processing sessionid:0x152f4265b0b0009 type:create cxid:0x3 zxid:0x71a txntype:-1 reqpath:n/a Error Path:/consumers/artinew/ids Error:KeeperErrorCode = NodeExists for /consumers/artinew/ids
2016-02-18 17:31:32,000 [myid:] - INFO [SessionTracker:ZooKeeperServer#347] - Expiring session 0x152f4265b0b0008, timeout of 6000ms exceeded
2016-02-18 17:31:32,000 [myid:] - INFO [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor#494] - Processed session termination for sessionid: 0x152f4265b0b0008
2016-02-18 17:31:32,002 [myid:] - INFO [SyncThread:0:NIOServerCnxn#1007] - Closed socket connection for client /127.0.0.1:33337 which had sessionid 0x152f4265b0b0
I am getting this in Kafka console in infinite loop. please explain
[2016-02-17 20:50:08,594] INFO Closing socket connection to /xx.xx.xx.xx. (kafka.network.Processor)
[2016-02-17 20:50:08,174] INFO Closing socket connection to /xx.xx.xx.xx. (kafka.network.Processor)
[2016-02-17 20:50:08,385] INFO Closing socket connection to /xx.xx.xx.xx. (kafka.network.Processor)
[2016-02-17 20:50:08,760] INFO Closing socket connection to /xx.xx.xx.xx. (kafka.network.Processor)
i created the topic in the following manner
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 5 --topic topic1
i am able to consume it in command prompt using
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --from-beginning --topic topic1
Not able to understand what is the issue.
Try localhost instead of 127.0.0.1 in code to make sure local resolution is working fine.