I tried a simple code sample to test access to a "kerberized" Kafka from Quarkus 2.2.2 with smallrye-reactive-messaging-kafka:
package org.acme;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.eclipse.microprofile.reactive.messaging.Incoming;

import javax.enterprise.context.ApplicationScoped;

@ApplicationScoped
public class MyTopicConsumer {

    @Incoming("in")
    public void consume(ConsumerRecord<String, String> record) {
        System.out.println("read from Kafka: " + record.value());
    }
}
Kafka is behind Kerberos, so I used an application.properties like this:
quarkus.ssl.native=true
quarkus.native.enable-all-security-services=true
mp.messaging.incoming.in.group.id=my-group
mp.messaging.incoming.in.auto.commit.interval.ms=1000
mp.messaging.incoming.in.security.protocol=SASL_SSL
mp.messaging.incoming.in.sasl.kerberos.service.name=kafka
mp.messaging.incoming.in.sasl.mechanism=GSSAPI
mp.messaging.incoming.in.sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required doNotPrompt=true useKeyTab=true storeKey=true serviceName="kafka" keyTab="<keytab>" principal="<principal>" useTicketCache=false;
mp.messaging.incoming.in.ssl.truststore.location=<location>
mp.messaging.incoming.in.ssl.truststore.password=<password>
mp.messaging.incoming.in.connector=smallrye-kafka
mp.messaging.incoming.in.topic=<topic>
mp.messaging.incoming.in.auto.offset.reset=earliest
mp.messaging.incoming.in.enable.auto.commit=false
mp.messaging.incoming.in.bootstrap.servers=<list of servers>
It works nicely in JVM mode, but fails in native mode (graalvm-ce-java11-21.2.0) with this error:
ERROR [io.sma.rea.mes.provider] (main) SRMSG00230: Unable to create the publisher or subscriber during initialization: org.apache.kafka.common.KafkaException: Failed to construct kafka consumer
at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:823)
at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:665)
at io.smallrye.reactive.messaging.kafka.impl.ReactiveKafkaConsumer.<init>(ReactiveKafkaConsumer.java:80)
at io.smallrye.reactive.messaging.kafka.impl.KafkaSource.<init>(KafkaSource.java:85)
at io.smallrye.reactive.messaging.kafka.KafkaConnector.getPublisherBuilder(KafkaConnector.java:182)
at io.smallrye.reactive.messaging.kafka.KafkaConnector_ClientProxy.getPublisherBuilder(KafkaConnector_ClientProxy.zig:159)
at io.smallrye.reactive.messaging.impl.ConfiguredChannelFactory.createPublisherBuilder(ConfiguredChannelFactory.java:190)
at io.smallrye.reactive.messaging.impl.ConfiguredChannelFactory.register(ConfiguredChannelFactory.java:153)
at io.smallrye.reactive.messaging.impl.ConfiguredChannelFactory.initialize(ConfiguredChannelFactory.java:125)
at io.smallrye.reactive.messaging.impl.ConfiguredChannelFactory_ClientProxy.initialize(ConfiguredChannelFactory_ClientProxy.zig:189)
at java.util.Iterator.forEachRemaining(Iterator.java:133)
at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:658)
at io.smallrye.reactive.messaging.extension.MediatorManager.start(MediatorManager.java:189)
at io.smallrye.reactive.messaging.extension.MediatorManager_ClientProxy.start(MediatorManager_ClientProxy.zig:220)
at io.quarkus.smallrye.reactivemessaging.runtime.SmallRyeReactiveMessagingLifecycle.onApplicationStart(SmallRyeReactiveMessagingLifecycle.java:41)
at io.quarkus.smallrye.reactivemessaging.runtime.SmallRyeReactiveMessagingLifecycle_Observer_onApplicationStart_4e8937813d9e8faff65c3c07f88fa96615b70e70.notify(SmallRyeReactiveMessagingLifecycle_Observer_onApplicationStart_4e8937813d9e8faff65c3c07f88fa96615b70e70.zig:111)
at io.quarkus.arc.impl.EventImpl$Notifier.notifyObservers(EventImpl.java:300)
at io.quarkus.arc.impl.EventImpl$Notifier.notify(EventImpl.java:282)
at io.quarkus.arc.impl.EventImpl.fire(EventImpl.java:70)
at io.quarkus.arc.runtime.ArcRecorder.fireLifecycleEvent(ArcRecorder.java:128)
at io.quarkus.arc.runtime.ArcRecorder.handleLifecycleEvents(ArcRecorder.java:97)
at io.quarkus.deployment.steps.LifecycleEventsBuildStep$startupEvent1144526294.deploy_0(LifecycleEventsBuildStep$startupEvent1144526294.zig:87)
at io.quarkus.deployment.steps.LifecycleEventsBuildStep$startupEvent1144526294.deploy(LifecycleEventsBuildStep$startupEvent1144526294.zig:40)
at io.quarkus.runner.ApplicationImpl.doStart(ApplicationImpl.zig:623)
at io.quarkus.runtime.Application.start(Application.java:101)
at io.quarkus.runtime.ApplicationLifecycleManager.run(ApplicationLifecycleManager.java:101)
at io.quarkus.runtime.Quarkus.run(Quarkus.java:66)
at io.quarkus.runtime.Quarkus.run(Quarkus.java:42)
at io.quarkus.runtime.Quarkus.run(Quarkus.java:119)
at io.quarkus.runner.GeneratedMain.main(GeneratedMain.zig:29)
Caused by: org.apache.kafka.common.KafkaException: org.apache.kafka.common.KafkaException: Could not find a public no-argument constructor for org.apache.kafka.common.security.kerberos.KerberosLogin
at org.apache.kafka.common.network.SaslChannelBuilder.configure(SaslChannelBuilder.java:184)
at org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:192)
at org.apache.kafka.common.network.ChannelBuilders.clientChannelBuilder(ChannelBuilders.java:81)
at org.apache.kafka.clients.ClientUtils.createChannelBuilder(ClientUtils.java:105)
at org.apache.kafka.clients.consumer.KafkaConsumer.<init>(KafkaConsumer.java:737)
... 30 more
I tried a few changes suggested by some posts, but with no effect.
Can anyone suggest a fix or workaround?
Thanks
It seems that the native image doesn't contain the no-argument constructor for org.apache.kafka.common.security.kerberos.KerberosLogin.
Have you tried registering the class for reflection, as described in https://quarkus.io/guides/writing-native-applications-tips#registering-for-reflection ?
You may need to add this line to your configuration file:
quarkus.native.additional-build-args=-H:ReflectionConfigurationFiles=reflection-config.json
and add the class org.apache.kafka.common.security.kerberos.KerberosLogin to reflection-config.json, as described here: https://quarkus.io/guides/writing-native-applications-tips#using-a-configuration-file
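For reference, a minimal reflection-config.json registering that class could look like this (the exact set of reflection flags you need may vary):

[
  {
    "name" : "org.apache.kafka.common.security.kerberos.KerberosLogin",
    "allDeclaredConstructors" : true,
    "allDeclaredMethods" : true
  }
]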
I have an Apache Flink application that reads data from topic v01, filters it by country, and writes the filtered data to topic v02. For testing purposes I tried to write everything in uppercase first.
My Code:
package org.example;
import org.apache.avro.Schema;
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.formats.avro.registry.confluent.ConfluentRegistryAvroDeserializationSchema;
import org.apache.flink.formats.avro.registry.confluent.ConfluentRegistryAvroSerializationSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;
import java.util.Properties;
public class KafkaRead {
public static void main(String[] args) throws Exception {
Properties properties = new Properties();
properties.setProperty("bootstrap.servers", "192.168.100.100:9092");
properties.setProperty("group.id", "luft");
String schemaRegistryUrl = "http://192.168.100.100:8081";
String valueSchema = "{\"connect.name\":\"com.github.jcustenborder.kafka.connect.model.Value\",\"fields\":[{\"default\":null,\"name\":\"Datum\",\"type\":[\"null\",\"string\"]},{\"default\":null,\"name\":\"Country\",\"type\":[\"null\",\"string\"]},{\"default\":null,\"name\":\"City\",\"type\":[\"null\",\"string\"]},{\"default\":null,\"name\":\"Specie\",\"type\":[\"null\",\"string\"]},{\"default\":null,\"name\":\"count\",\"type\":[\"null\",\"string\"]},{\"default\":null,\"name\":\"min\",\"type\":[\"null\",\"string\"]},{\"default\":null,\"name\":\"max\",\"type\":[\"null\",\"string\"]},{\"default\":null,\"name\":\"median\",\"type\":[\"null\",\"string\"]},{\"default\":null,\"name\":\"variance\",\"type\":[\"null\",\"string\"]}],\"name\":\"Value\",\"namespace\":\"com.github.jcustenborder.kafka.connect.model\",\"type\":\"record\"}";
Schema schema = new Schema.Parser().parse(valueSchema);
final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
FlinkKafkaConsumer kafkaConsumer = new FlinkKafkaConsumer<> ("v01", ConfluentRegistryAvroDeserializationSchema.forGeneric(schema, schemaRegistryUrl) , properties);
kafkaConsumer.setStartFromEarliest();
DataStream<String> streamIn = env.addSource(kafkaConsumer);
FlinkKafkaProducer kafkaProducer = new FlinkKafkaProducer("v02",ConfluentRegistryAvroSerializationSchema.forGeneric("v02-value",schema,schemaRegistryUrl),properties);
DataStream<String> streamOut = streamIn.map(new MapFunction<String, String>() { // <-- Error in this line
@Override
public String map(String value) throws Exception {
return value.toUpperCase();
}
});
streamOut.addSink(kafkaProducer);
env.execute("Flink Streaming In/Out Kafka");
}
}
When executing, I get the following error:
Caused by: java.lang.ClassCastException: org.apache.avro.generic.GenericData$Record cannot be cast to java.lang.String
at org.example.KafkaRead$1.map(KafkaRead.java:45)
It works without problems if I don't use a map function and just use my input as my output:
DataStream<String> streamOut = streamIn;
For context:
The data that gets read initially looks like this:
Date,Country,City,Specie,count,min,max,median,variance
2020-03-24,DE,Hamburg,humidity,288,26.0,54.0,36.5,966.48
2020-03-26,DE,Hamburg,humidity,288,25.0,71.5,44.0,1847.14
2020-04-01,DE,Hamburg,humidity,288,61.0,83.0,75.0,418.07
The CSV files are read into Kafka via the SpoolDirCsvSourceConnector; the connector automatically generates a schema for topic v01, which is saved as Avro. In the next step I want to analyse the data with filters inside Flink, for example filtering by country, and after that write it back to v02.
Complete Error:
Exception in thread "main" org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
at org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:147)
at org.apache.flink.runtime.minicluster.MiniClusterJobClient.lambda$getJobExecutionResult$2(MiniClusterJobClient.java:119)
at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616)
at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
at org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$0(AkkaInvocationHandler.java:229)
at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
at org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:996)
at akka.dispatch.OnComplete.internal(Future.scala:264)
at akka.dispatch.OnComplete.internal(Future.scala:261)
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:191)
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:188)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
at org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:74)
at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:44)
at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:252)
at akka.pattern.PromiseActorRef.$bang(AskSupport.scala:572)
at akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:22)
at akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:21)
at scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:436)
at scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:435)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:44)
at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: org.apache.flink.runtime.JobException: Recovery is suppressed by NoRestartBackoffTimeStrategy
at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:116)
at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:78)
at org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskFailure(DefaultScheduler.java:224)
at org.apache.flink.runtime.scheduler.DefaultScheduler.maybeHandleTaskFailure(DefaultScheduler.java:217)
at org.apache.flink.runtime.scheduler.DefaultScheduler.updateTaskExecutionStateInternal(DefaultScheduler.java:208)
at org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:610)
at org.apache.flink.runtime.scheduler.SchedulerNG.updateTaskExecutionState(SchedulerNG.java:89)
at org.apache.flink.runtime.jobmaster.JobMaster.updateTaskExecutionState(JobMaster.java:419)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:286)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:201)
at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74)
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:154)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
at akka.actor.ActorCell.invoke(ActorCell.scala:561)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
at akka.dispatch.Mailbox.run(Mailbox.scala:225)
at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
... 4 more
Caused by: java.lang.ClassCastException: org.apache.avro.generic.GenericData$Record cannot be cast to java.lang.String
at org.example.KafkaRead$1.map(KafkaRead.java:45)
at org.apache.flink.streaming.api.operators.StreamMap.processElement(StreamMap.java:41)
at org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.pushToOperator(CopyingChainingOutput.java:71)
at org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:46)
at org.apache.flink.streaming.runtime.tasks.CopyingChainingOutput.collect(CopyingChainingOutput.java:26)
at org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:52)
at org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:30)
at org.apache.flink.streaming.api.operators.StreamSourceContexts$ManualWatermarkContext.processAndCollectWithTimestamp(StreamSourceContexts.java:310)
at org.apache.flink.streaming.api.operators.StreamSourceContexts$WatermarkContext.collectWithTimestamp(StreamSourceContexts.java:409)
at org.apache.flink.streaming.connectors.kafka.internals.AbstractFetcher.emitRecordsWithTimestamps(AbstractFetcher.java:352)
at org.apache.flink.streaming.connectors.kafka.internals.KafkaFetcher.partitionConsumerRecordsHandler(KafkaFetcher.java:181)
at org.apache.flink.streaming.connectors.kafka.internals.KafkaFetcher.runFetchLoop(KafkaFetcher.java:137)
at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.run(FlinkKafkaConsumerBase.java:761)
at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:100)
at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:63)
at org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.run(SourceStreamTask.java:215)
Just to extend the comment that has been added: if you use ConfluentRegistryAvroDeserializationSchema.forGeneric, the data produced by the consumer isn't really String but rather GenericRecord.
So the moment you try to use it in your map that expects String, it will fail, because your DataStream is not DataStream<String> but rather DataStream<GenericRecord>.
It only works when you remove the map because you haven't specified the type when defining your FlinkKafkaConsumer and FlinkKafkaProducer, so Java will just try to cast every object to the required type. Your FlinkKafkaProducer is actually a FlinkKafkaProducer<GenericRecord>, so there is no problem there, and it works as it should.
In this particular case you don't seem to need Avro at all, since the data is just raw CSV.
UPDATE:
It seems you are actually processing Avro; in that case you need to change the type of your DataStream<String> to DataStream<GenericRecord>, and all the functions you write will work with GenericRecord, not String.
So you need something like:
.map(new MapFunction<GenericRecord, T>() { ... })
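For example, here is a minimal sketch of the typed pipeline with a country filter (the field name and value come from your sample data above; the rest is an assumption about your setup, not tested code):

import org.apache.avro.generic.GenericRecord;

FlinkKafkaConsumer<GenericRecord> kafkaConsumer = new FlinkKafkaConsumer<>(
        "v01",
        ConfluentRegistryAvroDeserializationSchema.forGeneric(schema, schemaRegistryUrl),
        properties);
kafkaConsumer.setStartFromEarliest();

DataStream<GenericRecord> streamIn = env.addSource(kafkaConsumer);

// GenericRecord.get() returns Object (Avro strings are often Utf8), so compare via toString()
DataStream<GenericRecord> streamOut = streamIn.filter(record ->
        record.get("Country") != null && "DE".equals(record.get("Country").toString()));

streamOut.addSink(kafkaProducer);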
I am developing a Spring Boot server with Spring Kafka (1.3.2.RELEASE), Apache Avro (1.8.2) and Confluent's Schema Registry (3.1.2). Every time the Kafka listener gets a message, it finds the schema id in the message and fetches the Avro schema from the registry server by id. The problem is, if the Schema Registry server is down, my listener keeps sending HTTP requests to the registry server for every message it gets (and prints a large amount of error logs), and it blocks all the following Kafka messages since the offset won't move on.
16:56:41.541 ERROR KafkaMessageListenerContainer$ListenerConsumer - - org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1 - Container exception
org.apache.kafka.common.errors.SerializationException: Error deserializing key/value for partition trade-0 at offset 810845
Caused by: org.apache.kafka.common.errors.SerializationException: Error deserializing Avro message for id 21
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at java.net.Socket.connect(Socket.java:538)
at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:242)
at sun.net.www.http.HttpClient.New(HttpClient.java:339)
at sun.net.www.http.HttpClient.New(HttpClient.java:357)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1202)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1138)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1032)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:966)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1546)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1474)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
at io.confluent.kafka.schemaregistry.client.rest.RestService.sendHttpRequest(RestService.java:153)
at io.confluent.kafka.schemaregistry.client.rest.RestService.httpRequest(RestService.java:187)
at io.confluent.kafka.schemaregistry.client.rest.RestService.getId(RestService.java:323)
at io.confluent.kafka.schemaregistry.client.rest.RestService.getId(RestService.java:316)
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.getSchemaByIdFromRegistry(CachedSchemaRegistryClient.java:63)
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.getBySubjectAndID(CachedSchemaRegistryClient.java:118)
at io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer.deserialize(AbstractKafkaAvroDeserializer.java:121)
at io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer.deserialize(AbstractKafkaAvroDeserializer.java:92)
at io.confluent.kafka.serializers.KafkaAvroDeserializer.deserialize(KafkaAvroDeserializer.java:54)
at org.apache.kafka.common.serialization.ExtendedDeserializer$Wrapper.deserialize(ExtendedDeserializer.java:65)
at org.apache.kafka.common.serialization.ExtendedDeserializer$Wrapper.deserialize(ExtendedDeserializer.java:55)
at org.apache.kafka.clients.consumer.internals.Fetcher.parseRecord(Fetcher.java:918)
at org.apache.kafka.clients.consumer.internals.Fetcher.access$2600(Fetcher.java:93)
at org.apache.kafka.clients.consumer.internals.Fetcher$PartitionRecords.fetchRecords(Fetcher.java:1095)
at org.apache.kafka.clients.consumer.internals.Fetcher$PartitionRecords.access$1200(Fetcher.java:944)
at org.apache.kafka.clients.consumer.internals.Fetcher.fetchRecords(Fetcher.java:567)
at org.apache.kafka.clients.consumer.internals.Fetcher.fetchedRecords(Fetcher.java:528)
at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1086)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1043)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.run(KafkaMessageListenerContainer.java:614)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)
I have tried to use a RetryTemplate to set the max attempts, but it didn't work; it seems the RetryTemplate only applies inside my listener method. I also didn't find any helpful config on Confluent's website.
For now, I have replaced the KafkaAvroDeserializer with a CustomAvroDeserializer, which extends KafkaAvroDeserializer and overrides its deserialize method, wrapping the body in a try-catch, like this:
@Log4j
public class CustomAvroDeserializer extends KafkaAvroDeserializer {

    @Override
    public Object deserialize(String s, byte[] bytes) {
        try {
            return this.deserialize(bytes);
        } catch (Exception e) {
            log.error("encountered a problem when deserializing a message with the schema registry", e);
            return null;
        }
    }
}
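To wire it in, the consumer is pointed at the custom class instead of the Confluent one; with Spring Boot properties that could look like this (the package name is an assumption):

spring.kafka.consumer.value-deserializer=com.example.CustomAvroDeserializer

Note that returning null means the listener receives a null payload for every record that fails to deserialize, but the offset moves on instead of blocking the partition.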
I'm trying to do a PoC of the "exactly-once delivery" concept with Apache Kafka using Spring Cloud Stream + the Kafka binder.
I installed Apache Kafka ("kafka_2.11-1.0.0") and defined "transactionIdPrefix" in the producer, which I understand is the only thing I need to do to enable transactions in Spring Kafka. But when I do that and run simple Source & Sink bindings within the same application, I see that some messages are received and printed in the consumer and some fail with an error.
For example, message #6 was received:
[49] Received message [Payload String content=FromSource1 6][Headers={kafka_offset=1957, scst_nativeHeadersPresent=true, kafka_consumer=org.apache.kafka.clients.consumer.KafkaConsumer#6695c9a9, kafka_timestampType=CREATE_TIME, my-transaction-id=my-id-6, id=302cf3ef-a154-fd42-6b43-983778e275dc, kafka_receivedPartitionId=0, contentType=application/json, kafka_receivedTopic=test10, kafka_receivedTimestamp=1514384106395, timestamp=1514384106419}]
but message #7 had an error "Invalid transition attempted from state IN_TRANSACTION to state IN_TRANSACTION":
2017-12-27 16:15:07.405 ERROR 7731 --- [ask-scheduler-4] o.s.integration.handler.LoggingHandler : org.springframework.messaging.MessageHandlingException: error occurred in message handler [org.springframework.cloud.stream.binder.kafka.KafkaMessageChannelBinder$ProducerConfigurationMessageHandler#7d3bbc0b]; nested exception is org.apache.kafka.common.KafkaException: TransactionalId my-transaction-3: Invalid transition attempted from state IN_TRANSACTION to state IN_TRANSACTION, failedMessage=GenericMessage [payload=byte[13], headers={my-transaction-id=my-id-7, id=d31656af-3286-99b0-c736-d53aa57a5e65, contentType=application/json, timestamp=1514384107399}]
at org.springframework.integration.handler.AbstractMessageHandler.handleMessage(AbstractMessageHandler.java:153)
at org.springframework.cloud.stream.binder.AbstractMessageChannelBinder$SendingHandler.handleMessageInternal(AbstractMessageChannelBinder.java:575)
What does this error mean?
Is something missing from my configuration?
Do I need to implement the Source or the Sink differently when transactions are enabled?
UPDATE:
I opened an issue on the project's GitHub; please refer to the discussion there.
I couldn't find an example of how to use Spring Cloud Stream with the Kafka binder with transactions enabled.
To reproduce, create a simple Maven project with Spring Boot version "2.0.0.M5" and "spring-cloud-stream-dependencies" version "Elmhurst.M3", and create a simple application with this configuration:
server:
  port: 8082
spring:
  kafka:
    producer:
      retries: 5555
      acks: "all"
  cloud:
    stream:
      kafka:
        binder:
          autoAddPartitions: true
          transaction:
            transactionIdPrefix: my-transaction-
      bindings:
        output1:
          destination: test10
          group: test111
          binder: kafka
        input1:
          destination: test10
          group: test111
          binder: kafka
          consumer:
            partitioned: true
I also created simple Source and Sink classes:
@EnableBinding(SampleSink.MultiInputSink.class)
public class SampleSink {

    @StreamListener(MultiInputSink.INPUT1)
    public synchronized void receive1(Message<?> message) {
        System.out.println("[" + Thread.currentThread().getId() + "] Received message " + message);
    }

    public interface MultiInputSink {
        String INPUT1 = "input1";

        @Input(INPUT1)
        SubscribableChannel input1();
    }
}
and:
@EnableBinding(SampleSource.MultiOutputSource.class)
public class SampleSource {

    AtomicInteger atomicInteger = new AtomicInteger(1);

    @Bean
    @InboundChannelAdapter(value = MultiOutputSource.OUTPUT1, poller = @Poller(fixedDelay = "1000", maxMessagesPerPoll = "1"))
    public synchronized MessageSource<String> messageSource1() {
        return new MessageSource<String>() {
            public Message<String> receive() {
                String message = "FromSource1 " + atomicInteger.getAndIncrement();
                Map<String, Object> m = new HashMap<>();
                m.put("my-transaction-id", "my-id-" + UUID.randomUUID());
                return new GenericMessage<>(message, new MessageHeaders(m));
            }
        };
    }

    public interface MultiOutputSource {
        String OUTPUT1 = "output1";

        @Output(OUTPUT1)
        MessageChannel output1();
    }
}
I opened a ticket about this on the project's GitHub.
Please refer to the answers and discussion there:
https://github.com/spring-cloud/spring-cloud-stream/issues/1166
The first answer there was:
The binder doesn't currently support producer-initiated transactions.
Transactions are supported for processors (where the consumer starts
the transaction and the producer participates in that transaction).
You should be able to use spring-kafka directly to initiate a
transaction on the producer side when there is no consumer.
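For completeness, here is a minimal sketch of what a producer-initiated transaction looks like with plain spring-kafka, as that answer suggests (the template wiring and the topic name are assumptions, not the binder's API):

// The ProducerFactory must be configured with a transactionIdPrefix for this to work.
KafkaTemplate<String, String> template = new KafkaTemplate<>(producerFactory);
// send() runs inside a transaction that is committed when the callback returns
template.executeInTransaction(t -> t.send("test10", "some payload"));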
I am using Spark Streaming v2.0.0 and Kafka v0.8.2.1.
I am trying to connect Spark Streaming to a remote Kafka server using Eclipse.
To create the topic I used the following guide: http://kafka.apache.org/documentation.html#quickstart
I used the example provided with Spark Streaming to write the following code:
package sparkstreamingtest;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.regex.Pattern;
import kafka.serializer.StringDecoder;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;
import scala.Tuple2;
public class test1 {
public static void main(String[] args) throws InterruptedException {
String brokers = "10.66.125.130:9092";//args[0]; // broker host:port
String topics = "test2";//args[1]; // Kafka topics: topic1,topic2,...
// Create context with a 2 seconds batch interval
SparkConf sparkConf = new SparkConf().setAppName("JavaDirectKafkaWordCount").setMaster("local[*]");
JavaStreamingContext jssc = new JavaStreamingContext(sparkConf, Durations.seconds(2));
HashSet<String> topicsSet = new HashSet<String>(Arrays.asList(topics.split(",")));// split the topics into a set
System.out.println(topicsSet);
HashMap<String, String> kafkaParams = new HashMap<String, String>();
kafkaParams.put("metadata.broker.list", brokers);// same for the broker hosts
System.out.println(kafkaParams);
// Create direct kafka stream with brokers and topics
JavaPairInputDStream<String, String> messages = KafkaUtils.createDirectStream(
jssc,
String.class,
String.class,
StringDecoder.class,
StringDecoder.class,
kafkaParams,
topicsSet
);
// Get the lines from the Kafka messages
JavaDStream<String> lines = messages.map(new Function<Tuple2<String, String>, String>() {
@Override
public String call(Tuple2<String, String> tuple2) {
return tuple2._2();
}
});
// Start the computation
jssc.start();
jssc.awaitTermination();
}
}
I am getting the following error that I don't understand:
16/08/19 10:53:58 INFO SimpleConsumer: Reconnect due to socket error: java.nio.channels.ClosedChannelException
Exception in thread "main" org.apache.spark.SparkException: java.nio.channels.ClosedChannelException
org.apache.spark.SparkException: Couldn't find leader offsets for Set([test2,0])
at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$checkErrors$1.apply(KafkaCluster.scala:373)
at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$checkErrors$1.apply(KafkaCluster.scala:373)
at scala.util.Either.fold(Either.scala:98)
at org.apache.spark.streaming.kafka.KafkaCluster$.checkErrors(KafkaCluster.scala:372)
at org.apache.spark.streaming.kafka.KafkaUtils$.getFromOffsets(KafkaUtils.scala:222)
at org.apache.spark.streaming.kafka.KafkaUtils$.createDirectStream(KafkaUtils.scala:484)
at org.apache.spark.streaming.kafka.KafkaUtils$.createDirectStream(KafkaUtils.scala:607)
at org.apache.spark.streaming.kafka.KafkaUtils.createDirectStream(KafkaUtils.scala)
at sparkstreamingtest.test1.main(test1.java:64)
16/08/19 10:53:59 INFO SparkContext: Invoking stop() from shutdown hook
I haven't managed to solve it, even though I followed these earlier Stack Overflow questions:
Spark Streaming + Kafka: SparkException: Couldn't find leader offsets for Set
Exception in thread "main" org.apache.spark.SparkException: org.apache.spark.SparkException: Couldn't find leaders for Set()-Spark Steaming-kafka
Cannot connect from Spark Streaming to Kafka: org.apache.spark.SparkException: java.net.SocketTimeoutException
How to manually commit offset in Spark Kafka direct streaming?
Can anyone help me with this issue?
Thank you for your attention and your help.
Joe