Kafka and Pinot -- Message not sent - apache-kafka

I am trying to ingres data through kafka and json file and running this command:
bin/kafka-console-producer.sh --broker-list localhost:19092 --topic mytopic < $PDATA_HOME/opt_flatten_json.json
But I am getting error:
Exception while executing a state transition task mystats__0__0__20210319T0430Z
java.lang.reflect.InvocationTargetException: null
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_282]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_282]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_282]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_282]
at org.apache.helix.messaging.handling.HelixStateTransitionHandler.invoke(HelixStateTransitionHandler.java:404) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d87755899eccba3554e9cc39a1439d5ecb53aaac]
at org.apache.helix.messaging.handling.HelixStateTransitionHandler.handleMessage(HelixStateTransitionHandler.java:331) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d87755899eccba3554e9cc39a1439d5ecb53aaac]
at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:97) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d87755899eccba3554e9cc39a1439d5ecb53aaac]
at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:49) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d87755899eccba3554e9cc39a1439d5ecb53aaac]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_282]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_282]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_282]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_282]
Caused by: java.lang.OutOfMemoryError: Direct buffer memory
at java.nio.Bits.reserveMemory(Bits.java:695) ~[?:1.8.0_282]
at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) ~[?:1.8.0_282]
at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311) ~[?:1.8.0_282]
at org.apache.pinot.core.segment.memory.PinotByteBuffer.allocateDirect(PinotByteBuffer.java:39) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d87755899eccba3554e9cc39a1439d5ecb53aaac]
at org.apache.pinot.core.segment.memory.PinotDataBuffer.allocateDirect(PinotDataBuffer.java:116) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d87755899eccba3554e9cc39a1439d5ecb53aaac]
at org.apache.pinot.core.io.writer.impl.DirectMemoryManager.allocateInternal(DirectMemoryManager.java:53) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d87755899eccba3554e9cc39a1439d5ecb53aaac]
at org.apache.pinot.core.io.readerwriter.RealtimeIndexOffHeapMemoryManager.allocate(RealtimeIndexOffHeapMemoryManager.java:79) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d87755899eccba3554e9cc39a1439d5ecb53aaac]
at org.apache.pinot.core.realtime.impl.forward.FixedByteMVMutableForwardIndex.addDataBuffer(FixedByteMVMutableForwardIndex.java:162) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d87755899eccba3554e9cc39a1439d5ecb53aaac]
at org.apache.pinot.core.realtime.impl.forward.FixedByteMVMutableForwardIndex.<init>(FixedByteMVMutableForwardIndex.java:137) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d87755899eccba3554e9cc39a1439d5ecb53aaac]
at org.apache.pinot.core.indexsegment.mutable.MutableSegmentImpl.<init>(MutableSegmentImpl.java:307) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d87755899eccba3554e9cc39a1439d5ecb53aaac]
at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager.<init>(LLRealtimeSegmentDataManager.java:1270) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d87755899eccba3554e9cc39a1439d5ecb53aaac]
at org.apache.pinot.core.data.manager.realtime.RealtimeTableDataManager.addSegment(RealtimeTableDataManager.java:324) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d87755899eccba3554e9cc39a1439d5ecb53aaac]
at org.apache.pinot.server.starter.helix.HelixInstanceDataManager.addRealtimeSegment(HelixInstanceDataManager.java:132) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d87755899eccba3554e9cc39a1439d5ecb53aaac]
at org.apache.pinot.server.starter.helix.SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel.onBecomeOnlineFromOffline(SegmentOnlineOfflineStateModelFactory.java:164) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d87755899eccba3554e9cc39a1439d5ecb53aaac]
at org.apache.pinot.server.starter.helix.SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel.onBecomeConsumingFromOffline(SegmentOnlineOfflineStateModelFactory.java:88) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-d87755899eccba3554e9cc39a1439d5ecb53aaac]
... 12 more
Default rollback method invoked on error. Error Code: ERROR
Message execution failed. msgId: eed5b297-ea20-437e-a0b5-ad4d0be75c3c, errorMsg: java.lang.reflect.InvocationTargetException
Skip internal error. errCode: ERROR, errMsg: null
Event bcbad381_DEFAULT : Unable to find a next state for resource: mystats_REALTIME partition: mystats__0__0__20210319T0430Z from stateModelDefinitionclass org.apache.helix.model.StateModelDefinition from:ERROR to:CONSUMING
Event c910d226_DEFAULT : Unable to find a next state for resource: mystats_REALTIME partition: mystats__0__0__20210319T0430Z from stateModelDefinitionclass org.apache.helix.model.StateModelDefinition from:ERROR to:CONSUMING
Event d194950f_DEFAULT : Unable to find a next state for resource: mystats_REALTIME partition: mystats__0__0__20210319T0430Z from stateModelDefinitionclass org.apache.helix.model.StateModelDefinition from:ERROR to:CONSUMING

Looks like your Kafka JVM ran out of memory:
Caused by: java.lang.OutOfMemoryError: Direct buffer memory
This issue seems related to this ticket.
You should set the Environment Variable for Kafka appropriately to your memory needs: f.i. -XX:MaxDirectMemorySize=512m

Seems like your Pinot ran out of memory buffer for the realtime table segment. Extra info about the segment and realtime table config is needed to debug further.

Related

Error Pyspark :error occurred while calling None.org.apache.spark.api.java.JavaSparkContext

I know this question has been posted before, even after trying multiple solutions I still get the same error
spark = SparkSession.builder.appName('K_mean').getOrCreate()
`
An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: org.apache.spark.SparkException: Invalid Spark URL: spark://HeartbeatReceiver#HP:5
at org.apache.spark.rpc.RpcEndpointAddress$.apply(RpcEndpointAddress.scala:66)
at org.apache.spark.rpc.netty.NettyRpcEnv.asyncSetupEndpointRefByURI(NettyRpcEnv.scala:140)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:102)
at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:110)
at org.apache.spark.util.RpcUtils$.makeDriverRef(RpcUtils.scala:36)
at org.apache.spark.executor.Executor.<init>(Executor.scala:244)
at org.apache.spark.scheduler.local.LocalEndpoint.<init>(LocalSchedulerBackend.scala:64)
at org.apache.spark.scheduler.local.LocalSchedulerBackend.start(LocalSchedulerBackend.scala:132)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:222)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:585)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:238)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
at java.base/java.lang.Thread.run(Thread.java:834)
`
I tried reinstalling spark. I changed java version to 8 but still the same problem.

NiFi Writing to HDFS Error: java.lang.IllegalArgumentException: Can not create a Path from an empty string

I am facing a problem with NiFi writing to HDFS. I am getting an error:
ERROR [Timer-Driven Process Thread-10] o.apache.nifi.processors.hadoop.PutHDFS PutHDFS[id=4af43efa-a8ff-18ac-0000-00002377fba5] Failed to properly initialize Processor. If still scheduled to run, NiFi will attempt to initialize and run the Processor again after the 'Administrative Yield Duration' has elapsed. Failure is due to java.lang.reflect.InvocationTargetException: java.lang.reflect.InvocationTargetException
java.lang.reflect.InvocationTargetException: null
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:137)
at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:125)
at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotations(ReflectionUtils.java:70)
at org.apache.nifi.util.ReflectionUtils.invokeMethodsWithAnnotation(ReflectionUtils.java:47)
at org.apache.nifi.controller.StandardProcessorNode.lambda$initiateStart$1(StandardProcessorNode.java:1364)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalArgumentException: Can not create a Path from an empty string
at org.apache.hadoop.fs.Path.checkPathArg(Path.java:126)
at org.apache.hadoop.fs.Path.<init>(Path.java:134)
at org.apache.nifi.processors.hadoop.AbstractHadoopProcessor.getConfigurationFromResources(AbstractHadoopProcessor.java:225)
at org.apache.nifi.processors.hadoop.AbstractHadoopProcessor.resetHDFSResources(AbstractHadoopProcessor.java:254)
at org.apache.nifi.processors.hadoop.AbstractHadoopProcessor.abstractOnScheduled(AbstractHadoopProcessor.java:205)
... 15 common frames omitted
My HDFS configuration is:
Note: same configuration were applied on PutFile and it worked perfectly (Kafka.topic was not empty)
It seems when PutHDFS processor is trying to save the file looking for kafka.topic attribute associated with the flowfile but attribute not having any value to it.
Make sure you are having kafka.topic attribute having some value associate to it, we can use UpdataAttribute processor before PutHDFS to add the attribute.

debezium error: "option "include-unchanged-toast" = "0" is unknown"

I have a question about debezium and I was stuck for more than two weeks.
I use debezium to pull data from remote PostgreSQL every minute, and I always see the below error as output:
"option "include-unchanged-toast" = "0" is unknown".
the environment is as follows:
debezium 0.9.5 final
postgresql 10.8
kafka 2.1.1
the connector configuration is as follows:
name=diansheng340831
slot.name=diansheng340831
connector.class=io.debezium.connector.postgresql.PostgresConnector
database.hostname=*.*.*
database.port=5432
database.user=*
database.password=*
database.dbname=*
database.history.kafka.bootstrap.servers=localhost:9092
database.server.name=diansheng340831
plugin.name=wal2json
table.whitelist=*
transforms=dropFieldByValue
transforms.dropFieldByValue.type=org.apache.kafka.connect.transforms.ReplaceField$Value
transforms.dropFieldByValue.whitelist=after
max.batch.size=5120
snapshot.fetch.size=5120
database.tcpKeepAlive=true
schema.refresh.mode=columns_diff_exclude_unchanged_toast
max.queue.size=202400
snapshot.mode=never
the error is as follows:
ERROR unexpected exception while streaming logical changes (io.debezium.connector.postgresql.RecordsStreamProducer:147)
org.postgresql.util.PSQLException: ERROR: option "include-unchanged-toast" = "0" is unknown
Where: slot "guangdian10831", output plugin "wal2json", in the startup callback
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2440)
at org.postgresql.core.v3.QueryExecutorImpl.processCopyResults(QueryExecutorImpl.java:1116)
at org.postgresql.core.v3.QueryExecutorImpl.readFromCopy(QueryExecutorImpl.java:1035)
at org.postgresql.core.v3.CopyDualImpl.readFromCopy(CopyDualImpl.java:41)
at org.postgresql.core.v3.replication.V3PGReplicationStream.receiveNextData(V3PGReplicationStream.java:155)
at org.postgresql.core.v3.replication.V3PGReplicationStream.readInternal(V3PGReplicationStream.java:124)
at org.postgresql.core.v3.replication.V3PGReplicationStream.read(V3PGReplicationStream.java:70)
at io.debezium.connector.postgresql.connection.PostgresReplicationConnection$1.read(PostgresReplicationConnection.java:264)
at io.debezium.connector.postgresql.RecordsStreamProducer.streamChanges(RecordsStreamProducer.java:140)
at io.debezium.connector.postgresql.RecordsStreamProducer.lambda$start$0(RecordsStreamProducer.java:123)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[2019-08-22 11:16:23,425] INFO WorkerSourceTask{id=guangdian10831-0} Committing offsets (org.apache.kafka.connect.runtime.WorkerSourceTask:397)
[2019-08-22 11:16:23,426] INFO WorkerSourceTask{id=guangdian10831-0} flushing 0 outstanding messages for offset commit (org.apache.kafka.connect.runtime.WorkerSourceTask:414)
[2019-08-22 11:16:23,426] ERROR WorkerSourceTask{id=guangdian10831-0} Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:177)
org.apache.kafka.connect.errors.ConnectException: An exception ocurred in the change event producer. This connector will be stopped.
at io.debezium.connector.base.ChangeEventQueue.throwProducerFailureIfPresent(ChangeEventQueue.java:170)
at io.debezium.connector.base.ChangeEventQueue.poll(ChangeEventQueue.java:151)
at io.debezium.connector.postgresql.PostgresConnectorTask.poll(PostgresConnectorTask.java:161)
at org.apache.kafka.connect.runtime.WorkerSourceTask.poll(WorkerSourceTask.java:244)
at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:220)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:175)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:219)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Can somebody help me?
I have solved the problem.
I commit the following code in debezium source code and rebuild it.
#Override
public ChainedLogicalStreamBuilder tryOnceOptions(ChainedLogicalStreamBuilder builder) {
// return builder.withSlotOption("include-unchanged-toast", 0); ---original
return builder; ---modified
}

Spark GPUenabler ClassNotFoundException: CacheGPU

I am using the package IBMSparkGPU/GPUenabler package. I use sbt assembly to package all the dependency into one single jar file and submit it to the spark standalone cluster manager. However, the following error message appear:
org.apache.spark.SparkException: Error sending message [message = CacheGPUDS(a99176e95cf37ba4e5e46b9b172369ac_-1728716590,false)]
at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:119)
at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:78)
at com.ibm.gpuenabler.GPUMemoryManagerMasterEndPoint.com$ibm$gpuenabler$GPUMemoryManagerMasterEndPoint$$tell(GPUMemoryManager.scala:172)
at com.ibm.gpuenabler.GPUMemoryManagerMasterEndPoint$$anonfun$registerGPUMemoryManager$2.apply(GPUMemoryManager.scala:64)
at com.ibm.gpuenabler.GPUMemoryManagerMasterEndPoint$$anonfun$registerGPUMemoryManager$2.apply(GPUMemoryManager.scala:64)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
at scala.collection.mutable.ListBuffer.foreach(ListBuffer.scala:45)
at com.ibm.gpuenabler.GPUMemoryManagerMasterEndPoint.registerGPUMemoryManager(GPUMemoryManager.scala:64)
at com.ibm.gpuenabler.GPUMemoryManagerMasterEndPoint$$anonfun$receiveAndReply$1.applyOrElse(GPUMemoryManager.scala:143)
at org.apache.spark.rpc.netty.Inbox$$anonfun$process$1.apply$mcV$sp(Inbox.scala:105)
at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:205)
at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:101)
at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:213)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult
at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:77)
at org.apache.spark.rpc.RpcTimeout$$anonfun$1.applyOrElse(RpcTimeout.scala:75)
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:102)
... 16 more
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: com.ibm.gpuenabler.CacheGPUDS
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1866)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1749)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2040)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1571)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2285)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2209)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2067)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1571)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:431)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:108)
at org.apache.spark.rpc.netty.NettyRpcEnv$$anonfun$deserialize$1$$anonfun$apply$1.apply(NettyRpcEnv.scala:259)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
at org.apache.spark.rpc.netty.NettyRpcEnv.deserialize(NettyRpcEnv.scala:308)
at org.apache.spark.rpc.netty.NettyRpcEnv$$anonfun$deserialize$1.apply(NettyRpcEnv.scala:258)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
at org.apache.spark.rpc.netty.NettyRpcEnv.deserialize(NettyRpcEnv.scala:257)
at org.apache.spark.rpc.netty.NettyRpcHandler.internalReceive(NettyRpcEnv.scala:577)
at org.apache.spark.rpc.netty.NettyRpcHandler.receive(NettyRpcEnv.scala:562)
at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:159)
at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:107)
at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:119)
at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
... more
There won't be this error message and the program can run if I submit to local[*] as master. I can also eliminate the error when disable the spark.gpuenabler.autocache. However, is there other way to properly fix the issue?
I am using Ubuntu 17.04, JRE 1.8.0, Scala 2.11 and Spark 2.1.0.
It turned out that adding the option
--conf "spark.executor.extraClassPath=file://path/to/jar" will solve the problem. Another thing is that I need to paste the jar file to all the machine with same path. Other wise the worker will not be able to get the jar file.

put audio file in kafka with spring cloud stream

i'm tryning to put an audio file with json in kafka,
her is the code
producer Code
In the consumer i'm trying to get my file like this Consumer code
the error :
org.springframework.messaging.MessagingException: Exception thrown while invoking com.sofrecom.service.VoiceCampaignCreator#process[1 args]; nested exception is java.lang.NullPointerException
at org.springframework.cloud.stream.binding.StreamListenerMessageHandler.handleRequestMessage(StreamListenerMessageHandler.java:56)
at org.springframework.integration.handler.AbstractReplyProducingMessageHandler.handleMessageInternal(AbstractReplyProducingMessageHandler.java:109)
at org.springframework.integration.handler.AbstractMessageHandler.handleMessage(AbstractMessageHandler.java:127)
at org.springframework.cloud.stream.binding.DispatchingStreamListenerMessageHandler$ConditionalStreamListenerHandler.handleMessage(DispatchingStreamListenerMessageHandler.java:122)
at org.springframework.cloud.stream.binding.DispatchingStreamListenerMessageHandler.handleRequestMessage(DispatchingStreamListenerMessageHandler.java:75)
at org.springframework.integration.handler.AbstractReplyProducingMessageHandler.handleMessageInternal(AbstractReplyProducingMessageHandler.java:109)
at org.springframework.integration.handler.AbstractMessageHandler.handleMessage(AbstractMessageHandler.java:127)
at org.springframework.integration.dispatcher.AbstractDispatcher.tryOptimizedDispatch(AbstractDispatcher.java:116)
at org.springframework.integration.dispatcher.UnicastingDispatcher.doDispatch(UnicastingDispatcher.java:148)
at org.springframework.integration.dispatcher.UnicastingDispatcher.dispatch(UnicastingDispatcher.java:121)
at org.springframework.integration.channel.AbstractSubscribableChannel.doSend(AbstractSubscribableChannel.java:89)
at org.springframework.integration.channel.AbstractMessageChannel.send(AbstractMessageChannel.java:423)
at org.springframework.integration.channel.AbstractMessageChannel.send(AbstractMessageChannel.java:373)
at org.springframework.messaging.core.GenericMessagingTemplate.doSend(GenericMessagingTemplate.java:115)
at org.springframework.messaging.core.GenericMessagingTemplate.doSend(GenericMessagingTemplate.java:45)
at org.springframework.messaging.core.AbstractMessageSendingTemplate.send(AbstractMessageSendingTemplate.java:105)
at org.springframework.integration.handler.AbstractMessageProducingHandler.sendOutput(AbstractMessageProducingHandler.java:292)
at org.springframework.integration.handler.AbstractMessageProducingHandler.produceOutput(AbstractMessageProducingHandler.java:212)
at org.springframework.integration.handler.AbstractMessageProducingHandler.sendOutputs(AbstractMessageProducingHandler.java:129)
at org.springframework.integration.handler.AbstractReplyProducingMessageHandler.handleMessageInternal(AbstractReplyProducingMessageHandler.java:115)
at org.springframework.integration.handler.AbstractMessageHandler.handleMessage(AbstractMessageHandler.java:127)
at org.springframework.integration.channel.FixedSubscriberChannel.send(FixedSubscriberChannel.java:70)
at org.springframework.integration.channel.FixedSubscriberChannel.send(FixedSubscriberChannel.java:64)
at org.springframework.messaging.core.GenericMessagingTemplate.doSend(GenericMessagingTemplate.java:115)
at org.springframework.messaging.core.GenericMessagingTemplate.doSend(GenericMessagingTemplate.java:45)
at org.springframework.messaging.core.AbstractMessageSendingTemplate.send(AbstractMessageSendingTemplate.java:105)
at org.springframework.integration.endpoint.MessageProducerSupport.sendMessage(MessageProducerSupport.java:171)
at org.springframework.integration.kafka.inbound.KafkaMessageDrivenChannelAdapter.access$000(KafkaMessageDrivenChannelAdapter.java:54)
at org.springframework.integration.kafka.inbound.KafkaMessageDrivenChannelAdapter$IntegrationRecordMessageListener.onMessage(KafkaMessageDrivenChannelAdapter.java:288)
at org.springframework.integration.kafka.inbound.KafkaMessageDrivenChannelAdapter$IntegrationRecordMessageListener.onMessage(KafkaMessageDrivenChannelAdapter.java:279)
at org.springframework.kafka.listener.adapter.RetryingAcknowledgingMessageListenerAdapter$1.doWithRetry(RetryingAcknowledgingMessageListenerAdapter.java:77)
at org.springframework.kafka.listener.adapter.RetryingAcknowledgingMessageListenerAdapter$1.doWithRetry(RetryingAcknowledgingMessageListenerAdapter.java:72)
at org.springframework.retry.support.RetryTemplate.doExecute(RetryTemplate.java:286)
at org.springframework.retry.support.RetryTemplate.execute(RetryTemplate.java:179)
at org.springframework.kafka.listener.adapter.RetryingAcknowledgingMessageListenerAdapter.onMessage(RetryingAcknowledgingMessageListenerAdapter.java:72)
at org.springframework.kafka.listener.adapter.RetryingAcknowledgingMessageListenerAdapter.onMessage(RetryingAcknowledgingMessageListenerAdapter.java:39)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.invokeRecordListener(KafkaMessageListenerContainer.java:771)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.invokeListener(KafkaMessageListenerContainer.java:715)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.access$2600(KafkaMessageListenerContainer.java:231)
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer$ListenerInvoker.run(KafkaMessageListenerContainer.java:1004)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException: null
at org.apache.tomcat.util.http.fileupload.disk.DiskFileItem.getSize(DiskFileItem.java:267)
at org.apache.catalina.core.ApplicationPart.getSize(ApplicationPart.java:110)
at org.springframework.web.multipart.support.StandardMultipartHttpServletRequest$StandardMultipartFile.getSize(StandardMultipartHttpServletRequest.java:287)
at com.sofrecom.service.VoiceCampaignCreator.process(VoiceCampaignCreator.java:44)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.messaging.handler.invocation.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:180)
at org.springframework.messaging.handler.invocation.InvocableHandlerMethod.invoke(InvocableHandlerMethod.java:112)
at org.springframework.cloud.stream.binding.StreamListenerMessageHandler.handleRequestMessage(StreamListenerMessageHandler.java:48)
... 42 common frames omitted
any help please !
The problem seems pretty obvious...
Caused by: java.lang.NullPointerException: null at org.apache.tomcat.util.http.fileupload.disk.DiskFileItem.getSize(DiskFileItem.java:267) at org.apache.catalina.core.ApplicationPart.getSize(ApplicationPart.java:110) at org.springframework.web.multipart.support.StandardMultipartHttpServletRequest$StandardMultipartFile.getSize(StandardMultipartHttpServletRequest.java:287) at com.sofrecom.service.VoiceCampaignCreator.process(VoiceCampaignCreator.java:44)
It looks like you are trying to decode a web request outside of a web environment. You need to decode the multipart before sending the data to Kafka.
EDIT
In DiskFileItem...
private transient DeferredFileOutputStream dfos;
... dfos is transient - so it won't get serialized (obviously - because it's a Stream).