Scala code throws exception in Spark - scala

I am new to Scala and Spark. Today I tried to write some code and run it on Spark, but got an exception.
This code works in local Scala:
import org.apache.commons.lang.time.StopWatch
import org.apache.spark.{SparkConf, SparkContext}
import scala.collection.mutable.ListBuffer
import scala.util.Random
def test(): List[Int] = {
  val size = 100
  val range = 100
  var listBuffer = new ListBuffer[Int] // the exception is thrown here
  val random = new Random()
  for (i <- 1 to size)
    listBuffer += random.nextInt(range)
  listBuffer.foreach(x => println(x))
  listBuffer.toList
}
but when I run this code on Spark, it throws an exception:
15/01/01 14:06:17 INFO SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
Exception in thread "main" java.lang.NoSuchMethodError: scala.runtime.ObjectRef.create(Ljava/lang/Object;)Lscala/runtime/ObjectRef;
at com.tudou.sortedspark.Sort$.test(Sort.scala:35)
at com.tudou.sortedspark.Sort$.sort(Sort.scala:23)
at com.tudou.sortedspark.Sort$.main(Sort.scala:14)
at com.tudou.sortedspark.Sort.main(Sort.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
If I comment out the line below, the code works in Spark:
for (i <- 1 to size)
Can someone explain why, please?

Thanks #Imm, I have solved this issue. The root cause is that my local Scala is 2.11.4, but my Spark cluster is running version 1.2.0, which was compiled against Scala 2.10.
So the solution is to compile the local code with Scala 2.10 and upload the compiled jar to Spark. Everything works fine.
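For reference, a minimal build.sbt sketch along those lines (the project name is illustrative; the point is that scalaVersion and the Spark artifact must match the cluster):
// build.sbt -- pin the build to Scala 2.10 so it matches a Spark 1.2.0 cluster
name := "sortedspark"
version := "1.0"
scalaVersion := "2.10.4"
// "provided" because the cluster supplies spark-core at runtime
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.2.0" % "provided"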

Related

Unable to create a stream in spark-streaming using kinesis stream

I am new to Kinesis and I am trying to process Kinesis stream data with Spark Streaming (PySpark), but I am facing the error below.
Below is my code. I am pushing Twitter data to my Kinesis stream and trying to process it using Spark Streaming. I tried including --jars with all dependencies but still face the same issue. Spark version 2.4.3 and also 2.3.3, each with the appropriate spark-streaming-kinesis-asl-assembly.jar.
from pyspark.streaming.kinesis import KinesisUtils, InitialPositionInStream
from pyspark import SparkConf, SparkContext
from pyspark.sql import SparkSession
from pyspark import StorageLevel
from pyspark.streaming import StreamingContext
spark_session = SparkSession.builder.getOrCreate()
ssc = StreamingContext(spark_session.sparkContext, 10)
sc = spark_session.sparkContext
Kinesis_app_name = "test"
Kinesis_stream_name = "python-stream"
endpoint_url = "https://kinesis.us-east-1.amazonaws.com"
region_name = "us-east-1"
data = KinesisUtils.createStream(
ssc, Kinesis_app_name, Kinesis_stream_name, endpoint_url,
region_name, InitialPositionInStream.LATEST, 10, StorageLevel.MEMORY_AND_DISK_2)
data.pprint()
ssc.start() # Start the computation
ssc.awaitTermination()
I would like to process the stream using Spark Streaming, but I get the error below:
File "C:\spark-2.3.3-bin-hadoop2.7\python\lib\pyspark.zip\pyspark\streaming\kinesis.py", line 92, in createStream
File "C:\spark-2.3.3-bin-hadoop2.7\python\lib\py4j-0.10.7-src.zip\py4j\java_gateway.py", line 1257, in __call__
File "C:\spark-2.3.3-bin-hadoop2.7\python\lib\py4j-0.10.7-src.zip\py4j\protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o27.createStream.
: java.lang.NoClassDefFoundError: com/amazonaws/services/kinesis/model/Record
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
at java.lang.Class.getDeclaredMethods(Class.java:1975)
at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:232)
at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:159)
at org.apache.spark.SparkContext.clean(SparkContext.scala:2299)
at org.apache.spark.streaming.kinesis.KinesisUtils$.createStream(KinesisUtils.scala:127)
at org.apache.spark.streaming.kinesis.KinesisUtils$.createStream(KinesisUtils.scala:554)
at org.apache.spark.streaming.kinesis.KinesisUtilsPythonHelper.createStream(KinesisUtils.scala:616)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: com.amazonaws.services.kinesis.model.Record
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 20 more
I ran into the same issue. It ended up being that I had included just the spark-streaming-kinesis-asl jar. This jar does not contain the kinesis sdk as far as I'm aware. I fixed it by removing the lone jar and then using the package manager for the dependency with the spark-submit argument --packages org.apache.spark:spark-streaming-kinesis-asl_2.12:2.4.4. If you use the package manager but do not remove the offending jar, the program will not work. I hope this helps all who come across this error in the future.
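For example, the submit command would then look something like this (streaming.py is just a placeholder for your script):
spark-submit --packages org.apache.spark:spark-streaming-kinesis-asl_2.12:2.4.4 streaming.py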
Please see the solution below:
from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kinesis import KinesisUtils, InitialPositionInStream, StorageLevel
if __name__ == "__main__":
    kinesisConf = {...} # I put all my credentials in here
    batchInterval = 2000
    kinesisCheckpointInterval = batchInterval
    sc = SparkContext(appName="kinesis-stream")
    ssc = StreamingContext(sc, batchInterval)
    data = KinesisUtils.createStream(
        ssc=ssc,
        kinesisAppName=kinesisConf['appName'],
        streamName=kinesisConf['streamName'],
        endpointUrl=kinesisConf['endpointUrl'],
        regionName=kinesisConf['regionName'],
        initialPositionInStream=InitialPositionInStream.LATEST,
        checkpointInterval=kinesisCheckpointInterval,
        storageLevel=StorageLevel.MEMORY_AND_DISK_2,
        awsAccessKeyId=kinesisConf['awsAccessKeyId'],
        awsSecretKey=kinesisConf['awsSecretKey']
    )
    data.pprint()
    ssc.start()
    ssc.awaitTermination()
And when you run it, do it like so:
spark-submit --master local[8] --packages org.apache.spark:spark-streaming-kinesis-asl_2.12:3.0.0-preview ./streaming.py
2.12 -> refers to the Scala version
3.0.0 -> refers to the Spark version
Go here and make sure you select the correct params for that package

ClassNotFoundException in SparkStreaming Example

I am new to Spark Streaming and trying to run an example from this tutorial; I am following MAKING AND RUNNING OUR OWN NETWORKWORDCOUNT.
I have completed the 8th step and built a jar with sbt.
Now I am trying to deploy my jar using the command in the 9th step, like this:
bin/spark-submit --class "NetworkWordCount" --master spark://abc:7077 target/scala-2.11/networkcount_2.11-1.0.jar localhost 9999
but when I run this command I get the following exception:
java.lang.ClassNotFoundException: NetworkWordCount
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.util.Utils$.classForName(Utils.scala:229)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:700)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
The jar that I have created contains the "NetworkWordCount" class with the following code from the Spark examples:
package src.main.scala
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}

object NetworkWordCount {
  def main(args: Array[String]) {
    if (args.length < 2) {
      System.err.println("Usage: NetworkWordCount <hostname> <port>")
      System.exit(1)
    }
    //StreamingExamples.setStreamingLogLevels()
    // Create the context with a 1 second batch size
    val sparkConf = new SparkConf().setAppName("MyNetworkWordCount")
    val ssc = new StreamingContext(sparkConf, Seconds(1))
    val lines = ssc.socketTextStream(args(0), args(1).toInt, StorageLevel.MEMORY_AND_DISK_SER)
    val words = lines.flatMap(_.split(" "))
    val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _)
    wordCounts.print()
    ssc.start()
    ssc.awaitTermination()
  }
}
I am unable to identify what I am doing wrong.
The spark-submit parameter --class takes a fully qualified class name.
In the case of the code above, it should be src.main.scala.NetworkWordCount:
bin/spark-submit --class src.main.scala.NetworkWordCount --master spark://abc:7077 target/scala-2.11/networkcount_2.11-1.0.jar localhost 9999
Note: the package name used looks like an IDE setup issue. src/main/scala is the typical root of a Scala code base, not a package name.
Also make sure you have the "target/scala-2.11/networkcount_2.11-1.0.jar" file in your current directory when executing spark-submit.
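If you would rather keep the tutorial's original --class "NetworkWordCount" invocation, a minimal sketch of the fix (assuming the file sits directly under src/main/scala) is to drop the package declaration so the object lives in the default package:
// src/main/scala/NetworkWordCount.scala -- no package declaration,
// so --class "NetworkWordCount" resolves as in the tutorial
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}

object NetworkWordCount {
  def main(args: Array[String]): Unit = {
    if (args.length < 2) {
      System.err.println("Usage: NetworkWordCount <hostname> <port>")
      System.exit(1)
    }
    val sparkConf = new SparkConf().setAppName("MyNetworkWordCount")
    val ssc = new StreamingContext(sparkConf, Seconds(1))
    val lines = ssc.socketTextStream(args(0), args(1).toInt, StorageLevel.MEMORY_AND_DISK_SER)
    val words = lines.flatMap(_.split(" "))
    val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _)
    wordCounts.print()
    ssc.start()
    ssc.awaitTermination()
  }
}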

Spark isn't loading, what's up with that?

So the problem I am having is that I don't seem to be able to create a SparkContext, and I have no idea why not.
Here is my code:
import org.apache.spark.{SparkConf, SparkContext}
object spark_test {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("Datasets Test").setMaster("local")
    val sc = new SparkContext(conf)
    println(sc)
  }
}
And here is the result that I am getting:
Exception in thread "main" java.lang.NoSuchMethodError: scala.Predef$.refArrayOps([Ljava/lang/Object;)Lscala/collection/mutable/ArrayOps;
at org.apache.spark.SparkConf.getAkkaConf(SparkConf.scala:203)
at org.apache.spark.util.AkkaUtils$.createActorSystem(AkkaUtils.scala:68)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:126)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:139)
at spark_test$.main(test.scala:6)
at spark_test.main(test.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147)
Any thoughts?
Your Scala version is too new and your spark-core version is too old. I am using Scala 2.11.8 and spark-core_2.11:2.0.1; you can try that!
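A minimal build.sbt sketch for that combination (the version numbers are just the ones mentioned above; keep scalaVersion and the _2.xx artifact suffix in sync):
// build.sbt -- Scala version and spark-core binary suffix must match
scalaVersion := "2.11.8"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.1"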

Flink Kafka connector error in a Maven Scala project, using Intellij, Kafka 0.8.2, Java 7 and Scala 2.10

I tried to run the following code.
package test

import java.util.Properties
import org.apache.flink.streaming.api.scala.{StreamExecutionEnvironment, _}
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer08
import org.apache.flink.streaming.util.serialization.SimpleStringSchema

object FlinkKafkaStreaming {
  def main(args: Array[String]) {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.enableCheckpointing(5000)
    val properties = new Properties()
    properties.setProperty("bootstrap.servers", "www.iteblog.com:9092")
    // only required for Kafka 0.8
    properties.setProperty("zookeeper.connect", "www.iteblog.com:2181")
    properties.setProperty("group.id", "iteblog")
    val stream = env.addSource(new FlinkKafkaConsumer08[String]("iteblog",
      new SimpleStringSchema(), properties))
    stream.setParallelism(4).writeAsText("hdfs:///tmp/iteblog/data")
    env.execute("IteblogFlinkKafkaStreaming")
  }
}
But I got the following error:
/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/bin/java -Didea.launcher.port=7533 "-Didea.launcher.bin.path=/Applications/IntelliJ IDEA CE.app/Contents/bin" -Dfile.encoding=UTF-8 -classpath "/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/charsets.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/deploy.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/ext/dnsns.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/ext/localedata.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/ext/sunec.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/ext/sunjce_provider.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/ext/sunpkcs11.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/ext/zipfs.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/htmlconverter.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/javaws.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/jce.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/jfr.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/jfxrt.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/jsse.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/management-agent.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/plugin.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/resources.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/rt.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/lib/ant-javafx.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/lib/dt.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/lib/javafx-doclet.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/lib/javafx-mx.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/lib/jconsole.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/lib/sa-jdi.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/lib/tools.jar:/Users/zhenhao.li/ethan-stream/target/classes:/Users/zhenhao.li/.m2/repository/org/apache/flink/flink-scala_2.10/1.0.2/flink-scala_2.10-1.0.2.jar:/Users/zhenhao.li/.m2/repository/org/apache/flink/flink-core/1.0.2/flink-core-1.0.2.jar:/Users/zhenhao.li/.m2/repository/org/apache/flink/flink-annotations/1.0.2/flink-annotations-1.0.2.jar:/Users/zhenhao.li/.m2/repository/org/apache/commons/commons-lang3/3.3.2/commons-lang3-3.3.2.jar:/Users/zhenhao.li/.m2/repository/org/slf4j/slf4j-api/1.7.7/slf4j-api-1.7.7.jar:/Users/zhenhao.li/.m2/repository/org/slf4j/slf4j-log4j12/1.7.7/slf4j-log4j12-1.7.7.jar:/Users/zhenhao.li/.m2/repository/log4j/log4j/1.2.17/log4j-1.2.17.jar:/Users/zhenhao.li/.m2/repository/org/apache/flink/force-shading/1.0.2/force-shading-1.0.2.jar:/Users/zhenhao.li/.m2/repository/com/esotericsoftware/kryo/kryo/2.24.0/kryo-2.24.0.jar:/Users/zhenhao.li/.m2/repository/com/esotericsoftware/minlog/minlog/1.2/minlog-1.2.jar:/Users/zhenhao.li/.m2/repository/org/objenesis/objenesis/2.1/objenesis-2.1.jar:/Users/zhenhao.li/.m2/repository/org/apache/avro/avro/1.7.6/avro-1.7.6.jar:/Users/zhenhao.li/.m2/repository/org/codehaus/jackson/jackson-core-asl/1.9.13/jackson-core-asl-1.9.13.jar:/Users/zhenhao.li/.m2/repository/org/codehaus/ja
ckson/jackson-mapper-asl/1.9.13/jackson-mapper-asl-1.9.13.jar:/Users/zhenhao.li/.m2/repository/com/thoughtworks/paranamer/paranamer/2.3/paranamer-2.3.jar:/Users/zhenhao.li/.m2/repository/org/apache/flink/flink-shaded-hadoop2/1.0.2/flink-shaded-hadoop2-1.0.2.jar:/Users/zhenhao.li/.m2/repository/commons-cli/commons-cli/1.2/commons-cli-1.2.jar:/Users/zhenhao.li/.m2/repository/org/apache/commons/commons-math3/3.5/commons-math3-3.5.jar:/Users/zhenhao.li/.m2/repository/xmlenc/xmlenc/0.52/xmlenc-0.52.jar:/Users/zhenhao.li/.m2/repository/commons-codec/commons-codec/1.4/commons-codec-1.4.jar:/Users/zhenhao.li/.m2/repository/commons-io/commons-io/2.4/commons-io-2.4.jar:/Users/zhenhao.li/.m2/repository/commons-net/commons-net/3.1/commons-net-3.1.jar:/Users/zhenhao.li/.m2/repository/commons-collections/commons-collections/3.2.1/commons-collections-3.2.1.jar:/Users/zhenhao.li/.m2/repository/javax/servlet/servlet-api/2.5/servlet-api-2.5.jar:/Users/zhenhao.li/.m2/repository/org/mortbay/jetty/jetty-util/6.1.26/jetty-util-6.1.26.jar:/Users/zhenhao.li/.m2/repository/com/sun/jersey/jersey-core/1.9/jersey-core-1.9.jar:/Users/zhenhao.li/.m2/repository/commons-el/commons-el/1.0/commons-el-1.0.jar:/Users/zhenhao.li/.m2/repository/commons-logging/commons-logging/1.1.3/commons-logging-1.1.3.jar:/Users/zhenhao.li/.m2/repository/com/jamesmurty/utils/java-xmlbuilder/0.4/java-xmlbuilder-0.4.jar:/Users/zhenhao.li/.m2/repository/commons-lang/commons-lang/2.6/commons-lang-2.6.jar:/Users/zhenhao.li/.m2/repository/commons-configuration/commons-configuration/1.7/commons-configuration-1.7.jar:/Users/zhenhao.li/.m2/repository/commons-digester/commons-digester/1.8.1/commons-digester-1.8.1.jar:/Users/zhenhao.li/.m2/repository/org/xerial/snappy/snappy-java/1.0.5/snappy-java-1.0.5.jar:/Users/zhenhao.li/.m2/repository/com/jcraft/jsch/0.1.42/jsch-0.1.42.jar:/Users/zhenhao.li/.m2/repository/org/apache/zookeeper/zookeeper/3.4.6/zookeeper-3.4.6.jar:/Users/zhenhao.li/.m2/repository/org/apache/commons/commons-compress/1.4.1/commons-compress-1.4.1.jar:/Users/zhenhao.li/.m2/repository/org/tukaani/xz/1.0/xz-1.0.jar:/Users/zhenhao.li/.m2/repository/commons-beanutils/commons-beanutils-bean-collections/1.8.3/commons-beanutils-bean-collections-1.8.3.jar:/Users/zhenhao.li/.m2/repository/commons-daemon/commons-daemon/1.0.13/commons-daemon-1.0.13.jar:/Users/zhenhao.li/.m2/repository/javax/xml/bind/jaxb-api/2.2.2/jaxb-api-2.2.2.jar:/Users/zhenhao.li/.m2/repository/javax/xml/stream/stax-api/1.0-2/stax-api-1.0-2.jar:/Users/zhenhao.li/.m2/repository/javax/activation/activation/1.1/activation-1.1.jar:/Users/zhenhao.li/.m2/repository/com/google/inject/guice/3.0/guice-3.0.jar:/Users/zhenhao.li/.m2/repository/javax/inject/javax.inject/1/javax.inject-1.jar:/Users/zhenhao.li/.m2/repository/aopalliance/aopalliance/1.0/aopalliance-1.0.jar:/Users/zhenhao.li/.m2/repository/org/apache/flink/flink-java/1.0.2/flink-java-1.0.2.jar:/Users/zhenhao.li/.m2/repository/org/apache/flink/flink-optimizer_2.10/1.0.2/flink-optimizer_2.10-1.0.2.jar:/Users/zhenhao.li/.m2/repository/org/apache/flink/flink-runtime_2.10/1.0.2/flink-runtime_2.10-1.0.2.jar:/Users/zhenhao.li/.m2/repository/org/scala-lang/scala-reflect/2.10.4/scala-reflect-2.10.4.jar:/Users/zhenhao.li/.m2/repository/org/scala-lang/scala-library/2.10.4/scala-library-2.10.4.jar:/Users/zhenhao.li/.m2/repository/org/scala-lang/scala-compiler/2.10.4/scala-compiler-2.10.4.jar:/Users/zhenhao.li/.m2/repository/org/scalamacros/quasiquotes_2.10/2.0.1/quasiquotes_2.10-2.0.1.jar:/Users/zhenhao.li/.m2/repository/org/apache/flink/f
link-streaming-scala_2.10/1.0.2/flink-streaming-scala_2.10-1.0.2.jar:/Users/zhenhao.li/.m2/repository/org/apache/flink/flink-streaming-java_2.10/1.0.2/flink-streaming-java_2.10-1.0.2.jar:/Users/zhenhao.li/.m2/repository/org/apache/flink/flink-clients_2.10/1.0.2/flink-clients_2.10-1.0.2.jar:/Users/zhenhao.li/.m2/repository/org/apache/commons/commons-math/2.2/commons-math-2.2.jar:/Users/zhenhao.li/.m2/repository/org/apache/sling/org.apache.sling.commons.json/2.0.6/org.apache.sling.commons.json-2.0.6.jar:/Users/zhenhao.li/.m2/repository/io/netty/netty-all/4.0.27.Final/netty-all-4.0.27.Final.jar:/Users/zhenhao.li/.m2/repository/org/javassist/javassist/3.18.2-GA/javassist-3.18.2-GA.jar:/Users/zhenhao.li/.m2/repository/com/typesafe/akka/akka-actor_2.10/2.3.7/akka-actor_2.10-2.3.7.jar:/Users/zhenhao.li/.m2/repository/com/typesafe/config/1.2.1/config-1.2.1.jar:/Users/zhenhao.li/.m2/repository/com/typesafe/akka/akka-remote_2.10/2.3.7/akka-remote_2.10-2.3.7.jar:/Users/zhenhao.li/.m2/repository/io/netty/netty/3.8.0.Final/netty-3.8.0.Final.jar:/Users/zhenhao.li/.m2/repository/com/google/protobuf/protobuf-java/2.5.0/protobuf-java-2.5.0.jar:/Users/zhenhao.li/.m2/repository/org/uncommons/maths/uncommons-maths/1.2.2a/uncommons-maths-1.2.2a.jar:/Users/zhenhao.li/.m2/repository/com/typesafe/akka/akka-slf4j_2.10/2.3.7/akka-slf4j_2.10-2.3.7.jar:/Users/zhenhao.li/.m2/repository/org/clapper/grizzled-slf4j_2.10/1.0.2/grizzled-slf4j_2.10-1.0.2.jar:/Users/zhenhao.li/.m2/repository/com/github/scopt/scopt_2.10/3.2.0/scopt_2.10-3.2.0.jar:/Users/zhenhao.li/.m2/repository/io/dropwizard/metrics/metrics-core/3.1.0/metrics-core-3.1.0.jar:/Users/zhenhao.li/.m2/repository/io/dropwizard/metrics/metrics-jvm/3.1.0/metrics-jvm-3.1.0.jar:/Users/zhenhao.li/.m2/repository/io/dropwizard/metrics/metrics-json/3.1.0/metrics-json-3.1.0.jar:/Users/zhenhao.li/.m2/repository/com/fasterxml/jackson/core/jackson-databind/2.4.2/jackson-databind-2.4.2.jar:/Users/zhenhao.li/.m2/repository/com/fasterxml/jackson/core/jackson-annotations/2.4.0/jackson-annotations-2.4.0.jar:/Users/zhenhao.li/.m2/repository/com/fasterxml/jackson/core/jackson-core/2.4.2/jackson-core-2.4.2.jar:/Users/zhenhao.li/.m2/repository/jline/jline/0.9.94/jline-0.9.94.jar:/Users/zhenhao.li/.m2/repository/junit/junit/3.8.1/junit-3.8.1.jar:/Users/zhenhao.li/.m2/repository/com/twitter/chill_2.10/0.7.4/chill_2.10-0.7.4.jar:/Users/zhenhao.li/.m2/repository/com/twitter/chill-java/0.7.4/chill-java-0.7.4.jar:/Users/zhenhao.li/.m2/repository/org/apache/flink/flink-connector-kafka-0.8_2.10/1.1-SNAPSHOT/flink-connector-kafka-0.8_2.10-1.1-20160514.040356-150.jar:/Users/zhenhao.li/.m2/repository/org/apache/flink/flink-connector-kafka-base_2.10/1.1-SNAPSHOT/flink-connector-kafka-base_2.10-1.1-20160514.040350-150.jar:/Users/zhenhao.li/.m2/repository/org/apache/kafka/kafka_2.10/0.8.2.2/kafka_2.10-0.8.2.2.jar:/Users/zhenhao.li/.m2/repository/com/101tec/zkclient/0.7/zkclient-0.7.jar:/Users/zhenhao.li/.m2/repository/com/google/code/findbugs/jsr305/1.3.9/jsr305-1.3.9.jar:/Users/zhenhao.li/.m2/repository/org/apache/kafka/kafka-clients/0.8.2.2/kafka-clients-0.8.2.2.jar:/Users/zhenhao.li/.m2/repository/net/jpountz/lz4/lz4/1.2.0/lz4-1.2.0.jar:/Users/zhenhao.li/.m2/repository/com/yammer/metrics/metrics-core/2.2.0/metrics-core-2.2.0.jar:/Applications/IntelliJ IDEA CE.app/Contents/lib/idea_rt.jar" com.intellij.rt.execution.application.AppMain com.sky.ethan.stream.example.FlinkKafkaStreaming
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/flink/util/Preconditions
at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.<init>(FlinkKafkaConsumerBase.java:113)
at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer08.<init>(FlinkKafkaConsumer08.java:180)
at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer08.<init>(FlinkKafkaConsumer08.java:164)
at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer08.<init>(FlinkKafkaConsumer08.java:131)
at com.sky.ethan.stream.example.FlinkKafkaStreaming$.main(Example.scala:32)
at com.sky.ethan.stream.example.FlinkKafkaStreaming.main(Example.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
Caused by: java.lang.ClassNotFoundException: org.apache.flink.util.Preconditions
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 11 more
Process finished with exit code 1
I played around with it and found out that I couldn't call the FlinkKafkaConsumer08 constructor. I used Kafka 0.8.2, Java 7 and Scala 2.10.
What might be wrong here?
It looks like a version mismatch between the version you compiled your program with and the version you use to run your program. Could you make sure that it is the same version?

Task Not Serializable exception when using IgniteRDD

What is wrong with this code? I cannot escape from the Task Not Serializable exception.
@throws(classOf[Exception])
override def setUp(cfg: BenchmarkConfiguration) {
  super.setUp(cfg)
  sc = new SparkContext("local[4]", "BenchmarkTest")
  sqlContext = new HiveContext(sc)
  ic = new IgniteContext[RddKey, RddVal](sc,
    () ⇒ configuration("client", client = true))
  icCache = ic.fromCache(PARTITIONED_CACHE_NAME)
  icCache.savePairs(sc.parallelize({
    (0 until 1000).map { n => (n.toLong, s"Value for key $n") }
  }, 10)) // Error happens here: this is "line 89"
  println(icCache.collect)
}
Here is the stack trace:
<20:47:45><yardstick> Failed to start benchmark server (will stop and exit).
org.apache.spark.SparkException: Task not serializable
at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:166)
at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:158)
at org.apache.spark.SparkContext.clean(SparkContext.scala:1623)
at org.apache.spark.rdd.RDD.foreachPartition(RDD.scala:805)
at org.apache.ignite.spark.IgniteRDD.savePairs(IgniteRDD.scala:170)
at org.yardstickframework.spark.SparkAbstractBenchmark.setUp(SparkAbstractBenchmark.scala:89)
at org.yardstickframework.spark.SparkCoreRDDBenchmark.setUp(SparkCoreRDDBenchmark.scala:18)
at org.yardstickframework.spark.SparkCoreRDDBenchmark$.main(SparkCoreRDDBenchmark.scala:72)
at org.yardstickframework.spark.SparkNode.start(SparkNode.scala:28)
at org.yardstickframework.BenchmarkServerStartUp.main(BenchmarkServerStartUp.java:61)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.serializer.SerializationDebugger$ObjectStreamClassMethods$.getObjFieldValues$extension(SerializationDebugger.scala:240)
It looks like your code is compiled against a different version of Scala than the Ignite or Spark modules were compiled with. I got similar exceptions while testing when my code was compiled against Scala 2.10 and Spark was running Scala 2.11, or vice versa. The module com.databricks:spark-csv_2.10:1.1.0 might be the reason for this.