Task Not Serializable exception when using IgniteRDD - scala

What is wrong with this code? I cannot get past the Task not serializable exception.
@throws(classOf[Exception])
override def setUp(cfg: BenchmarkConfiguration) {
  super.setUp(cfg)
  sc = new SparkContext("local[4]", "BenchmarkTest")
  sqlContext = new HiveContext(sc)
  ic = new IgniteContext[RddKey, RddVal](sc,
    () => configuration("client", client = true))
  icCache = ic.fromCache(PARTITIONED_CACHE_NAME)
  icCache.savePairs(sc.parallelize({
    (0 until 1000).map { n => (n.toLong, s"Value for key $n") }
  }, 10)) // Error happens here: this is "line 89"
  println(icCache.collect)
}
Here is the stack trace:
<20:47:45><yardstick> Failed to start benchmark server (will stop and exit).
org.apache.spark.SparkException: Task not serializable
at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:166)
at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:158)
at org.apache.spark.SparkContext.clean(SparkContext.scala:1623)
at org.apache.spark.rdd.RDD.foreachPartition(RDD.scala:805)
at org.apache.ignite.spark.IgniteRDD.savePairs(IgniteRDD.scala:170)
at org.yardstickframework.spark.SparkAbstractBenchmark.setUp(SparkAbstractBenchmark.scala:89)
at org.yardstickframework.spark.SparkCoreRDDBenchmark.setUp(SparkCoreRDDBenchmark.scala:18)
at org.yardstickframework.spark.SparkCoreRDDBenchmark$.main(SparkCoreRDDBenchmark.scala:72)
at org.yardstickframework.spark.SparkNode.start(SparkNode.scala:28)
at org.yardstickframework.BenchmarkServerStartUp.main(BenchmarkServerStartUp.java:61)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.serializer.SerializationDebugger$ObjectStreamClassMethods$.getObjFieldValues$extension(SerializationDebugger.scala:240)

It looks like your code was compiled against a different version of Scala than the one the Ignite or Spark modules were compiled with. I got similar exceptions while testing when my code was compiled against Scala 2.10 and Spark was running on Scala 2.11, or vice versa. The module com.databricks:spark-csv_2.10:1.1.0 might be the reason for this.
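To illustrate what keeping the versions aligned looks like, here is a minimal build.sbt sketch; the artifact versions are placeholders, and the point is only that scalaVersion and every _2.xx dependency must agree with the Scala binary version your Spark and Ignite jars were built for:
// build.sbt -- sketch only; pick the versions your cluster actually runs
scalaVersion := "2.10.6"

libraryDependencies ++= Seq(
  "org.apache.spark"  %% "spark-core"   % "1.3.1" % "provided", // %% appends _2.10 automatically
  "org.apache.ignite"  % "ignite-spark" % "1.4.0",              // Ignite's Spark integration module
  "com.databricks"    %% "spark-csv"    % "1.1.0"               // resolves to spark-csv_2.10
)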

Related

java.lang.NoClassDefFoundError issue while running Scala code for a UDF

I am trying to write a UDF in Scala that returns all the months between two given dates. This is what I have written.
package com.company.datediff

import org.apache.hadoop.hive.ql.exec.UDF
import java.time._

class hive_udf extends UDF {
  def evaluate(date1: String, date2: String): String = {
    val s1 = LocalDate.parse(date1)
    val s2 = LocalDate.parse(date2)
    val p = Period.between(s1, s2)
    val l = p.getMonths()
    val min1 = s1.getMonthValue()
    val max1 = s2.getMonthValue()
    var arr1 = ""
    for (i <- min1 to max1) {
      arr1 = arr1.concat("," + i)
    }
    /* var i = min1
       while (i <= max1) {
         arr1 = arr1.concat("," + i)
       } */
    return arr1
  }
}
When running this code without the for loop, it runs perfectly fine. After including the for loop, I am getting 'java.lang.NoClassDefFoundError' and
Execution Error, return code -101 from
'org.apache.hadoop.hive.ql.exec.FunctionTask. scala/Function1'
Below are the details of the full error:
java.lang.NoClassDefFoundError: scala/Function1
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hadoop.hive.ql.exec.Registry.registerToSessionRegistry(Registry.java:518)
at org.apache.hadoop.hive.ql.exec.Registry.registerPermanentFunction(Registry.java:207)
at org.apache.hadoop.hive.ql.exec.FunctionRegistry.registerPermanentFunction(FunctionRegistry.java:1536)
at org.apache.hadoop.hive.ql.exec.FunctionTask.createPermanentFunction(FunctionTask.java:136)
at org.apache.hadoop.hive.ql.exec.FunctionTask.execute(FunctionTask.java:75)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1748)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1494)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1291)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1158)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1148)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:217)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:169)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:380)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:740)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:685)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: java.lang.ClassNotFoundException: scala.Function1
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 26 more
FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.FunctionTask. scala/Function1
I have limited exposure to Java and Scala, so I am not sure where I am going wrong. Any help is appreciated. Thanks.
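A hedged aside on why the for loop is what drags in scala/Function1: a Scala for comprehension over a range desugars into a foreach call that takes a function object, so the compiled UDF class references scala.Function1, and Hive can only load it if the Scala runtime library (scala-library) is visible on its classpath. Roughly, the loop in evaluate compiles as if it were written like this:
// approximate desugaring of the for loop; the closure below is an
// anonymous scala.Function1 instance, which is the class Hive fails to find
(min1 to max1).foreach { i =>
  arr1 = arr1.concat("," + i)
}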

Flink Kafka connector error in a Maven Scala project, using IntelliJ, Kafka 0.8.2, Java 7 and Scala 2.10

I tried to run the following code.
package test

import java.util.Properties

import org.apache.flink.streaming.api.scala.{StreamExecutionEnvironment, _}
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer08
import org.apache.flink.streaming.util.serialization.SimpleStringSchema

object FlinkKafkaStreaming {
  def main(args: Array[String]) {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.enableCheckpointing(5000)

    val properties = new Properties()
    properties.setProperty("bootstrap.servers", "www.iteblog.com:9092")
    // only required for Kafka 0.8
    properties.setProperty("zookeeper.connect", "www.iteblog.com:2181")
    properties.setProperty("group.id", "iteblog")

    val stream = env.addSource(new FlinkKafkaConsumer08[String]("iteblog",
      new SimpleStringSchema(), properties))
    stream.setParallelism(4).writeAsText("hdfs:///tmp/iteblog/data")
    env.execute("IteblogFlinkKafkaStreaming")
  }
}
But I got the following error:
/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/bin/java -Didea.launcher.port=7533 "-Didea.launcher.bin.path=/Applications/IntelliJ IDEA CE.app/Contents/bin" -Dfile.encoding=UTF-8 -classpath "/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/charsets.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/deploy.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/ext/dnsns.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/ext/localedata.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/ext/sunec.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/ext/sunjce_provider.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/ext/sunpkcs11.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/ext/zipfs.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/htmlconverter.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/javaws.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/jce.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/jfr.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/jfxrt.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/jsse.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/management-agent.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/plugin.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/resources.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/jre/lib/rt.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/lib/ant-javafx.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/lib/dt.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/lib/javafx-doclet.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/lib/javafx-mx.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/lib/jconsole.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/lib/sa-jdi.jar:/Library/Java/JavaVirtualMachines/jdk1.7.0_80.jdk/Contents/Home/lib/tools.jar:/Users/zhenhao.li/ethan-stream/target/classes:/Users/zhenhao.li/.m2/repository/org/apache/flink/flink-scala_2.10/1.0.2/flink-scala_2.10-1.0.2.jar:/Users/zhenhao.li/.m2/repository/org/apache/flink/flink-core/1.0.2/flink-core-1.0.2.jar:/Users/zhenhao.li/.m2/repository/org/apache/flink/flink-annotations/1.0.2/flink-annotations-1.0.2.jar:/Users/zhenhao.li/.m2/repository/org/apache/commons/commons-lang3/3.3.2/commons-lang3-3.3.2.jar:/Users/zhenhao.li/.m2/repository/org/slf4j/slf4j-api/1.7.7/slf4j-api-1.7.7.jar:/Users/zhenhao.li/.m2/repository/org/slf4j/slf4j-log4j12/1.7.7/slf4j-log4j12-1.7.7.jar:/Users/zhenhao.li/.m2/repository/log4j/log4j/1.2.17/log4j-1.2.17.jar:/Users/zhenhao.li/.m2/repository/org/apache/flink/force-shading/1.0.2/force-shading-1.0.2.jar:/Users/zhenhao.li/.m2/repository/com/esotericsoftware/kryo/kryo/2.24.0/kryo-2.24.0.jar:/Users/zhenhao.li/.m2/repository/com/esotericsoftware/minlog/minlog/1.2/minlog-1.2.jar:/Users/zhenhao.li/.m2/repository/org/objenesis/objenesis/2.1/objenesis-2.1.jar:/Users/zhenhao.li/.m2/repository/org/apache/avro/avro/1.7.6/avro-1.7.6.jar:/Users/zhenhao.li/.m2/repository/org/codehaus/jackson/jackson-core-asl/1.9.13/jackson-core-asl-1.9.13.jar:/Users/zhenhao.li/.m2/repository/org/codehaus/ja
ckson/jackson-mapper-asl/1.9.13/jackson-mapper-asl-1.9.13.jar:/Users/zhenhao.li/.m2/repository/com/thoughtworks/paranamer/paranamer/2.3/paranamer-2.3.jar:/Users/zhenhao.li/.m2/repository/org/apache/flink/flink-shaded-hadoop2/1.0.2/flink-shaded-hadoop2-1.0.2.jar:/Users/zhenhao.li/.m2/repository/commons-cli/commons-cli/1.2/commons-cli-1.2.jar:/Users/zhenhao.li/.m2/repository/org/apache/commons/commons-math3/3.5/commons-math3-3.5.jar:/Users/zhenhao.li/.m2/repository/xmlenc/xmlenc/0.52/xmlenc-0.52.jar:/Users/zhenhao.li/.m2/repository/commons-codec/commons-codec/1.4/commons-codec-1.4.jar:/Users/zhenhao.li/.m2/repository/commons-io/commons-io/2.4/commons-io-2.4.jar:/Users/zhenhao.li/.m2/repository/commons-net/commons-net/3.1/commons-net-3.1.jar:/Users/zhenhao.li/.m2/repository/commons-collections/commons-collections/3.2.1/commons-collections-3.2.1.jar:/Users/zhenhao.li/.m2/repository/javax/servlet/servlet-api/2.5/servlet-api-2.5.jar:/Users/zhenhao.li/.m2/repository/org/mortbay/jetty/jetty-util/6.1.26/jetty-util-6.1.26.jar:/Users/zhenhao.li/.m2/repository/com/sun/jersey/jersey-core/1.9/jersey-core-1.9.jar:/Users/zhenhao.li/.m2/repository/commons-el/commons-el/1.0/commons-el-1.0.jar:/Users/zhenhao.li/.m2/repository/commons-logging/commons-logging/1.1.3/commons-logging-1.1.3.jar:/Users/zhenhao.li/.m2/repository/com/jamesmurty/utils/java-xmlbuilder/0.4/java-xmlbuilder-0.4.jar:/Users/zhenhao.li/.m2/repository/commons-lang/commons-lang/2.6/commons-lang-2.6.jar:/Users/zhenhao.li/.m2/repository/commons-configuration/commons-configuration/1.7/commons-configuration-1.7.jar:/Users/zhenhao.li/.m2/repository/commons-digester/commons-digester/1.8.1/commons-digester-1.8.1.jar:/Users/zhenhao.li/.m2/repository/org/xerial/snappy/snappy-java/1.0.5/snappy-java-1.0.5.jar:/Users/zhenhao.li/.m2/repository/com/jcraft/jsch/0.1.42/jsch-0.1.42.jar:/Users/zhenhao.li/.m2/repository/org/apache/zookeeper/zookeeper/3.4.6/zookeeper-3.4.6.jar:/Users/zhenhao.li/.m2/repository/org/apache/commons/commons-compress/1.4.1/commons-compress-1.4.1.jar:/Users/zhenhao.li/.m2/repository/org/tukaani/xz/1.0/xz-1.0.jar:/Users/zhenhao.li/.m2/repository/commons-beanutils/commons-beanutils-bean-collections/1.8.3/commons-beanutils-bean-collections-1.8.3.jar:/Users/zhenhao.li/.m2/repository/commons-daemon/commons-daemon/1.0.13/commons-daemon-1.0.13.jar:/Users/zhenhao.li/.m2/repository/javax/xml/bind/jaxb-api/2.2.2/jaxb-api-2.2.2.jar:/Users/zhenhao.li/.m2/repository/javax/xml/stream/stax-api/1.0-2/stax-api-1.0-2.jar:/Users/zhenhao.li/.m2/repository/javax/activation/activation/1.1/activation-1.1.jar:/Users/zhenhao.li/.m2/repository/com/google/inject/guice/3.0/guice-3.0.jar:/Users/zhenhao.li/.m2/repository/javax/inject/javax.inject/1/javax.inject-1.jar:/Users/zhenhao.li/.m2/repository/aopalliance/aopalliance/1.0/aopalliance-1.0.jar:/Users/zhenhao.li/.m2/repository/org/apache/flink/flink-java/1.0.2/flink-java-1.0.2.jar:/Users/zhenhao.li/.m2/repository/org/apache/flink/flink-optimizer_2.10/1.0.2/flink-optimizer_2.10-1.0.2.jar:/Users/zhenhao.li/.m2/repository/org/apache/flink/flink-runtime_2.10/1.0.2/flink-runtime_2.10-1.0.2.jar:/Users/zhenhao.li/.m2/repository/org/scala-lang/scala-reflect/2.10.4/scala-reflect-2.10.4.jar:/Users/zhenhao.li/.m2/repository/org/scala-lang/scala-library/2.10.4/scala-library-2.10.4.jar:/Users/zhenhao.li/.m2/repository/org/scala-lang/scala-compiler/2.10.4/scala-compiler-2.10.4.jar:/Users/zhenhao.li/.m2/repository/org/scalamacros/quasiquotes_2.10/2.0.1/quasiquotes_2.10-2.0.1.jar:/Users/zhenhao.li/.m2/repository/org/apache/flink/f
link-streaming-scala_2.10/1.0.2/flink-streaming-scala_2.10-1.0.2.jar:/Users/zhenhao.li/.m2/repository/org/apache/flink/flink-streaming-java_2.10/1.0.2/flink-streaming-java_2.10-1.0.2.jar:/Users/zhenhao.li/.m2/repository/org/apache/flink/flink-clients_2.10/1.0.2/flink-clients_2.10-1.0.2.jar:/Users/zhenhao.li/.m2/repository/org/apache/commons/commons-math/2.2/commons-math-2.2.jar:/Users/zhenhao.li/.m2/repository/org/apache/sling/org.apache.sling.commons.json/2.0.6/org.apache.sling.commons.json-2.0.6.jar:/Users/zhenhao.li/.m2/repository/io/netty/netty-all/4.0.27.Final/netty-all-4.0.27.Final.jar:/Users/zhenhao.li/.m2/repository/org/javassist/javassist/3.18.2-GA/javassist-3.18.2-GA.jar:/Users/zhenhao.li/.m2/repository/com/typesafe/akka/akka-actor_2.10/2.3.7/akka-actor_2.10-2.3.7.jar:/Users/zhenhao.li/.m2/repository/com/typesafe/config/1.2.1/config-1.2.1.jar:/Users/zhenhao.li/.m2/repository/com/typesafe/akka/akka-remote_2.10/2.3.7/akka-remote_2.10-2.3.7.jar:/Users/zhenhao.li/.m2/repository/io/netty/netty/3.8.0.Final/netty-3.8.0.Final.jar:/Users/zhenhao.li/.m2/repository/com/google/protobuf/protobuf-java/2.5.0/protobuf-java-2.5.0.jar:/Users/zhenhao.li/.m2/repository/org/uncommons/maths/uncommons-maths/1.2.2a/uncommons-maths-1.2.2a.jar:/Users/zhenhao.li/.m2/repository/com/typesafe/akka/akka-slf4j_2.10/2.3.7/akka-slf4j_2.10-2.3.7.jar:/Users/zhenhao.li/.m2/repository/org/clapper/grizzled-slf4j_2.10/1.0.2/grizzled-slf4j_2.10-1.0.2.jar:/Users/zhenhao.li/.m2/repository/com/github/scopt/scopt_2.10/3.2.0/scopt_2.10-3.2.0.jar:/Users/zhenhao.li/.m2/repository/io/dropwizard/metrics/metrics-core/3.1.0/metrics-core-3.1.0.jar:/Users/zhenhao.li/.m2/repository/io/dropwizard/metrics/metrics-jvm/3.1.0/metrics-jvm-3.1.0.jar:/Users/zhenhao.li/.m2/repository/io/dropwizard/metrics/metrics-json/3.1.0/metrics-json-3.1.0.jar:/Users/zhenhao.li/.m2/repository/com/fasterxml/jackson/core/jackson-databind/2.4.2/jackson-databind-2.4.2.jar:/Users/zhenhao.li/.m2/repository/com/fasterxml/jackson/core/jackson-annotations/2.4.0/jackson-annotations-2.4.0.jar:/Users/zhenhao.li/.m2/repository/com/fasterxml/jackson/core/jackson-core/2.4.2/jackson-core-2.4.2.jar:/Users/zhenhao.li/.m2/repository/jline/jline/0.9.94/jline-0.9.94.jar:/Users/zhenhao.li/.m2/repository/junit/junit/3.8.1/junit-3.8.1.jar:/Users/zhenhao.li/.m2/repository/com/twitter/chill_2.10/0.7.4/chill_2.10-0.7.4.jar:/Users/zhenhao.li/.m2/repository/com/twitter/chill-java/0.7.4/chill-java-0.7.4.jar:/Users/zhenhao.li/.m2/repository/org/apache/flink/flink-connector-kafka-0.8_2.10/1.1-SNAPSHOT/flink-connector-kafka-0.8_2.10-1.1-20160514.040356-150.jar:/Users/zhenhao.li/.m2/repository/org/apache/flink/flink-connector-kafka-base_2.10/1.1-SNAPSHOT/flink-connector-kafka-base_2.10-1.1-20160514.040350-150.jar:/Users/zhenhao.li/.m2/repository/org/apache/kafka/kafka_2.10/0.8.2.2/kafka_2.10-0.8.2.2.jar:/Users/zhenhao.li/.m2/repository/com/101tec/zkclient/0.7/zkclient-0.7.jar:/Users/zhenhao.li/.m2/repository/com/google/code/findbugs/jsr305/1.3.9/jsr305-1.3.9.jar:/Users/zhenhao.li/.m2/repository/org/apache/kafka/kafka-clients/0.8.2.2/kafka-clients-0.8.2.2.jar:/Users/zhenhao.li/.m2/repository/net/jpountz/lz4/lz4/1.2.0/lz4-1.2.0.jar:/Users/zhenhao.li/.m2/repository/com/yammer/metrics/metrics-core/2.2.0/metrics-core-2.2.0.jar:/Applications/IntelliJ IDEA CE.app/Contents/lib/idea_rt.jar" com.intellij.rt.execution.application.AppMain com.sky.ethan.stream.example.FlinkKafkaStreaming
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/flink/util/Preconditions
at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.<init>(FlinkKafkaConsumerBase.java:113)
at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer08.<init>(FlinkKafkaConsumer08.java:180)
at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer08.<init>(FlinkKafkaConsumer08.java:164)
at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer08.<init>(FlinkKafkaConsumer08.java:131)
at com.sky.ethan.stream.example.FlinkKafkaStreaming$.main(Example.scala:32)
at com.sky.ethan.stream.example.FlinkKafkaStreaming.main(Example.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
Caused by: java.lang.ClassNotFoundException: org.apache.flink.util.Preconditions
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 11 more
Process finished with exit code 1
I played around with it and found that I could not call the FlinkKafkaConsumer08 constructor. I am using Kafka 0.8.2, Java 7 and Scala 2.10.
What might be wrong here?
It looks like a mismatch between the Flink version you compiled your program against and the version you run it with. In the classpath above, the flink-core and flink-streaming jars are 1.0.2 while the Kafka connector is a 1.1-SNAPSHOT build, and the connector references org.apache.flink.util.Preconditions, which the 1.0.2 jars do not contain. Could you make sure that all Flink modules use the same version?
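As a sketch of what pinning a single version looks like (shown in sbt syntax, since the project's pom is not included; the same idea applies to a single version property in Maven), every Flink artifact should share one version number:
// sbt-style sketch -- flinkVersion is a placeholder; use one version for all Flink modules
scalaVersion := "2.10.4"
val flinkVersion = "1.0.2"

libraryDependencies ++= Seq(
  "org.apache.flink" %% "flink-scala"               % flinkVersion,
  "org.apache.flink" %% "flink-streaming-scala"     % flinkVersion,
  "org.apache.flink" %% "flink-clients"             % flinkVersion,
  "org.apache.flink" %% "flink-connector-kafka-0.8" % flinkVersion
)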

saveAsNewAPIHadoopFile() giving error when used as output format

I am running a modified version of the teragen program in Spark, written in Scala. I am trying to save the output file using the function saveAsNewAPIHadoopFile(). The relevant code is given below:
dataset.map(row => (NullWritable.get(), new BytesWritable(row))).saveAsNewAPIHadoopFile(output)
The code is compiling successfully. However, when running it, I am getting the following error:
Exception in thread "main" java.lang.RuntimeException: class scala.runtime.Nothing$ not org.apache.hadoop.mapreduce.OutputFormat
at org.apache.hadoop.conf.Configuration.setClass(Configuration.java:1794)
at org.apache.hadoop.mapreduce.Job.setOutputFormatClass(Job.java:823)
at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopFile(PairRDDFunctions.scala:830)
at org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopFile(PairRDDFunctions.scala:811)
at GenSort$.main(GenSort.scala:52)
at GenSort.main(GenSort.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:328)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Is there a way to make it work with saveAsNewAPIHadoopFile()? I would be grateful for any help.
The single-argument call saveAsNewAPIHadoopFile(output) leaves the output format class to type inference, and without a type argument Scala infers Nothing, which is why the error complains that scala.runtime.Nothing$ is not an org.apache.hadoop.mapreduce.OutputFormat. saveAsNewAPIHadoopFile expects the key, value and output format classes.
The method signature is:
saveAsNewAPIHadoopFile(path: String,
  keyClass: Class[_],
  valueClass: Class[_],
  outputFormatClass: Class[_ <: org.apache.hadoop.mapreduce.OutputFormat[_, _]],
  conf: Configuration = self.context.hadoopConfiguration)
Implementation should be:
dataset.map(row => (NullWritable.get(), new BytesWritable(row)))
  .saveAsNewAPIHadoopFile("hdfs://.....",
    classOf[NullWritable],
    classOf[BytesWritable],
    classOf[org.apache.hadoop.mapreduce.lib.output.TextOutputFormat[NullWritable, BytesWritable]])
or
dataset.map(row => (NullWritable.get(), new BytesWritable(row)))
  .saveAsNewAPIHadoopFile("hdfs://.....",
    NullWritable.get().getClass,
    new BytesWritable().getClass,
    new org.apache.hadoop.mapreduce.lib.output.TextOutputFormat[NullWritable, BytesWritable]().getClass)
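Alternatively, the single-argument overload works if the output format is supplied as an explicit type parameter instead of being left to inference. A minimal sketch, reusing dataset and output from the question and swapping in SequenceFileOutputFormat (a common choice for binary values; the TextOutputFormat above works the same way):
import org.apache.hadoop.io.{BytesWritable, NullWritable}
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat

// passing the format as a type parameter avoids the inferred scala.runtime.Nothing$
dataset.map(row => (NullWritable.get(), new BytesWritable(row)))
  .saveAsNewAPIHadoopFile[SequenceFileOutputFormat[NullWritable, BytesWritable]](output)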

Scala code throws an exception in Spark

I am new to Scala and Spark. Today I tried to write some code and run it on Spark, but got an exception.
This code works in local Scala:
import org.apache.commons.lang.time.StopWatch
import org.apache.spark.{SparkConf, SparkContext}

import scala.collection.mutable.ListBuffer
import scala.util.Random

def test(): List[Int] = {
  val size = 100
  val range = 100
  var listBuffer = new ListBuffer[Int] // an exception is thrown here
  val random = new Random()
  for (i <- 1 to size)
    listBuffer += random.nextInt(range)
  listBuffer.foreach(x => println(x))
  listBuffer.toList
}
But when I put this code into Spark, it throws an exception that says:
15/01/01 14:06:17 INFO SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
Exception in thread "main" java.lang.NoSuchMethodError: scala.runtime.ObjectRef.create(Ljava/lang/Object;)Lscala/runtime/ObjectRef;
at com.tudou.sortedspark.Sort$.test(Sort.scala:35)
at com.tudou.sortedspark.Sort$.sort(Sort.scala:23)
at com.tudou.sortedspark.Sort$.main(Sort.scala:14)
at com.tudou.sortedspark.Sort.main(Sort.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
If I comment out the line below, the code works in Spark:
for (i <- 1 to size)
Can someone explain why, please?
Thanks @Imm, I have solved this issue. The root cause is that my local Scala is 2.11.4, but my Spark cluster is running version 1.2.0, and Spark 1.2 was compiled against Scala 2.10. (Most likely the for loop captures the listBuffer var in a closure; code compiled with Scala 2.11 wraps such a variable by calling scala.runtime.ObjectRef.create, a method that does not exist in the 2.10 runtime, hence the NoSuchMethodError.)
So the solution is to compile the local code with Scala 2.10 and upload the compiled jar to Spark. Everything works fine.
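A quick way to confirm which Scala version each side uses is to print it from the code itself, both in the local build and inside a job (or spark-shell) on the cluster; a minimal sketch:
// prints something like "version 2.10.4"; run this locally and on the cluster
// and make sure the 2.xx binary versions match
println(scala.util.Properties.versionString)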

Can deserialize avros to Scala case-classes from in-memory, but why not from files? Record can't be cast to case class?

I'm trying to use Salat-Avro to serialize and deserialize Scala case classes.
I can serialize and deserialize fine in memory, but I can only serialize to files; I can't deserialize from a file yet.
Why won't my DatumReader succeed when reading from a file like it did when reading from a stream?
[error] (run-main) java.lang.ExceptionInInitializerError
java.lang.ExceptionInInitializerError
at Main.main(salat-avro-example.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
Caused by: java.lang.ClassCastException: org.apache.avro.generic.GenericData$Record cannot be cast to models.Record
at Main$.<init>(salat-avro-example.scala:55)
at Main$.<clinit>(salat-avro-example.scala)
at Main.main(salat-avro-example.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
java.lang.RuntimeException: Nonzero exit code: 1
at scala.sys.package$.error(package.scala:27)
[error] {file:/home/julianpeeters/salat-avro-example/}default-7321ab/compile:run: Nonzero exit code: 1
[error] Total time: 18 s, completed Aug 30, 2012 12:04:01 AM
Here's the code:
val obj2 = grater[Record].asObjectFromDataFile(infile)
calls:
lazy val asDatumReader: AvroDatumReader[X] = asGenericDatumReader
lazy val asGenericDatumReader: AvroGenericDatumReader[X] = new AvroGenericDatumReader[X](asAvroSchema)

def asObjectFromDataFile(infile: File): X = {
  val asDataFileReader: DataFileReader[X] = new DataFileReader[X](infile, asDatumReader)
  asDataFileReader.next()
}
The code can also be seen at Github.com: Salat-Avro-Example.scala and
Salat-Avro.avrograter.scala
How do I fix this? Thanks!
Now I see that dataFileReader.next returned a record, but the values of the fields were still Avro UTF-8 objects, and I needed to unmarshal the values back into a Scala object with applyValues. Something like the hackish snippet below worked for me:
// .asScala requires the Java/Scala collection converters (e.g. import scala.collection.JavaConverters._)
val objIterator = asDataFileReader.asScala
  .iterator
  .map(i => asGenericDatumReader.applyValues(i.asInstanceOf[GenericData.Record]).asInstanceOf[X])