NoSuchMethodError while running Spark Streaming job on HDP 2.2 - scala

I am trying to run a simple streaming job on the HDP 2.2 Sandbox but am hitting a java.lang.NoSuchMethodError. I am able to run the SparkPi example on this machine without any issue.
Following are the versions I am using:
<kafka.version>0.8.2.0</kafka.version>
<twitter4j.version>4.0.2</twitter4j.version>
<spark-version>1.2.1</spark-version>
<scala.version>2.11</scala.version>
Code snippet:
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{StreamingContext, Durations}
val sparkConf = new SparkConf().setAppName("TweetSenseKafkaConsumer").setMaster("yarn-cluster")
val ssc = new StreamingContext(sparkConf, Durations.seconds(5))
Error text from the Node Manager UI:
Exception in thread "Driver" scala.MatchError: java.lang.NoSuchMethodError: scala.Predef$.$conforms()Lscala/Predef$$less$colon$less; (of class java.lang.NoSuchMethodError)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:432)
15/02/12 15:07:23 INFO yarn.ApplicationMaster: Waiting for spark context initialization ... 1
15/02/12 15:07:33 INFO yarn.ApplicationMaster: Waiting for spark context initialization ... 2
The job is accepted by YARN but never goes into RUNNING status.
I suspect it is due to a Scala version mismatch. I tried changing the POM configuration but still could not fix the error.
Thank you for your help in advance.

Earlier I specified a dependency on spark-streaming_2.10 (Spark compiled against Scala 2.10), but I did not declare a dependency on the Scala library itself. It seems Maven automatically pulled in 2.11 (maybe through some other dependency). While trying to debug the issue I added an explicit dependency on the Scala 2.11 compiler. After Paul's comment I changed that Scala dependency to version 2.10, and it is now working.
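To double-check which versions actually end up on the classpath at runtime, a minimal check like the one below can be run on the cluster (my own sketch; it only assumes spark-core and the Scala library are on the classpath). Both printed values should line up with the _2.10 artifacts declared in the POM.
object VersionCheck {
  def main(args: Array[String]): Unit = {
    // Scala library version actually on the classpath at runtime
    println("Scala: " + scala.util.Properties.versionString)
    // Spark version the application is running against
    println("Spark: " + org.apache.spark.SPARK_VERSION)
  }
}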

Related

java.lang.NoSuchMethodError: scala.Product.$init$(Lscala/Product;)V - Flink on EMR

I am trying to run a Flink (v 1.13.1) application on EMR (v 5.34.0).
My Flink application uses Scallop (v 4.1.0) to parse the arguments passed.
The Scala version used for the Flink application is 2.12.7.
I keep getting the error below when I submit the Flink application to the cluster. Any clue or help is highly appreciated.
java.lang.NoSuchMethodError: scala.Product.$init$(Lscala/Product;)V
at org.rogach.scallop.Scallop.<init>(Scallop.scala:63)
at org.rogach.scallop.Scallop$.apply(Scallop.scala:13)
Resolved the issue by downgrading Scala to 2.11. The Flink 1.13.1 Scala shell REPL on EMR reported Scala version 2.11.12, so I downgraded to that version of Scala and the problem disappeared.
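For context (my own reading of the error, not part of the original answer): scala.Product.$init$ is the static trait initializer that the Scala 2.12 compiler emits for anything mixing in Product, for example every case class. A jar built with 2.12 therefore calls a method that does not exist in the 2.11 scala-library shipped with Flink on EMR:
// Compiled with Scala 2.12, the constructor of any case class (including Scallop's
// internal classes) contains a call to the static method scala.Product.$init$.
// That method is absent from the Scala 2.11 runtime, so running a 2.12-built jar
// on Flink's 2.11 classpath fails with the NoSuchMethodError shown above.
case class Example(name: String, value: Int)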

How to fix initialization of Logger error while using spark-submit command

I've got a problem when running my spark-jdbc job to connect to another database. Before it even gets to the connection, I get this error:
Exception in thread "main" java.lang.AbstractMethodError
at org.apache.spark.internal.Logging$class.initializeLogIfNecessary(Logging.scala:99)
My Logger wasn't able to be initialized by Scala.
I'm using Scala 2.11 and a Spark build for the same Scala version.
I can't debug this issue via the IDE, because there everything is fine; the error only happens when I run spark-submit.
I got the same error while using Spark 2.3 when I should have been using Spark 2.2. Apparently that method was made abstract in the later version, which is why I was getting the error.

ClassNotFoundException while creating Spark Session

I am trying to create a Spark Session in Unit Test case using the below code
val spark = SparkSession.builder.appName("local").master("local").getOrCreate()
but while running the tests, I am getting the below error:
java.lang.ClassNotFoundException: org.apache.hadoop.fs.GlobalStorageStatistics$StorageStatisticsProvider
I have tried to add the dependency but to no avail. Can someone point out the cause and the solution to this issue?
It can be due to one of two reasons.
1. You may have incompatible versions of the Spark and Hadoop stacks. For example, HBase 0.9 is incompatible with Spark 2.0; that results in class/method-not-found exceptions.
2. You may have multiple versions of the same library because of dependency hell. You may need to run the dependency tree (e.g. mvn dependency:tree) and check which jar each class is actually loaded from, as sketched below, to make sure this is not the case.
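A minimal sketch of that second check (the class name here is just an example pick, a Hadoop class that should be present): asking the JVM which jar a class was actually loaded from quickly shows whether an old Hadoop artifact is shadowing the expected one.
object WhereIsClass {
  def main(args: Array[String]): Unit = {
    // Resolve a class from the problematic package and report the jar it came from.
    val cls = Class.forName("org.apache.hadoop.fs.FileSystem")
    val location = cls.getProtectionDomain.getCodeSource.getLocation
    println(s"${cls.getName} loaded from $location")
  }
}
If the printed jar is an older hadoop-common than the one Spark expects, that older artifact is the likely source of the missing GlobalStorageStatistics class.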

Scala Runtime errors calling program on Spark Job Server

I used Spark 1.6.2 and Scala 2.11.8 to compile my project. The generated uber jar with dependencies is placed inside Spark Job Server, which seems to use Scala 2.10.4 (SCALA_VERSION=2.10.4 is specified in its .sh file).
There is no problem starting the server or uploading the context/app jars, but at runtime the following error occurs:
java.lang.NoSuchMethodError: scala.reflect.api.JavaUniverse.runtimeMirror(Ljava/lang/ClassLoader;)Lscala/reflect/api/JavaUniverse$JavaMirror
Why do Scala 2.11 and Spark with scallop lead to "java.lang.NoSuchMethodError: scala.reflect.api.JavaUniverse.runtimeMirror"? talks about using Scala 2.10 to compile the sources. Is that true?
Any suggestions please...
Use Scala 2.10.4 to compile your project. Otherwise you need to build Spark against Scala 2.11 too.
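A small way to see the mismatch (my own illustration, assuming scala-reflect is on the classpath): the failing frame is the standard runtime-reflection entry point, which your jar compiled against the 2.11 signature of runtimeMirror, while Job Server's 2.10.4 runtime exposes a different one. Printing the runtime Scala version next to the call makes the problem obvious.
import scala.reflect.runtime.{universe => ru}

object MirrorCheck {
  def main(args: Array[String]): Unit = {
    // Report which scala-library is actually on the Job Server classpath.
    println("Runtime Scala: " + scala.util.Properties.versionString)
    // Compiled against 2.11 but run on 2.10, this is the call that throws
    // the NoSuchMethodError from the question.
    val mirror = ru.runtimeMirror(getClass.getClassLoader)
    println("Mirror: " + mirror)
  }
}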

Spark and Prediction IO: NoClassDefFoundError Despite Dependency Existing

Problem:
I am attempting to train a Prediction IO project using Spark 1.6.1 and PredictionIO 0.9.5, but the job fails immediately after the executors begin to work. This happens both on a standalone Spark cluster and on a Mesos cluster. In both cases I am deploying to the cluster from a remote client, i.e. I am running pio train -- --master [master on some other server].
Symptoms:
In the driver logs, shortly after the first [Stage 0:> (0 + 0) / 2] message, the executors die due to java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hbase.protobuf.ProtobufUtil
Investigation:
Found the class-in-question within the pio-assembly jar:
jar -tf pio-assembly-0.9.5.jar | grep ProtobufUtil
org/apache/hadoop/hbase/protobuf/ProtobufUtil$1.class
org/apache/hadoop/hbase/protobuf/ProtobufUtil.class
When submitting, this jar is deployed with the project and can be found within the executors
Adding --jars pio-assembly-0.9.5.jar to pio train does not fix the problem
Creating an uber jar with pio build --clean --uber-jar does not fix the problem
Setting SPARK_CLASSPATH on the slaves to a local copy of pio-assembly-0.9.5.jar does solve the problem
As far as I am aware, SPARK_CLASSPATH is deprecated and should be replaced with --jars when submitting. I'd rather not be dependent on a deprecated feature. Is there something I am missing when calling pio train, or with my infrastructure? Is there a defect (e.g. a race condition) in the executors fetching the dependencies from the driver?
The problem is that java.lang.NoClassDefFoundError: Could not initialize class doesn't actually mean that the dependency is missing; it is a poorly named exception, and the real problem is that the class loader had trouble initializing the class. The actual failure is reported the first time as a java.lang.ExceptionInInitializerError, most likely thrown from a static code block. It is easy to confuse java.lang.NoClassDefFoundError with java.lang.ClassNotFoundException, but the latter is what actually means that the dependency is missing (this question and others provide more details).
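A minimal, self-contained sketch of that behaviour (Broken is a hypothetical stand-in for ProtobufUtil): the first touch of a class whose static initializer throws produces ExceptionInInitializerError carrying the real cause; every later touch only reports "Could not initialize class".
// Stand-in for a class whose static initializer fails (compare ProtobufUtil above).
object Broken {
  val value: Int = throw new RuntimeException("real root cause lives here")
}

object InitFailureDemo {
  def main(args: Array[String]): Unit = {
    // First access: java.lang.ExceptionInInitializerError (wraps the real cause).
    try Broken.value catch { case t: Throwable => println(t) }
    // Any later access: java.lang.NoClassDefFoundError: Could not initialize class Broken$
    try Broken.value catch { case t: Throwable => println(t) }
  }
}
So the fix is to dig out the first ExceptionInInitializerError in the executor logs; in this case SPARK_CLASSPATH working while --jars does not suggests the static initializer depends on something being visible at executor JVM startup rather than after the driver ships the jars.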