I'm having an issue when trying to run a jar using spark-submit. This is my sbt file:
name := "Reading From Mongo Project"
version := "1.0"
scalaVersion := "2.10.4"
libraryDependencies += "org.mongodb" %% "casbah" % "2.5.0"
I'm using
sbt package
to create the jar file, and everything looks good. Then I execute it this way:
spark-submit --class "ReadingFromMongo" --master local /home/bigdata1/ReadingFromMongoScala/target/scala-2.10/reading-from-mongo-project_2.10-1.0.jar
And got this error:
Error: application failed with exception
java.lang.NoClassDefFoundError: com/mongodb/casbah/Imports$
at ReadingFromMongo$.main(ReadingFromMongo.scala:6)
at ReadingFromMongo.main(ReadingFromMongo.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:577)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:174)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:197)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: com.mongodb.casbah.Imports$
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 11 more
My ReadingFromMongo class is this one:
import com.mongodb.casbah.Imports._

object ReadingFromMongo {
  def main(args: Array[String]) {
    val mongoClient = MongoClient("mongocluster", 27017)
    val db = mongoClient("Grupo12")
    val coll = db("test")
    println("\nTotal: " + coll.count() + "\n")
  }
}
I don't know why this is happening. This is the first time I'm facing this kind of problem.
I hope someone can help me.
Thanks a lot.
sbt package creates a jar containing only your code, excluding its dependencies, so Spark has no way to find the Mongo classes.
You need to either put Casbah (and any other required dependencies) on the classpath, or build a "fat jar" that bundles the dependency classes.
The sbt-assembly plugin will help you if you choose the second approach.
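A minimal sketch of the sbt-assembly route (the plugin version is only an example; pick a release from the sbt-assembly project that matches your sbt version):

// project/plugins.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")

// build.sbt stays the same, including the Casbah dependency
name := "Reading From Mongo Project"
version := "1.0"
scalaVersion := "2.10.4"
libraryDependencies += "org.mongodb" %% "casbah" % "2.5.0"

Running sbt assembly instead of sbt package then produces a single jar under target/scala-2.10/ (with an -assembly suffix) that bundles Casbah, and that is the jar you pass to spark-submit. For the first approach, you can instead pass the Casbah jars to spark-submit via its --jars option so they end up on the driver and executor classpath.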
Related
I have an Apache Spark 2.0 application written in Scala (2.11.12), built with sbt 1.2.8. When I try to run the app in IntelliJ (2020.3.2 Ultimate), I get the following error:
Exception in thread "main" java.lang.NoClassDefFoundError: com/google/common/util/concurrent/internal/InternalFutureFailureAccess
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:756)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
.....
Caused by: java.lang.ClassNotFoundException: com.google.common.util.concurrent.internal.InternalFutureFailureAccess
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
From googling and searching Stack Overflow, it seems this is caused by some weird Guava dependency issue. I have these added to my Dependencies.scala:
dependencies += "com.google.guava" % "guava" % "30.1-jre"
dependencies += "com.google.guava" % "failureaccess" % "1.0"
That didn't solve the issue. I also tried adding "com.google.guava" % "listenablefuture" % "1.0" to the dependencies, but that didn't help either. I tried File -> Invalidate Caches/Restart in IntelliJ, but I still get the issue.
Could someone please help?
In my case, adding com.google.guava:failureaccess as an external library to my project (File -> Project Structure -> Libraries) helped.
I'm new to Scala and Spark, and I've been frustrated by how hard it has been to get things working in IntelliJ. Currently, I can't get the code below to run. I'm sure it's something simple, but I can't get it to work.
I'm trying to run:
import org.apache.spark.{SparkConf, SparkContext}

object TestScala {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
    conf.setAppName("Datasets Test")
    conf.setMaster("local[2]")
    val sc = new SparkContext(conf)
    println(sc)
  }
}
The error I get is:
Exception in thread "main" java.lang.NoSuchMethodError: scala.Predef$.refArrayOps([Ljava/lang/Object;)Lscala/collection/mutable/ArrayOps;
at org.apache.spark.util.Utils$.getCallSite(Utils.scala:1413)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:77)
at TestScala$.main(TestScala.scala:13)
at TestScala.main(TestScala.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147)
My build.sbt file:
name := "sparkBook"
version := "1.0"
scalaVersion := "2.12.1"
Change your scalaVersion to 2.11.8 and add the Spark dependency to your build.sbt:
libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.0.2"
One more scenario: IntelliJ is pointing at Scala 2.12.4 while all the Maven/sbt dependencies are built for Scala 2.11.8. I stepped back from 2.12.4 to 2.11.8 in IntelliJ's global libraries, and it started working.
Details:
My Maven pom.xml pointed to Scala 2.11.8, but the Scala SDK in IntelliJ's global libraries was 2.12.4, which caused
java.lang.NoSuchMethodError: scala.Predef$.refArrayOps([Ljava/lang/Object;)Lscala/collection/mutable/ArrayOps;
Stepping back to 2.11.8 in Global Libraries fixed it. That's it: problem solved, no more errors when running that program.
Conclusion: Fixing the Maven dependencies alone does not solve the problem; you also have to configure the matching Scala SDK in the global libraries, since the error occurs at IntelliJ run time when running a Spark program locally.
If you use Spark 2.4.3, you need to use Scala 2.11, even though the Spark website (https://spark.apache.org/docs/latest/) says to use Scala 2.12.
This avoids the java.lang.NoSuchMethodError on scala.Predef$.refArrayOps shown above.
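For completeness, a sketch of that combination (2.11.12 is just an example 2.11.x patch release):

// build.sbt
scalaVersion := "2.11.12"
// resolves to spark-core_2.11 for Spark 2.4.3
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.4.3"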
I am trying to use Scala 2.12.0-M5 and Akka 2.4.7 in a project, but I get this error when I try to start Akka. I also tried using M4.
I am sure I must be missing something in my setup, as this clearly should work. It is pretty much the same setup I had working with Scala 2.11.8 and Akka 2.4.6.
Any help would be appreciated, thanks!
build.sbt:
name := "AKKA-2.4.8"
version := "1.0"
scalaVersion := "2.12.0-M5"
// https://mvnrepository.com/artifact/com.typesafe.akka/akka-actor_2.11
libraryDependencies += "com.typesafe.akka" % "akka-actor_2.11" % "2.4.8"
code:
package testing

import akka.actor.ActorSystem

/**
 * Created by on 7/8/16.<br>
 * <br>
 * AkkaActor demonstrates my problem when starting AKKA
 */
object AkkaActorStarter extends App {
  val actorSystem = ActorSystem("testAkka")
}
Error:
[error] (run-main-0) java.lang.NoClassDefFoundError: scala/Product$class
...
Caused by: java.lang.ClassNotFoundException: scala.Product$class
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at akka.util.Timeout.<init>(Timeout.scala:13)
at akka.actor.ActorSystem$Settings.<init>(ActorSystem.scala:171)
at akka.actor.ActorSystemImpl.<init>(ActorSystem.scala:522)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:142)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:109)
at testing.AkkaActorStarter$.delayedEndpoint$testing$AkkaActorStarter$1(AkkaActorStarter.scala:11)
at testing.AkkaActorStarter$delayedInit$body.apply(AkkaActorStarter.scala:10)
at scala.Function0.apply$mcV$sp$(Function0.scala:34)
at scala.Function0.apply$mcV$sp(Function0.scala:34)
at scala.App.$anonfun$main$1$adapted(App.scala:76)
at scala.collection.immutable.List.foreach(List.scala:376)
at scala.App.main$(App.scala:76)
at scala.App.main(App.scala:74)
at testing.AkkaActorStarter.main(AkkaActorStarter.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
The problem is that you are using Akka built for Scala 2.11 (akka-actor_2.11) with Scala 2.12. Scala minor versions are not binary compatible, so you have to use the Akka library compiled for your exact Scala version, 2.12.0-M5: "com.typesafe.akka" % "akka-actor_2.12.0-M5" % "2.4.8". Or use %%, which selects the proper artifact according to your scalaVersion: "com.typesafe.akka" %% "akka-actor" % "2.4.8".
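Put together, a corrected build.sbt sketch using the coordinates from the answer above (for pre-release Scala versions such as 2.12.0-M5, %% uses the full milestone suffix):

// build.sbt
name := "AKKA-2.4.8"
version := "1.0"
scalaVersion := "2.12.0-M5"
// resolves to akka-actor_2.12.0-M5, matching scalaVersion exactly
libraryDependencies += "com.typesafe.akka" %% "akka-actor" % "2.4.8"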
I'm currently trying to execute some Scala code with Apache Spark in yarn-client mode against a Cloudera cluster, but the sbt run execution is aborted by the following Java exception:
[error] (run-main-0) org.apache.spark.SparkException: YARN mode not available ?
org.apache.spark.SparkException: YARN mode not available ?
at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:1267)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:199)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:100)
at SimpleApp$.main(SimpleApp.scala:7)
at SimpleApp.main(SimpleApp.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.scheduler.cluster.YarnClientClusterScheduler
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:191)
at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:1261)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:199)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:100)
at SimpleApp$.main(SimpleApp.scala:7)
at SimpleApp.main(SimpleApp.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
[trace] Stack trace suppressed: run last compile:run for the full output.
java.lang.RuntimeException: Nonzero exit code: 1
at scala.sys.package$.error(package.scala:27)
[trace] Stack trace suppressed: run last compile:run for the full output.
[error] (compile:run) Nonzero exit code: 1
15/11/24 17:18:03 INFO network.ConnectionManager: Selector thread was interrupted!
[error] Total time: 38 s, completed 24-nov-2015 17:18:04
I suppose the prebuilt Apache Spark distribution is built with YARN support, because if I run spark-submit in yarn-client mode there is no Java exception anymore; however, YARN does not seem to allocate any resources, and I get the same message every second: INFO Client: Application report for application_1448366262851_0022 (state: ACCEPTED). I suppose this is due to a configuration issue.
I googled this last message, but I can't figure out which YARN configuration I have to modify (nor where) to run my program with Spark on YARN.
Context:
Platform Operating System : Windows 7 x64 Pro
Development is done in Scala IDE (Eclipse) and built with sbt 0.13.9 and Scala 2.10.4.
The Hadoop client is Apache Hadoop 2.6.0, compiled under Windows 7 for a 64-bit architecture.
HDFS and MapReduce code developed and executed from the Windows platform runs successfully; the HDFS and YARN client configuration has been deployed to the Windows platform.
The Spark software used is the prebuilt version of Apache Spark 1.3.0 for Hadoop 2.4+, available at spark.apache.org, but Apache does not specify whether it is built with YARN support or not.
Scala Test Program:
A basic Scala program that counts the lines of a local text file in which a specified word appears. It works when Spark is executed in local mode.
UPDATE
Well, the sbt job failed because hadoop-client.jar and spark-yarn.jar were not on the classpath when packaged and executed by sbt.
Now, sbt run asks for the environment variables SPARK_YARN_APP_JAR and SPARK_JAR, with my build.sbt configured like this:
name := "File Searcher"
version := "1.0"
scalaVersion := "2.10.4"
libraryDependencies += "org.apache.spark" %% "spark-core" % "0.9.1"
libraryDependencies += "org.apache.spark" %% "spark-yarn" % "0.9.1" % "runtime"
libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.6.0" % "runtime"
libraryDependencies += "org.apache.hadoop" % "hadoop-yarn-client" % "2.6.0" % "runtime"
resolvers += "Maven Central" at "https://repo1.maven.org/maven2"
Is there any way to configure these variables "automatically"? I mean, I can set SPARK_JAR, since that jar comes with the Spark installation, but what about SPARK_YARN_APP_JAR?
When I set those variables manually, I notice that Spark doesn't take my custom configuration into account, even if I set the YARN_CONF_DIR variable. Is there a way to tell sbt to use my local Spark configuration?
In case it helps, here is the current (ugly) code I'm executing:
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._

object SimpleApp {
  def main(args: Array[String]) {
    val logFile = "src/data/sample.txt"
    val sc = new SparkContext("yarn-client", "Simple App", "C:/spark/lib/spark-assembly-1.3.0-hadoop2.4.0.jar",
      List("target/scala-2.10/file-searcher_2.10-1.0.jar"))
    val logData = sc.textFile(logFile, 2).cache()
    val numTHEs = logData.filter(line => line.contains("the")).count()
    println("Lines with the: %s".format(numTHEs))
  }
}
Thanks!
Cheloute
Well, I finally found what my issue was.
Firstly, the sbt project must include spark-core and spark-yarn as runtime dependencies (see the build.sbt sketch at the end of this answer).
Next, the Windows yarn-site.xml must specify, as the YARN classpath, the Cloudera cluster's shared classpath (the classpath valid on the Linux nodes) instead of the Windows classpath. That lets the YARN ResourceManager know where its files are, even when the job is submitted from Windows.
Finally, delete the topology.py section from the Windows core-site.xml file so that Spark does not try to execute it; it is not needed for this to work.
Don't forget to delete any mapred-site.xml so that YARN/MR2 is used if needed, and to pass on the spark-submit command line all the Spark properties that are normally defined in spark-defaults.conf.
That's it. Everything else should work.
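For the first point, a minimal build.sbt sketch of what those dependencies might look like (the versions are my assumption, chosen to match the prebuilt Spark 1.3.0 and Hadoop 2.6.0 mentioned in the question; check Maven Central for the exact artifacts available):

// build.sbt
name := "File Searcher"
version := "1.0"
scalaVersion := "2.10.4"
// Spark core for compiling, plus the YARN and Hadoop client pieces at runtime
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.3.0"
libraryDependencies += "org.apache.spark" %% "spark-yarn" % "1.3.0" % "runtime"
libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.6.0" % "runtime"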
I'm trying to create a basic Scala project in IntelliJ using the Activator UI.
I import the project into the IDE and it compiles fine,
but when I try to run simple code I get:
Exception in thread "main" java.lang.NoClassDefFoundError: scala/collection/GenTraversableOnce$class
at akka.util.Collections$EmptyImmutableSeq$.<init>(Collections.scala:15)
at akka.util.Collections$EmptyImmutableSeq$.<clinit>(Collections.scala)
at akka.japi.Util$.immutableSeq(JavaAPI.scala:209)
at akka.actor.ActorSystem$Settings.<init>(ActorSystem.scala:150)
at akka.actor.ActorSystemImpl.<init>(ActorSystem.scala:470)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:111)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:104)
at reactivemongo.api.MongoDriver$.reactivemongo$api$MongoDriver$$defaultSystem(api.scala:378)
at reactivemongo.api.MongoDriver$$anonfun$3.apply(api.scala:305)
at reactivemongo.api.MongoDriver$$anonfun$3.apply(api.scala:305)
at scala.Option.getOrElse(Option.scala:120)
at reactivemongo.api.MongoDriver.<init>(api.scala:305)
at example.App$.main(App.scala:10)
at example.App.main(App.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
When the project is loaded there is an error in the project structure:
sbt:scala 2.11.2 not in use
What went wrong with the Activator UI IntelliJ project generation?
thanks
miki
I came across this when trying to run Spark. It is an incompatibility between the Scala version that was used to compile the dependency and the Scala version used to run your project.
Removing my Scala version specification was a hacky way to solve the problem:
// build.sbt
name := "SparkTest"
version := "1.0"
scalaVersion := "2.11.4" <-- remove this
libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "1.3.0"