I tried the solutions suggested in similar existing posts, but none of them works for me :-( I'm getting really hopeless, so I decided to post this as a new question.
I tried a tutorial (link below) on building a first Scala or Java application with Spark in a Cloudera VM.
This is my spark-submit command and its output:
[cloudera@quickstart sparkwordcount]$ spark-submit --class com.cloudera.sparkwordcount.SparkWordCount --master local /home/cloudera/src/main/scala/com/cloudera/sparkwordcount/target/sparkwordcount-0.0.1-SNAPSHOT.jar
java.lang.ClassNotFoundException: com.cloudera.sparkwordcount.SparkWordCount
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.spark.util.Utils$.classForName(Utils.scala:176)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:689)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
I also tried updating the pom.xml file with my actual CDH, Spark, and Scala versions, but it still doesn't work.
When I extract the jar file previously generated by Maven using mvn package, I cannot find any .class file inside its hierarchy of folders.
Sorry, I am a bit new to Cloudera and Spark. I basically tried to follow this tutorial with Scala: https://blog.cloudera.com/blog/2014/04/how-to-run-a-simple-apache-spark-app-in-cdh-5/
I checked the class, folder, and Scala file names quite a few times very closely, especially for lower/uppercase issues; nothing seemed wrong.
I opened my jar: there is some file hierarchy, and in the deepest folder I can find the pom.xml file again, but I cannot see any .class files anywhere inside the jar. Does it mean the compilation via "mvn package" didn't actually work, even though the console output said the build was successful?
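For reference, this is how I checked whether the jar contains any compiled classes (a minimal check; the jar path matches my build output above):

jar tf /home/cloudera/src/main/scala/com/cloudera/sparkwordcount/target/sparkwordcount-0.0.1-SNAPSHOT.jar | grep '\.class'

For me this prints nothing, which matches what I saw when extracting the jar, so my assumption (not confirmed) is that the Scala sources were never compiled during mvn package.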
I was having the same issue. Try rerunning after changing the class name from
--class com.cloudera.sparkwordcount.SparkWordCount
to
--class SparkWordCount
The full command I used looked like:
spark-submit --class SparkWordCount --master local --deploy-mode client --executor-memory 1g --name wordcount --conf "spark.app.id=wordcount" target/sparkwordcount-0.0.1-SNAPSHOT.jar /user/cloudera/inputfile.txt 2
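For context, the reason this works (a sketch, assuming the tutorial file was saved without a package declaration): the name passed to --class is the fully qualified class name, which comes from the package line in the source file, not from the folder the file happens to sit in:

// SparkWordCount.scala
// with the package line below present, submit with:
//   --class com.cloudera.sparkwordcount.SparkWordCount
// with no package line at all, submit with:
//   --class SparkWordCount
package com.cloudera.sparkwordcount

object SparkWordCount {
  def main(args: Array[String]): Unit = {
    // word count logic goes here
  }
}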
Related
After building the app jar for Flink, when I submit the job I see the error below, even though the jar is available and added to sbt as well as to the enabled plugin list:
Submitting job...
/opt/flink/bin/flink run --jobmanager flink-archiver-kinesis2iceberg-jobmanager:8081 --class io.archiver.Job --parallelism 2 --detached /opt/artifacts/sp-archive-scala-2.12.jar
java.lang.NoClassDefFoundError: Could not initialize class org.apache.flink.runtime.util.HadoopUtils
at io.archiver.Job$.main(Job.scala:54)
at io.archiver.Job.main(Job.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
I am following the Spark Quick Start tutorial page.
I reached the last step and compiled my file to a JAR that should be ready to go.
Running my application from the terminal:
spark-submit --class "SimpleApp" --master local[4] /usr/local/spark/target/scala-2.11
Gives the following error:
2018-10-07 20:29:17 WARN Utils:66 - Your hostname, test-ThinkPad-X230 resolves to a loopback address: 127.0.1.1; using 172.17.147.32 instead (on interface wlp3s0)
2018-10-07 20:29:17 WARN Utils:66 - Set SPARK_LOCAL_IP if you need to bind to another address
2018-10-07 20:29:17 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
java.lang.ClassNotFoundException: SimpleApp
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.util.Utils$.classForName(Utils.scala:239)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:851)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2018-10-07 20:29:18 INFO ShutdownHookManager:54 - Shutdown hook called
2018-10-07 20:29:18 INFO ShutdownHookManager:54 - Deleting directory /tmp/spark-08d94e7e-ae24-4892-a704-727a6caa1733
Why won't it find my SimpleApp class? I've tried giving it the full path. My SimpleApp.scala is in my root Spark folder, /usr/local/spark/
The best way to deploy your app to Spark is to use the sbt-assembly plugin. It will create a fat jar that contains all your dependencies. After packaging your app, you have to point spark-submit at the jar directly.
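A minimal setup sketch (the plugin and Spark versions are illustrative; check the sbt-assembly README for current ones):

// project/plugins.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")

// build.sbt -- mark Spark itself as provided so it is not bundled into the fat jar
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.1" % "provided"

Then run sbt assembly and point spark-submit at the produced jar file itself, not at the target/scala-2.11 directory:

spark-submit --class "SimpleApp" --master local[4] target/scala-2.11/<assembly-jar-name>.jar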
Good luck.
Add your application JAR to your spark-submit call. A spark-submit invocation looks as below:
./bin/spark-submit \
  --class <main-class> \
  --master <master-url> \
  --deploy-mode <deploy-mode> \
  <application-jar> \
  [application-args]
application-jar is the JAR file that you have built.
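For example, filled in for the Quick Start question above (the jar name is an assumption; the tutorial's sbt build typically produces something like simple-project_2.11-1.0.jar, and the path must point at the jar file itself, not at the target/scala-2.11 directory):

./bin/spark-submit \
  --class SimpleApp \
  --master local[4] \
  --deploy-mode client \
  /usr/local/spark/target/scala-2.11/simple-project_2.11-1.0.jar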
Hope this helps :)
I have packaged an application in a jar file using sbt for this purpose.
When I run the app from the IDE (IntelliJ) it works without issues.
However, when I try to run the jar directly, I have 2 different issues.
When I run it from spark-submit, I get:
[cloudera@quickstart bin]$ spark-submit --class com.my.app.main --master local[0] /home/cloudera/Projects/myapp/target/scala-2.11/myapp.jar
Exception in thread "main" java.lang.NoClassDefFoundError: com/microsoft/sqlserver/jdbc/SQLServerDataSource
When I run it from java I get:
[cloudera@quickstart scala-2.11]$ java -jar myapp.jar
Exception in thread "main" java.lang.NoClassDefFoundError: scala/collection/Seq
at com.my.app.main$.main(main.scala:13)
at com.my.app.main.main(main.scala)
Caused by: java.lang.ClassNotFoundException: scala.collection.Seq
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 2 more
Note that the JDBC driver for SQL Server is already placed in the lib folder, where it's supposed to be automatically picked up by sbt when it generates the package.
Any help will be very much appreciated.
Thank you.
EDIT: My question is not answered on that post
Taken from https://stackoverflow.com/a/52546145/1498109; I would modify it for your case:
spark-submit --class com.my.app.main \
  --master local[0] \
  --jars path_to/sqljdbc42.jar \
  /home/cloudera/Projects/myapp/target/scala-2.11/myapp.jar

Note that options like --jars must come before the application jar; anything after the jar is passed to your app as arguments.
This one worked for me; download the JDBC jar from the official Microsoft site.
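Alternatively, you can declare the driver as a managed dependency so sbt bundles it when packaging, instead of relying on the unmanaged lib folder (a sketch; the exact artifact coordinates and version are an assumption, check Maven Central):

// build.sbt
libraryDependencies += "com.microsoft.sqlserver" % "mssql-jdbc" % "6.4.0.jre8"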
I'm doing a basic program implementing Drools. The program runs fine from an application run configuration, but when I try to run the JAR, I face an error.
The error I get on the terminal:
Suhita-MacBookPro:Drool-CreditScore-Sample sgoswami$ spark-submit --class main.scala.suhita.Sample --master local[*] target/DroolsMaven-1.0-SNAPSHOT.jar
java.lang.ClassNotFoundException: main.scala.suhita.Sample
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.util.Utils$.classForName(Utils.scala:230)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:712)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Drools-Project
  >src
    >main
      >scala
        >suhita
          - Sample
          - Applicant
      >META-INF
        - kmodule.xml
        - manifest.MF
      >resources.rules
        - rules
This happens sometimes when your classes are not loaded properly. I have seen it a few times recently. There are two ways to fix this issue:
Refresh the classes, which may include an sbt clean compile (see the sketch below),
or you can try reloading IDEA classes from the top menu. It's very vague to say, but sometimes restarting IntelliJ also works, as it reloads all the classes again.
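For the first option, the command sequence I mean is something like this (assuming an sbt project, run from the project root):

sbt clean compile package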
I am sure one of the methods will work. Let me know if it persists.
Looking at the tree structure of your project
Drools-Project
  >src
    >main
      >scala
        >suhita
          - Sample
          - Applicant
you don't need to provide the class name starting from main, as main.scala.suhita.Sample.
Simply use suhita.Sample as the class name:
spark-submit --class suhita.Sample --master local[*] target/DroolsMaven-1.0-SNAPSHOT.jar
and it should work
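You can confirm the correct name by listing the jar entries; the entry path mirrors the package, and src/main/scala is only a source root, not part of the package (the expected entry below is an assumption based on your tree):

jar tf target/DroolsMaven-1.0-SNAPSHOT.jar | grep Sample
# expected entry (assumption): suhita/Sample.class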
I am trying to run a Spark 2.1 application on a Cloudera cluster which does not yet support Spark 2.
I was following answers:
https://stackoverflow.com/a/44434835/1549135
https://stackoverflow.com/a/41359175/1549135
These seem to be correct; however, I get a strange error during spark-submit:
Exception in thread "main" java.lang.NoSuchMethodError: scala.runtime.IntRef.create(I)Lscala/runtime/IntRef;
at scopt.OptionParser.parse(options.scala:370)
at com.rxcorp.cesespoke.config.WasherConfig$.parse(WasherConfig.scala:22)
at com.rxcorp.cesespoke.Process$.main(Process.scala:27)
at com.rxcorp.cesespoke.Process.main(Process.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:729)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Using the hint from Denis Makarenko's answer, I have added:
spark-submit \
...
--conf 'spark.executor.extraJavaOptions=-verbose:class' \
--conf 'spark.driver.extraJavaOptions=-verbose:class' \
...
just to see that, as said in the answer, we are indeed running on the wrong classpath here! Checking the logs, I could clearly find:
[Loaded scala.runtime.IntRef from file:/opt/cloudera/parcels/CDH-5.8.4-1.cdh5.8.4.p0.5/jars/spark-assembly-1.6.0-cdh5.8.4-hadoop2.6.0-cdh5.8.4.jar]
This is obviously the source of the problem.
After carefully checking the given posts from the beginning:
You should use spark-submit from the newer Spark installation (I'd
suggest using the latest and greatest 2.1.1 as of this writing) and
bundle all Spark jars as part of your Spark application.
So this is what I will do!
I also recommend reading:
http://www.mostlymaths.net/2017/05/shading-dependencies-with-sbt-assembly.html
Exception in thread "main" java.lang.NoSuchMethodError: scala.runtime.IntRef.create(I)Lscala/runtime/IntRef;
NoSuchMethodError often indicates a jar version mismatch. Since the missing method is in the scala.runtime package, most likely the problem is caused by compiling the code with one version of Scala, say 2.11, and running it with another one (2.10).
Check the Scala version in your build.sbt (scalaVersion := ...) and run the JVM with the -verbose:class parameter to make sure these Scala versions match.
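As a minimal sketch of what to check in build.sbt (version numbers are illustrative; the point is that the Scala binary version must match the one your Spark distribution was built against):

// build.sbt
scalaVersion := "2.11.8"  // Spark 2.1 is built against Scala 2.11 by default
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.1.1" % "provided"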