How to solve the SparkSession ClassNotFoundException in an IntelliJ Scala project

While submitting a Scala Spark project I am getting this error:
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.SparkSession$
at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
After many repeated attempts I have not been able to solve this problem. Can anybody help me get my project to submit?
I have tried invalidating the caches and submitting the project again, but the problem is still not solved.
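SparkSession lives in the spark-sql module, so this error usually means the Spark runtime executing the job does not provide that module: either the dependency is missing from the build, or the cluster's Spark version doesn't match the one you compiled against. A minimal build.sbt sketch that declares it (the Scala and Spark versions here are assumptions; match them to the Spark distribution you submit to):
// build.sbt -- minimal sketch; versions are illustrative
ThisBuild / scalaVersion := "2.12.18"
libraryDependencies ++= Seq(
  // "provided" is correct for spark-submit (the cluster supplies Spark);
  // drop it if you run the main class directly from IntelliJ.
  "org.apache.spark" %% "spark-sql" % "3.5.0" % "provided"
)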

Related

Sudden problem with Apache log4j Appender when trying to launch SBT

I was executing a Scala program with sbt and needed to stop the execution, so I hit Ctrl+C, which also ends sbt itself. I've done this a thousand times, but this time sbt wouldn't restart, and gives me this error:
java.lang.NoClassDefFoundError: org/apache/logging/log4j/core/Appender
at sbt.StandardMain$.initialGlobalLogging(Main.scala:114)
at sbt.StandardMain$.initialState(Main.scala:136)
at sbt.xMain.run(Main.scala:70)
at xsbt.boot.Launch$.$anonfun$run$1(Launch.scala:149)
at xsbt.boot.Launch$.withContextLoader(Launch.scala:176)
at xsbt.boot.Launch$.run(Launch.scala:149)
at xsbt.boot.Launch$.$anonfun$apply$1(Launch.scala:44)
at xsbt.boot.Launch$.launch(Launch.scala:159)
at xsbt.boot.Launch$.apply(Launch.scala:44)
at xsbt.boot.Launch$.apply(Launch.scala:21)
at xsbt.boot.Boot$.runImpl(Boot.scala:78)
at xsbt.boot.Boot$.run(Boot.scala:73)
at xsbt.boot.Boot$.main(Boot.scala:21)
at xsbt.boot.Boot.main(Boot.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.logging.log4j.core.Appender
at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:471)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:588)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
... 14 more
[error] [launcher] error during sbt launcher: java.lang.NoClassDefFoundError: org/apache/logging/log4j/core/Appender
I tried installing a new version of SBT but that didn't work, and I get the same error.
I'm at a complete loss as to how to fix this problem. I really don't even know what the problem is. Thanks for any help.
I was able to get things working, but I can't guarantee this is a fix for everyone: deleting my .sbt folder and restarting sbt worked. It created a fresh .sbt folder and everything is working correctly now. I only use sbt for compiling Scala code, so there isn't much that depends on it. Definitely back up your old .sbt folder if you're going to try this route.

Spark submit in IntelliJ of project jar yields Class not found error

I'm writing a basic program implementing Drools. The program runs fine with an application configuration, but when I try to run the JAR, I get an error.
The error I get on the terminal:
Suhita-MacBookPro:Drool-CreditScore-Sample sgoswami$ spark-submit --class main.scala.suhita.Sample --master local[*] target/DroolsMaven-1.0-SNAPSHOT.jar
java.lang.ClassNotFoundException: main.scala.suhita.Sample
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.util.Utils$.classForName(Utils.scala:230)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:712)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Drools-Project
>src
>main
>scala
>suhita
- Sample
- Applicant
>META-INF
-kmodule.xml
-manifest.MF
>resources.rules
-rules
This sometimes happens when your classes are not loaded properly; I have seen it a few times recently. There are two ways to fix the issue:
Refresh the classes, which may include an sbt clean compile,
or reload the project classes from IntelliJ IDEA's top menu. It's vague advice, but sometimes restarting IntelliJ also works, as it reloads all the classes again.
I am sure one of these methods will work. Let me know if it persists.
Looking at the tree structure of your project
Drools-Project
>src
>main
>scala
>suhita
- Sample
- Applicant
You don't need to give the class name starting from main, as main.scala.suhita.Sample.
The source root is src/main/scala, so simply use suhita.Sample as the class name:
spark-submit --class suhita.Sample --master local[*] target/DroolsMaven-1.0-SNAPSHOT.jar
and it should work.
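For --class to resolve, the package declaration inside the source file also has to match the path below the source root. A minimal sketch of what Sample would look like (the object body is illustrative, not from the original project):
// src/main/scala/suhita/Sample.scala
package suhita // note: suhita, not main.scala.suhita
object Sample {
  def main(args: Array[String]): Unit = {
    println("Sample running") // illustrative body
  }
}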

Trying to run Scala test, getting java.lang.ClassNotFoundException: org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner

I have a Gradle project in IntelliJ IDEA 2016.2. Every time I run the Scala tests in the project, I get the following exception:
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.intellij.rt.execution.CommandLineWrapper.main(CommandLineWrapper.java:48)
Caused by: java.lang.ClassNotFoundException: org.jetbrains.plugins.scala.testingSupport.scalaTest.ScalaTestRunner
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:123)
... 5 more
I checked the versions of the dependencies and I have added the Scala SDK to the project module as well. I also added the Scala plugin to the Gradle file and installed the Scala plugin in IntelliJ IDEA. Also, the tests run without an error on my colleague's computer so we have no idea what the error could be.
Found the cause: I have an accented letter in my user directory's name, and IDEA is always trying to use some file from AppData under that directory. I have already changed the idea.properties file, but it has no effect regarding that file.
A possible workaround is using Gradle (or Maven/sbt/etc.). In my case I use Gradle: I just add @RunWith(classOf[JUnitRunner]) to the Scala class I want to test, then execute Gradle's test task.
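A minimal sketch of that workaround, assuming ScalaTest 3.0 or earlier (in ScalaTest 3.1+ the runner moved to org.scalatestplus.junit.JUnitRunner); the suite and test are illustrative:
import org.junit.runner.RunWith
import org.scalatest.FunSuite
import org.scalatest.junit.JUnitRunner
// Running the suite through JUnit lets Gradle's test task pick it up,
// bypassing IntelliJ's ScalaTestRunner entirely.
@RunWith(classOf[JUnitRunner])
class SampleSpec extends FunSuite {
  test("addition works") {
    assert(1 + 1 == 2)
  }
}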
For me, the fix for the command-line length limitation was crucial. IDEA offers about three ways to overcome a too-long command (the "Shorten command line" option in the run configuration); choose a different one and check.

ClassNotFoundException for Spark job on Yarn-cluster mode

So I am trying to run a Spark job in yarn-cluster mode, kicked off via an Oozie workflow, but I have been encountering the following error (relevant stack trace below):
java.sql.SQLException: ERROR 103 (08004): Unable to establish connection.
at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:388)
at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.openConnection(ConnectionQueryServicesImpl.java:296)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.access$300(ConnectionQueryServicesImpl.java:179)
at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:1917)
at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:1896)
at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:77)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:1896)
at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:180)
at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.connect(PhoenixEmbeddedDriver.java:132)
at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:151)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:208)
...
Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:240)
at org.apache.hadoop.hbase.client.ConnectionManager.createConnection(ConnectionManager.java:414)
at org.apache.hadoop.hbase.client.ConnectionManager.createConnectionInternal(ConnectionManager.java:323)
at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:144)
at org.apache.phoenix.query.HConnectionFactory$HConnectionFactoryImpl.createConnection(HConnectionFactory.java:47)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.openConnection(ConnectionQueryServicesImpl.java:294)
... 28 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
... 33 more
Caused by: java.lang.UnsupportedOperationException: Unable to find org.apache.hadoop.hbase.ipc.controller.ClientRpcControllerFactory
at org.apache.hadoop.hbase.util.ReflectionUtils.instantiateWithCustomCtor(ReflectionUtils.java:36)
at org.apache.hadoop.hbase.ipc.RpcControllerFactory.instantiate(RpcControllerFactory.java:58)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.createAsyncProcess(ConnectionManager.java:2317)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:688)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:630)
... 38 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.ipc.controller.ClientRpcControllerFactory
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at org.apache.hadoop.hbase.util.ReflectionUtils.instantiateWithCustomCtor(ReflectionUtils.java:32)
... 42 more
Some background information:
The job runs on Spark 1.4.1 (the correct spark.yarn.jar field is specified in the spark.conf file).
oozie.libpath is set to the hdfs directory in which the jar of my program resides.
org.apache.hadoop.hbase.ipc.controller.ClientRpcControllerFactory, the class not found, exists in phoenix-4.5.1-HBase-1.0-client.jar. I've specified this jar in spark.driver.extraClassPath and spark.executor.extraClassPath in my spark.conf file. I've also added the phoenix-core dependency in my pom file, so that the class exists in my shaded project jar as well.
Observations so far:
adding an extra field spark.driver.userClassPathFirst to my spark.conf file and setting it to true gets rid of the ClassNotFoundException (see the sketch after this list). However, it also prevents me from initializing a Spark context (null pointer exception). From googling around it seems that this field messes up classpaths, so it may not be the way to go, since I cannot even initialize a Spark context this way.
I noticed that in the oozie stdout log, I do not see the classpath of the phoenix jar. So maybe for some reason spark.driver.extraClassPath and spark.executor.extraClassPath aren't actually picking up the jar as an extraClassPath? I do know that I'm specifying the correct jar file path, as other jobs have spark.conf files with the same parameters.
I found a hacky way to make the Phoenix jar show up in the classpath (in the Oozie stdout log) by copying the jar to the same directory as my program jar. This works whether or not spark.executor.extraClassPath is changed to point to the new jar location. However, the ClassNotFoundException persists, even though I clearly see the ClientRpcControllerFactory class when I unzip the jar.
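For reference, a minimal sketch of the userClassPathFirst flag mentioned above, set programmatically (it can equally go in spark.conf or spark-submit --conf); the snippet is as it would appear inside a main method, and the app name is illustrative:
import org.apache.spark.SparkConf
val conf = new SparkConf()
  .setAppName("phoenix-job") // illustrative name
  // prefer classes from the user jar over the cluster's jars on the driver
  .set("spark.driver.userClassPathFirst", "true")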
Other things I've tried:
I tried using the sparkConf.setJars() and sparkContext.addJar() methods (see the sketch after this list), but still encountered the same error
added the jar in the spark.driver.extraClassPath field in my job properties file, but it hasn't seemed to help (the Spark docs indicate that this field matters when running in client mode, so it may not be relevant for my case)
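A minimal sketch of those two calls, as they would appear inside a main method; the HDFS path is an illustrative stand-in for wherever the Phoenix client jar lives:
import org.apache.spark.{SparkConf, SparkContext}
val jarPath = "hdfs:///libs/phoenix-4.5.1-HBase-1.0-client.jar" // illustrative path
// Ship the jar with the application via SparkConf...
val conf = new SparkConf()
  .setAppName("phoenix-job") // illustrative name
  .setJars(Seq(jarPath))
val sc = new SparkContext(conf)
// ...or register it after the context is created.
sc.addJar(jarPath)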
Any help/ideas/suggestions would be greatly appreciated.
I use CDH 5.5.1 + Phoenix 4.5.2 (both installed with parcels) and faced the same problem. I think the problem disappeared after I switched to client mode, but I can't verify this because I am now getting a different error in cluster mode.
I tried to trace the Phoenix source code and found some interesting things. I hope a Java/Scala expert can identify the root cause.
The PhoenixDriver class was loaded.
This shows the jar was found initially. After layers of classloader / context switches (?), the jar was lost from the classpath.
If I call Class.forName() on a non-existent class in my own program, there is no call to sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331). The stack looks like:
java.lang.ClassNotFoundException: NONEXISTINGCLASS
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
I copied Phoenix code into my program for testing. I still get the ClassNotFoundException if I call ConnectionQueryServicesImpl.init (ConnectionQueryServicesImpl.java:1896). However, a call to ConnectionQueryServicesImpl.openConnection (ConnectionQueryServicesImpl.java:296) returned a usable HBase connection. So it seems PhoenixContextExecutor was causing the loss of the jar, but I don't know how.
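A small self-contained probe of the kind used in that test, reporting which classloader (if any) serves the class from the stack trace:
// Illustrative debugging aid, not Phoenix code.
object ClassLoaderProbe {
  def main(args: Array[String]): Unit = {
    try {
      val cls = Class.forName("org.apache.hadoop.hbase.ipc.controller.ClientRpcControllerFactory")
      println(s"loaded by: ${cls.getClassLoader}")
    } catch {
      case e: ClassNotFoundException => println(s"not found: ${e.getMessage}")
    }
  }
}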
Source code of Cloudera Phoenix 4.5.2 : https://github.com/cloudera-labs/phoenix/blob/phoenix1-4.5.2_1.2.0/phoenix-core/src/main/java/org/apache/
(Not sure whether I should post a comment... but I have no reputation anyway)
So I managed to fix my issue and get my job to run. My solution is very hacky, but will post it here in case it may help others in the future.
Basically, the problem as I understand it was that the org.apache.hadoop.hbase.util.ReflectionUtils class, which is responsible for finding the ClientRpcControllerFactory class, was being loaded from some Cloudera directory on the cluster instead of from my own jar. When I set spark.driver.userClassPathFirst to true, it prioritized loading the ReflectionUtils class from my jar, and so it was able to locate the ClientRpcControllerFactory class. But that messed up some other classpaths and kept giving me a NullPointerException when I tried to initialize a SparkContext, so I looked for another solution.
I tried to figure out if it was possible to exclude all default cdh jars from being included in my classpath, but found that the value in spark.yarn.jar was pulling in all these cdh jars, and I definitely needed to specify that jar.
So the solution was to include all the classes under org.apache.hadoop.hbase from the Phoenix jar in the spark-assembly jar (the jar that spark.yarn.jar pointed to), which got rid of the original exception and did not give me an NPE when trying to initialize a SparkContext. I found that the ReflectionUtils class was now being loaded from the spark-assembly jar, and since ClientRpcControllerFactory was also included in that jar, it was able to find it. After this, I encountered a few more ClassNotFoundExceptions for Phoenix classes, so I put those classes into the spark-assembly jar as well.
Finally, I had a java.lang.RuntimeException: hbase-default.xml file seems to be for and old version of HBase problem (the typo is in the actual HBase message). I found that my application jar contained such a file, but changing hbase.defaults.for.version.skip to true didn't do anything. So I included another hbase-default.xml file in the spark-assembly jar with the skip flag set to true, and it finally worked.
Some observations:
I noticed that my spark-assembly jar was completely missing an org.apache.hadoop.hbase directory. A coworker told me that I should usually expect to find an hbase directory in a spark-assembly jar, so maybe I was working with a bad one. Edit: I checked a newly downloaded spark-assembly jar (v1.5.2) and it doesn't have it either, so maybe the org.apache.hadoop.hbase package is simply not included in it.
ClassNotFoundExceptions and classloader problems are hard to debug.

ClassNotFoundException when running my plugins in a new Eclipse configuration instance

I am running two Eclipse plugins that I developed. These plugins are successfully imported into the new workspace within the persisted container library. So far everything is OK; the problems started when I tried to test my plugins:
1) I can't use Eclipse's text auto-completion to navigate my plugins' classes or their methods.
2) When I run the test I get a ClassNotFoundException; here is the error:
Caused by: java.lang.ClassNotFoundException:
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Exception in thread "main"
Trying to remedy this problem, I applied the solution proposed by eran_levi in "Running my eclipse plugin end up with ClassNotFoundException", but it didn't work for me.
Any sort of help is really appreciated. Thank you in advance!
I wanted to be clever and didn't add all the plugins when defining the run configuration (only my two plugins and some Equinox plugins). When I added all the plugins, everything worked well.