Why does a Maven build work, while adding the Spark JARs as external JARs gives the compile error "object apache is not a member of package org"? - Eclipse

In Eclipse, while setting up Spark, even after adding the external JARs from spark-2.4.3-bin-hadoop2.7/jars/<_all.jar> to the build path,
the compiler complains: "object apache is not a member of package org".
Yes, building the dependencies via Maven or SBT would fix it. A related question has already been asked:
scalac compile yields "object apache is not a member of package org"
But the question here is: WHY does the traditional way fail like this?

If we refer to "Scala/Spark version compatibility", we can see a similar issue. The problem is that Scala is NOT backward compatible at the binary level across series such as 2.10, 2.11, and 2.12. Hence each Spark module is compiled against a specific Scala library version. But when we run from Eclipse, the Eclipse Scala environment may not be compatible with the particular Scala version that the Spark libraries we set up were built for.
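As a rough illustration of why a build tool avoids the mismatch, here is a minimal build.sbt sketch. It assumes the Spark 2.4.3 binaries from the question were built for Scala 2.11 and that 2.11.12 is a suitable patch release. Both values are assumptions to check against your own distribution (the `_2.xx` suffix in the jar file names shows the Scala series).

```scala
// build.sbt (sketch; the version numbers are assumptions, not taken from the question)

// Scala version the project is compiled with; its binary series (2.11) must match the Spark artifacts.
scalaVersion := "2.11.12"

// %% appends the Scala binary suffix, so this resolves spark-core_2.11 and spark-sql_2.11.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.4.3",
  "org.apache.spark" %% "spark-sql"  % "2.4.3"
)
```

With the external-JAR approach nothing ties the compiler version to the suffix of the jars, which is where the conflict can creep in.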

Related

Scala dependency in Spark/Pyspark

I want to install Spark on a machine and I don't know if I need Scala as a dependency.
I have installed Java (1.8) as the documentation says, but I don't know if I need any other dependencies.
Scala is included in Spark's distribution, so you don't need to install Scala to use it in the Spark REPL. The Scala JARs are included in the distribution.
If you are setting up an IDE for a Spark project, then you will need Scala as a dependency.

Chisel: Compiling Chisel library on Windows

I have been using sbt on Windows with a custom build.sbt script, together with an import Chisel._ in the top-level file, to generate Verilog from my Chisel source successfully.
I'm trying to get an IDE working on Windows to speed up Chisel development. I've gone with the Eclipse-based Scala IDE: http://scala-ide.org/download/sdk.html/
I want to compile the Chisel library so that the import Chisel._ can be resolved locally, without having to download the source from the repository and recompile it each time. When I download the Chisel-master repo from Git and include the src\main folder in my Scala project in the Scala IDE, I get lots of syntax errors in the Chisel Scala files that prevent me from building the project.
Has anyone done anything like this before on Windows, or does anyone have experience with the Scala IDE? It may just be a case of undefined symbols in the project configuration.
I'm not sure exactly what you did with build.sbt with respect to recompiling (I think it downloads the library only the first time, then caches it for the future). But I'm using Scala IDE for Chisel on Linux with the default build.sbt files, so maybe you can try to get it working out of the box first to help narrow down the issue.
Here are the steps I took to get Scala IDE to work with Chisel:
1. The latest Scala IDE uses 2.11.8, while the current Chisel repository defaults to 2.11.7, so I had to change every scalaVersion reference in the build.sbt files from 2.11.7 to 2.11.8.
2. I used sbteclipse (https://github.com/typesafehub/sbteclipse) to create an importable workspace that sets up the compilation dependencies.
3. The exception is chiselFrontEnd. For some reason, this package is not added as a dependency, so I have to add chiselFrontEnd manually as a Java Build Path dependency (Properties > Java Build Path, under Projects) for my own projects.
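The version change and the sbteclipse setup can be sketched as the following sbt fragments. The plugin coordinates are those published by the sbteclipse project. The plugin version shown (5.2.4) is an assumption, so check the project's README for the release that matches your sbt version.

```scala
// project/plugins.sbt (sketch; the plugin version is an assumption)
addSbtPlugin("com.typesafe.sbteclipse" % "sbteclipse-plugin" % "5.2.4")

// build.sbt: align the project with the Scala version bundled in Scala IDE
scalaVersion := "2.11.8"
```

Running `sbt eclipse` in the project directory then generates the .project and .classpath files that Eclipse can import.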
To resolve undefined symbols, you can also add a JAR onto the project build path using Project Properties > Java Build Path > Libraries > Add External JARs...
If you are getting your JARs through Maven / SBT, they should be in the Ivy cache:
C:\Users\<name>\.ivy2\cache\edu.berkeley.cs\chisel3_2.11\jars
If you are using publish-local with chisel3, your JARs should be in the local Ivy repository:
C:\Users\<name>\.ivy2\local\edu.berkeley.cs\chisel3_2.11\<version>\jars
Note that chisel3 is compiled into one JAR, including the coreMacros and chiselFrontend sub-projects.
Of course, this is a more quick-and-dirty solution compared to something that can parse SBT files.

Cross-compiled with an incompatible version

I am using Eclipse with the m2eclipse-scala plugin. Currently, I get the following error message:
exampleA_2.10-2.0.1.jar of module build path is cross-compiled with an incompatible version of Scala (2.10.0). In case this report is mistaken, this check can be disabled in the compiler preference page
It looks like the versions of extracted Scala and Scala IDE match. I just wanted to make sure that this is a "false-negative" as described here and can be safely turned off.
As @The Archetypal Paul suggested, it was because I was using the wrong Scala library.
If you are using Scala 2.11 (check under About Scala IDE -> Installation Details), you can downgrade by following the instructions here. It's a lot easier than uninstalling and re-installing Scala IDE, as other Stack Overflow posts recommend.
I also faced the same issue.
I was trying to use the casbah JAR in Scala to integrate with MongoDB.
After analyzing the problem, I found that I was trying to use casbah 2.9.1 while my Scala version is 2.11.8.
The root cause of this error is that the JAR was compiled against Scala 2.9.x while you are using Scala 2.11.8.
To resolve it, I used the JAR that is compiled for Scala 2.11:
<dependency>
  <groupId>org.mongodb</groupId>
  <artifactId>casbah-core_2.11</artifactId>
  <version>3.1.1</version>
</dependency>
I was facing a similar issue in the Eclipse IDE, where I had built a Spark Scala project with Maven. The Scala version was set to 2.11.
Later, I upgraded the Scala IDE plugin in Eclipse, after which my project was marked with the error below:
exampleA_2.10-2.0.1.jar of module build path is cross-compiled with an incompatible version of Scala (2.10.0). In case this report is mistaken, this check can be disabled in the compiler preference page
Right-click the project folder > Scala > Set Scala version. Here my Scala version was displayed as 2.10. I selected 2.11, and all the error messages went away.

Eclipse (set up with Scala environment): object apache is not a member of package org

As shown in the image, I get an error when I import the Spark packages. Please help. When I hover over it, it shows "object apache is not a member of package org".
I searched for this error, and the results say that the Spark JARs have not been imported. So I imported "spark-assembly-1.4.1-hadoop2.2.0.jar" too, but I still get the same error. Below is what I actually want to run:
import org.apache.spark.{SparkConf, SparkContext}

object ABC {
  def main(args: Array[String]): Unit = {
    // Scala main method
    println("Spark Configuration")
    val conf = new SparkConf()
    conf.setAppName("My First Spark Scala Application")
    conf.setMaster("spark://ip-10-237-224-94:7077")
    println("Creating Spark Context")
  }
}
Adding the spark-core JAR to your classpath should resolve your issue. Also, if you are using a build tool like Maven or Gradle (if not, you should, because spark-core has a lot of dependencies and you would keep hitting this problem for different JARs), try to use the Eclipse task provided by these tools to set the classpath in your project properly.
I was also receiving the same error; in my case it was a compatibility issue. Spark 2.2.1 is not compatible with Scala 2.12 (it is compatible with 2.11.8), and my IDE was using Scala 2.12.3.
I resolved my error as follows:
1) Import the JAR files from the Spark installation folder. After installing Spark on the C drive, there is a folder named Spark that contains a jars folder. This folder holds all the basic JAR files.
In Eclipse, right-click the project -> Properties -> Java Build Path. Under the Libraries tab there is an option Add External JARs... Select this option, import all the JAR files from the jars folder, and click Apply.
2) Go to Properties again -> Scala Compiler -> Scala Installation -> Latest 2.11 bundle (dynamic)*
*Before selecting this option, check the compatibility of your Spark and Scala versions.
The problem is that Scala is NOT backward compatible at the binary level. Hence each Spark module is compiled against a specific Scala library version. But when we run from Eclipse, there are two Scala versions involved: ONE SCALA VERSION was used to compile and create the Spark dependency JARs that we add to the build path, and a SECOND SCALA VERSION is present as the Eclipse runtime environment. The two may conflict.
This is a hard reality, although we might wish Scala were backward compatible, or at least that a compiled JAR file could be.
Hence, the recommendation is to use Maven or a similar tool with which dependency versions can be managed.
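To see which of the two Scala versions the project actually runs with, a small check like the following can help. It is only a sketch: the object name, the helper binaryVersion, and the expected series "2.11" are hypothetical and should be adjusted to the suffix of your own Spark JARs. The only library call used is scala.util.Properties.versionNumberString, which reports the version of the scala-library on the classpath.

```scala
object ScalaVersionCheck {
  // Reduce a full version such as "2.11.12" to its binary series "2.11".
  def binaryVersion(full: String): String =
    full.split('.').take(2).mkString(".")

  def main(args: Array[String]): Unit = {
    // Version of the scala-library jar that is actually on the classpath.
    val onClasspath = scala.util.Properties.versionNumberString
    // Assumed series the Spark jars were built for (the _2.xx suffix in their file names).
    val expected = "2.11"
    println(s"scala-library on classpath: $onClasspath")
    if (binaryVersion(onClasspath) != expected)
      println(s"Mismatch: the Spark jars expect Scala $expected")
  }
}
```

If the printed series differs from the suffix of the Spark JARs, the Scala installation selected in the project's Scala Compiler settings is the first thing to check.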
If you are doing this in the context of Scala within a Jupyter Notebook, you'll get this error. You have to install the Apache Toree kernel:
https://github.com/apache/incubator-toree
and create your notebooks with that kernel.
You also have to start the Jupyter Notebook with:
pyspark

Scala IDE and Apache Spark -- different scala library version found in the build path

I have some main object:
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._

object Main {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local").setAppName("FakeProjectName")
    )
  }
}
...then I add spark-assembly-1.3.0-hadoop2.4.0.jar to the build path in Eclipse from
Project > Properties... > Java Build Path :
...and this warning appears in the Eclipse console:
More than one scala library found in the build path
(C:/Program Files/Eclipse/Indigo 3.7.2/configuration/org.eclipse.osgi/bundles/246/1/.cp/lib/scala-library.jar,
C:/spark/lib/spark-assembly-1.3.0-hadoop2.4.0.jar).
This is not an optimal configuration, try to limit to one Scala library in the build path.
FakeProjectName Unknown Scala Classpath Problem
Then I remove Scala Library [2.10.2] from the build path, and it still works. Except now this warning appears in the Eclipse console:
The version of scala library found in the build path is different from the one provided by scala IDE:
2.10.4. Expected: 2.10.2. Make sure you know what you are doing.
FakeProjectName Unknown Scala Classpath Problem
Is this a non-issue? Either way, how do I fix it?
This is often a non-issue, especially when the version difference is small, but there are no guarantees...
The problem is (as stated in the warning) that your project has two Scala libraries on the class path. One is explicitly configured as part of the project; this is version 2.10.2 and is shipped with the Scala IDE plugins. The other copy has version 2.10.4 and is included in the Spark jar.
One way to fix the problem is to install a different version of Scala IDE, that ships with 2.10.4. But this is not ideal. As noted here, Scala IDE requires every project to use the same library version:
http://scala-ide.org/docs/current-user-doc/gettingstarted/index.html#choosing-what-version-to-install
A better solution is to clean up the class path by replacing the Spark jar you are using. The one you have is an assembly jar, which means it includes every dependency used in the build that produced it. If you are using sbt or Maven, then you can remove the assembly jar and simply add Spark 1.3.0 and Hadoop 2.4.0 as dependencies of your project. Every other dependency will be pulled in during your build. If you're not using sbt or Maven yet, then perhaps give sbt a spin - it is really easy to set up a build.sbt file with a couple of library dependencies, and sbt has a degree of support for specifying which library version to use.
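A minimal build.sbt along those lines might look like the sketch below. The choice of spark-core and hadoop-client as the artifacts, and 2.10.4 as the Scala version (taken from the warning in the question), are assumptions; adjust them to the modules you actually use.

```scala
// build.sbt (sketch; artifact choices are assumptions based on the versions named above)
scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  // %% resolves spark-core_2.10, matching scalaVersion
  "org.apache.spark" %% "spark-core" % "1.3.0",
  // Hadoop is a Java library, so a single % is used (no Scala suffix)
  "org.apache.hadoop" % "hadoop-client" % "2.4.0"
)
```

The assembly JAR then no longer needs to be on the build path, so only one Scala library remains there.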
The easiest solution:
In Eclipse:
1. Right-click the project and open Properties
2. Go to Scala Compiler
3. Click Use Project Settings
4. Set Scala Installation to a compatible version, generally Fixed Scala Installation 2.XX.X (built-in)
5. Rebuild the project.
There are 2 types of Spark JAR files (you can tell them apart by the name):
- Name includes the word "assembly" and not "core" (has Scala inside)
- Name includes the word "core" and not "assembly" (no Scala inside).
You should include the "core" type in your Build Path via "Add External Jars"
(in the version you need), since the Scala IDE already provides a Scala library for you.
Alternatively, you can take advantage of SBT and add the following
dependency (again, pay attention to the versions you need):
libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.1.0"
In that case you should NOT "forcefully" include any Spark JAR in the Build Path.
Happy sparking:
Zar