Scala IDE and Apache Spark: different Scala library version found in the build path (Eclipse)

I have some main object:
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._

object Main {
  def main(args: Array[String]) {
    val sc = new SparkContext(
      new SparkConf().setMaster("local").setAppName("FakeProjectName")
    )
  }
}
...then I add spark-assembly-1.3.0-hadoop2.4.0.jar to the build path in Eclipse from
Project > Properties... > Java Build Path :
...and this warning appears in the Eclipse console:
More than one scala library found in the build path
(C:/Program Files/Eclipse/Indigo 3.7.2/configuration/org.eclipse.osgi/bundles/246/1/.cp/lib/scala-library.jar,
C:/spark/lib/spark-assembly-1.3.0-hadoop2.4.0.jar).
This is not an optimal configuration, try to limit to one Scala library in the build path.
FakeProjectName Unknown Scala Classpath Problem
Then I remove Scala Library [2.10.2] from the build path, and it still works. Except now this warning appears in the Eclipse console:
The version of scala library found in the build path is different from the one provided by scala IDE:
2.10.4. Expected: 2.10.2. Make sure you know what you are doing.
FakeProjectName Unknown Scala Classpath Problem
Is this a non-issue? Either way, how do I fix it?

This is often a non-issue, especially when the version difference is small, but there are no guarantees...
The problem is (as stated in the warning) that your project has two Scala libraries on the class path. One is explicitly configured as part of the project; this is version 2.10.2 and is shipped with the Scala IDE plugins. The other copy has version 2.10.4 and is included in the Spark jar.
One way to fix the problem is to install a different version of Scala IDE, one that ships with 2.10.4. But this is not ideal. As noted here, Scala IDE requires every project to use the same library version:
http://scala-ide.org/docs/current-user-doc/gettingstarted/index.html#choosing-what-version-to-install
A better solution is to clean up the class path by replacing the Spark jar you are using. The one you have is an assembly jar, which means it includes every dependency used in the build that produced it. If you are using sbt or Maven, then you can remove the assembly jar and simply add Spark 1.3.0 and Hadoop 2.4.0 as dependencies of your project. Every other dependency will be pulled in during your build. If you're not using sbt or Maven yet, then perhaps give sbt a spin - it is really easy to set up a build.sbt file with a couple of library dependencies, and sbt has a degree of support for specifying which library version to use.
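For example, a minimal build.sbt along those lines might look like this (the artifact coordinates are the standard Spark and Hadoop ones, and the versions are taken from the assembly jar above, so adjust both to your setup):

name := "FakeProjectName"

scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.3.0",
  "org.apache.hadoop" % "hadoop-client" % "2.4.0"
)

sbt then pulls in every transitive dependency, so the assembly jar is no longer needed on the Build Path.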

The easiest solution:
In Eclipse:
1. Project > (right-click) Properties
2. Go to Scala Compiler
3. Click Use Project Settings
4. Set Scala Installation to a compatible version, generally Fixed Scala Installation: 2.XX.X (built-in)
5. Rebuild the project.

There are two types of Spark JAR files (judging just by the name):
- the name includes the word "assembly" and not "core" (Scala is included inside)
- the name includes the word "core" and not "assembly" (no Scala inside)
You should include the "core" type in your Build Path via "Add External JARs"
(the version you need), since the Scala IDE already provides a Scala library for you.
Alternatively, you can just take advantage of SBT and add the following
dependency (again, pay attention to the versions you need):
libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.1.0"
Then you should NOT forcefully include any Spark JAR in the Build Path.
Happy sparking!
Zar

Related

Chisel: Compiling Chisel library on Windows

I have been using sbt on Windows with a custom build.sbt script, together with an import Chisel._ in the top-level file, to generate Verilog from my Chisel source successfully.
I'm trying to get an IDE working on Windows to expedite Chisel development. I've gone with the Eclipse-based Scala IDE http://scala-ide.org/download/sdk.html/
I want to compile the Chisel library so that the import Chisel._ can be resolved locally, without having to go off and download the source from the repository and recompile it each time. When I download the Chisel-master repo from Git and include the src\main folder in my Scala project in the Scala IDE, I get lots of syntax errors in the Chisel Scala files that prevent me from building the project.
Has anyone done anything like this before on Windows, or does anyone have any knowledge of working with the Scala IDE? It may just be a case of undefined symbols in the project configuration.
Not sure exactly what you did with build.sbt with respect to recompiling (I think it downloads the source only the first time, then caches it for the future). But I'm using Scala IDE for Chisel on Linux with the default build.sbt files; maybe you can try to get that working out of the box first, to help narrow down the issue.
Here are the steps I took to get Scala IDE working with Chisel:
The latest Scala IDE uses 2.11.8, while the current Chisel repository defaults to 2.11.7, so I had to change every scalaVersion reference in the build.sbt files from 2.11.7 to 2.11.8.
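For reference, the line to change in each build.sbt looks like this after the bump:

scalaVersion := "2.11.8"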
I used sbteclipse
https://github.com/typesafehub/sbteclipse
to create an importable workspace and set up the compilation dependencies.
The exception was chiselFrontEnd: for some reason this package is not added to the dependencies, so I had to add chiselFrontEnd manually as a Java Build Path dependency (Properties > Java Build Path, under Projects) for my own projects.
To resolve undefined symbols, you can also add a JAR onto the project build path using Project Properties > Java Build Path > Libraries > Add External JARs...
If you are getting your JARs through Maven / SBT, they should be in:
C:\Users\<name>\.ivy2\local\edu.berkeley.cs\chisel3_2.11\jars
If you are using publish-local with chisel3, your JARs should be in
C:\Users\<name>\.ivy2\cache\edu.berkeley.cs\chisel3_2.11\jars
Note that chisel3 is compiled into one JAR, including coreMacros and chiselFrontend sub-projects
Of course, this is a quick-and-dirty solution compared to something that can parse SBT files.
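If you do go the sbt route, after publish-local your own project can depend on the locally published artifact. A sketch in sbt syntax; the version string here is an assumption, so check what publish-local actually wrote under .ivy2:

libraryDependencies += "edu.berkeley.cs" %% "chisel3" % "3.0-SNAPSHOT"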

eclipse (set up with the Scala environment): object apache is not a member of package org

As shown in the image, it gives an error when I import the Spark packages; hovering over it shows "object apache is not a member of package org". Please help.
I searched for this error and found that it means the Spark jars have not been imported, so I imported "spark-assembly-1.4.1-hadoop2.2.0.jar" too. But the error is still the same. Below is what I actually want to run:
import org.apache.spark.{SparkConf, SparkContext}

object ABC {
  def main(args: Array[String]) {
    // Scala main method
    println("Spark Configuration")
    val conf = new SparkConf()
    conf.setAppName("My First Spark Scala Application")
    conf.setMaster("spark://ip-10-237-224-94:7077")
    println("Creating Spark Context")
  }
}
Adding the spark-core jar to your classpath should resolve your issue. Also, if you are using a build tool like Maven or Gradle (and if not, you should, because spark-core has a lot of dependencies and you would keep hitting this problem with different jars), use the Eclipse task provided by these tools to set up the classpath in your project properly.
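For example, in sbt syntax (the version here matches the assembly jar from the question and is only an assumption about your actual cluster):

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.4.1"

Regenerating the Eclipse project metadata afterwards (e.g. with the sbteclipse plugin's sbt eclipse task, shown in the Akka answer further down) keeps the IDE classpath in sync with the build.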
I was also receiving the same error; in my case it was a compatibility issue. Spark 2.2.1 is not compatible with Scala 2.12 (it is compatible with 2.11.8), and my IDE was set up with Scala 2.12.3.
I resolved my error by
1) Importing the jar files from the base folder of the Spark installation. During the installation of Spark on the C drive we get a folder named Spark, which contains a jars folder with all the basic jar files.
In Eclipse, right-click on the project -> Properties -> Java Build Path. Under the Libraries tab, choose Add External JARs..., select all the jar files from the jars folder, and click Apply.
2) Then go to Properties -> Scala Compiler -> Scala Installation and select Latest 2.11 bundle (dynamic)*
*Before selecting this option, check the compatibility of Spark and Scala.
The problem is that Scala is NOT backward compatible, so each Spark module is compiled against a specific Scala library. When we run from Eclipse, one Scala version was used to compile the Spark dependency jar we add to the build path, and a second Scala version is present as the Eclipse runtime environment. The two may conflict.
This is a hard reality, although we might wish Scala were backward compatible, or at least that a compiled jar file could be.
Hence the recommendation: use Maven or a similar tool where dependency versions can be managed, as sketched below.
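To see how managed dependencies keep the versions in sync, here is a small sbt sketch (the versions match the Spark/Scala pairing mentioned above):

scalaVersion := "2.11.8"

// %% appends the Scala binary version to the artifact name,
// so this resolves to spark-core_2.11 and cannot silently
// drift away from the scalaVersion above
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.1"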
If you are doing this in the context of Scala within a Jupyter Notebook, you'll get this error. You have to install the Apache Toree kernel:
https://github.com/apache/incubator-toree
and create your notebooks with that kernel.
You also have to start the Jupyter Notebook with:
pyspark
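For completeness: Toree itself is typically installed with pip (pip install toree) followed by jupyter toree install --spark_home=/path/to/spark, though the exact flags are an assumption here, so check the Toree README.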

Using util.parsing in Scala 2.11

I have written a Scala program with the Eclipse Scala IDE that uses scala.util.parsing.JSON, and I would like to make it support Scala 2.11. In Scala 2.11 I get an error:
error: object parsing is not a member of package util.
I have found out that the parsing library is no longer in the util package by default, but has to be downloaded separately here.
I have downloaded that and tried to add it to my Eclipse project as a new source folder, but that didn't work out. The instructions are only for adding it to sbt, but I don't think that is relevant to me if I want to just use it in Eclipse.
Should I try to find a JAR file somewhere?
Should I try to find a JAR file somewhere?
Yes, you should. :)
And specifically, you should use this one (in SBT syntax):
libraryDependencies += "org.scala-lang.modules" %% "scala-parser-combinators" % "1.0.2"
The above line should be all you need to add to build.sbt if you're using SBT. If you want to manually add it to your project by downloading it, you can find it on Maven Central.
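Once the dependency is on the classpath, the old API is available again under scala.util.parsing.json (the json package still ships inside scala-parser-combinators 1.0.x). A minimal sketch:

import scala.util.parsing.json.JSON

object JsonDemo extends App {
  // parseFull returns Option[Any]: Some(...) on success, None on a parse error
  val parsed = JSON.parseFull("""{"name": "scala", "version": 2.11}""")
  println(parsed) // Some(Map(name -> scala, version -> 2.11))
}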
The scala-parser-combinators library was removed from the standard distribution in 2.11 so that people who don't use it don't have to pay a price for having it in the Scala runtime library. Consequently, people who do want to use it now have to include it explicitly in their build. Note that the XML library was similarly moved out into its own module in 2.11.

Is the Akka Actors library installed with the Scala IDE for Scala 2.10?

I have recently begun exploring Scala, and have started by installing the Scala IDE in my copy of Eclipse (Indigo). I initially installed the Scala IDE for Scala 2.9, but then noticed that there was a newer release available for Scala 2.10. Installing the newer plug-in over the older one seems to have worked, but...
Scala 2.10 has deprecated the older Scala Actors in favor of Akka Actors. Thus I'm trying to add an import to my toy Scala project:
import akka.actor.Actor
This is flagged in the IDE with the error
not found: object akka
When I look at my Scala project's properties, I indeed do not see any of the akka-* jar files that are mentioned in the Akka documentation.
Do they need to be downloaded and installed separately, even though the Scala IDE plug-in installed the rest of Scala 2.10? Or have package names changed as part of integrating Akka actors in place of the older Scala Actors? (The documentation doesn't say so, but the Scala 2.10 release is fairly recent...)
No, they aren't packaged together.
The easiest way to make sure the Eclipse IDE can see your dependencies (Akka, and anything else referenced in your build.sbt file) is to let sbt do it using the sbteclipse plugin. Here are the instructions I wrote up for co-workers:
Install the "sbteclipse" plugin
This plugin will allow sbt to add the files/references that Eclipse needs to find all the dependencies you specify in your build.sbt. Otherwise, you will be able to use the IDE, but you will see all kinds of "object not found" errors.
Just make sure the plugin is added in your global plugins.sbt file. This file (and its path) may not exist, so you may need to create it at the following location:
~/.sbt $ cd ~/.sbt/0.13/
~/.sbt/0.13 $ mkdir plugins
Edit/create the plugins.sbt file:
~/.sbt/0.13 $ vi plugins/plugins.sbt
then add this line (it may be the only line in the file):
addSbtPlugin("com.typesafe.sbteclipse" % "sbteclipse-plugin" % "2.5.0")
Running sbteclipse
To use this, just navigate to a Scala project on the command line and run the following. If you already had Eclipse open, go ahead and restart it.
/sites/ewuser (master)$ sbt eclipse
References:
How to initialize a new Scala project in sbt, Eclipse and github
Official sbteclipse plugin
The Akka artifacts are not bundled with the Scala IDE (yet), you will have to add “akka-actor_2.10” and friends to your project’s dependencies.
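In sbt syntax that is a single line (2.1.0 was the Akka release targeting Scala 2.10 at the time; adjust the version as needed):

libraryDependencies += "com.typesafe.akka" %% "akka-actor" % "2.1.0"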
Download Akka for Eclipse from the location below:
http://downloads.typesafe.com/akka/akka_2.11-2.4.1.zip?_ga=1.167921254.618585520.1450199987
Extract the zip and add the dependencies from its lib folder to your project.

I can't use sbt.Process inside /src?

I'm currently using sbt to build and run my Scala programs. I'm trying to use sbt.Process to execute system commands, but I must be missing something, because when I try to import sbt.Process in one of my files in src/ I get this error:
not found: value sbt
[error] import sbt.Process._
So it looks like I can't access the sbt package inside my src/ files. What do I need to do to access it? Thanks.
SBT's environment (v 0.7.x) is only available in your build file or a plugin.
The easiest way to use the sbt.Process library (until 0.9.x, which will have Process as an independent library) is to copy Process.scala and ProcessImpl.scala (BSD license) into your project.
There are different classpaths for running sbt and for compiling your source files.
One classpath is for compiling the files in the project/build directory (that one contains the sbt jars and usually Scala library 2.7.7), and the other is for building the source files of your project (that one contains your dependencies from lib and lib_managed and usually Scala library 2.8.*). If you'd like to use sbt.Process in your source files, you can do one of two things:
- add the sbt jar to lib or lib_managed so it is available on your project's classpath
- use a snapshot version of Scala 2.9, which has sbt's Process built in as the sys.process package
Wait for Scala 2.9, and then just use it out of scala.sys.process.
The sbt Process package became an integral part of the Scala standard library as of version 2.9:
...this API has been included in the Scala standard library for version 2.9.
quoted from the sbt wiki. Here's the link (scroll down).
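A minimal sketch of the standard-library replacement (Scala 2.9+), with no sbt import needed:

import scala.sys.process._

object ProcessDemo extends App {
  // ! runs the command and returns its exit code
  val code = Seq("echo", "hello").!
  // !! runs the command and returns its standard output as a String
  val out = Seq("echo", "world").!!
  println("exit code: " + code + ", output: " + out)
}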
Well, in order to use it (assuming you are using sbt for your build), all you have to do is add the following line to your build.sbt file:
sbtPlugin := true
It will add the needed dependencies to your project.
Of course, this solution only gets your imports of the sbt package to work; you should refactor your code to use the new scala.sys.process package, as Daniel C. Sobral suggested.