I am new to Scala and DataStax. I cloned an sbt/Java DataStax object mapper example, but I can't seem to make it work using Maven and Scala when I issue a maven compile. Here's my repo: https://github.com/gin-domeng/scala-datastax.
Related
I want to install Spark on a machine and I don't know if I need Scala as a dependency.
I have installed Java (1.8) as the documentation says, but I don't know if I need any other dependencies.
Scala is included in Spark's distribution, so you don't need to install Scala to use the Spark REPL; the Scala jars ship with Spark.
If you are setting up an IDE for a Spark project, then you will need Scala as a dependency.
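For example (a minimal sketch, assuming sbt; the version numbers are only illustrative, so pick the ones matching your Spark distribution), the scalaVersion setting is what pulls the Scala library onto the project's classpath for the IDE:

// build.sbt -- minimal sketch, versions are examples
name := "spark-example"
scalaVersion := "2.11.8"   // sbt adds the matching scala-library automatically

// "provided" because the Spark distribution already ships these jars
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.1" % "provided"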
I have created a Spark Scala (version 2.11) application and am building it with Maven (version 3) in IntelliJ. The first time, I was able to compile and build the jar with Maven successfully, and to test the Spark application on the cluster using that jar. Then I modified some of the existing Scala class code and built again; the code compiled and the jar was generated without any issues, but there are no Scala classes in the latest jar file. I would like to know why the Maven build is not generating the class files. Can you please let me know what the problem could be and how I can fix it?
The easiest way to build Scala applications for Spark is to use SBT and a fat-jar plugin. The details are already described here:
How to build an Uber JAR (Fat JAR) using SBT within IntelliJ IDEA?
Just don't forget to exclude the Spark jars from the fat jar by marking them as "provided".
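A minimal sketch of that setup (version numbers are only examples):

// project/plugins.sbt -- brings in the sbt-assembly plugin
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")

// build.sbt -- Spark marked "provided" so it stays out of the fat jar
scalaVersion := "2.11.8"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.1" % "provided"

Running sbt assembly should then produce the fat jar under target/scala-2.11/.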
On Eclipse, while setting up Spark, even after adding the external jars under spark-2.4.3-bin-hadoop2.7/jars/<_all.jar> to the build path,
the compiler complains about "object apache is not a member of package org".
Yes, building the dependencies via Maven or SBT would fix it. A similar question has been asked:
scalac compile yields "object apache is not a member of package org"
But the question here is: WHY does the traditional way fail like this?
If we refer to Scala/Spark version compatibility, we can see a similar issue. The problem is that Scala is NOT backward compatible, so each Spark module is compiled against a specific Scala library. But when we run from Eclipse, the Eclipse Scala environment may not be compatible with the particular Scala version the Spark libraries we set up were compiled against.
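In sbt terms the mismatch looks roughly like this (a sketch only; the version numbers are examples): the Scala binary version is baked into Spark's artifact names, so it has to line up with the project's own Scala version.

// Spark artifacts carry the Scala binary version in their names, e.g. spark-core_2.11
scalaVersion := "2.12.3"                                                 // project compiled with Scala 2.12
libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.2.1"  // jar built against Scala 2.11

// Using %% lets sbt append the suffix that matches scalaVersion, avoiding the mismatch:
// libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.1"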
I'm very new to Scala and I tried to run a Scala project under Eclipse. I used sbt to create one, then ran sbteclipse to prepare it for Eclipse and imported it successfully. However, when I try to run it I get
Error: Unable to initialize main class Main
Caused by: java.lang.NoClassDefFoundError: scala/Function0
error. Scala, sbt and Java are installed, because when I run the same project from the console using sbt, it works. What am I missing?
Thanks for any help!
It looks like the classpath of your Eclipse project is incomplete: it's missing the Scala library. Can you double check in Project Settings that the scala library is present?
If all you want to do is try a simple program, a simpler solution is to create a new Scala Project using the Eclipse wizard.
As shown in the image, I get an error when importing the Spark packages. Please help. When I hover there, it shows "object apache is not a member of package org".
I searched for this error; it says the Spark jars have not been imported. So I imported "spark-assembly-1.4.1-hadoop2.2.0.jar" too, but I still get the same error. Below is what I actually want to run:
import org.apache.spark.{SparkConf, SparkContext}

object ABC {
  def main(args: Array[String]) {
    // Scala main method
    println("Spark Configuration")
    val conf = new SparkConf()
    conf.setAppName("My First Spark Scala Application")
    conf.setMaster("spark://ip-10-237-224-94:7077")
    println("Creating Spark Context")
  }
}
Adding the spark-core jar to your classpath should resolve your issue. Also, if you are not already using a build tool like Maven or Gradle, you should, because spark-core has many dependencies and you will keep hitting this kind of problem for other jars. Try using the Eclipse task provided by these tools to set up the classpath in your project properly.
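As a sketch of that approach with sbt (the answer mentions Maven and Gradle; sbteclipse, mentioned elsewhere on this page, plays the analogous role for sbt; versions are examples):

// build.sbt -- spark-core and its many transitive dependencies are resolved by the build tool
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.1" % "provided"

// project/plugins.sbt -- sbteclipse generates the .classpath/.project files for Eclipse
addSbtPlugin("com.typesafe.sbteclipse" % "sbteclipse-plugin" % "5.2.4")

Then sbt eclipse regenerates the Eclipse classpath from the declared dependencies.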
I was also receiving the same error; in my case it was a compatibility issue. Spark 2.2.1 is not compatible with Scala 2.12 (it is compatible with 2.11.8), and my IDE was using Scala 2.12.3.
I resolved my error by:
1) Importing the jar files from Spark's base folder. When Spark is installed (on the C drive, in my case) there is a Spark folder containing a jars folder, which holds all the basic jar files.
Go to Eclipse, right-click on the project -> Properties -> Java Build Path. Under the Libraries tab, select the "Add External JARs..." option, import all the jar files from the jars folder, and click Apply.
2) Again go to Properties -> Scala Compiler -> Scala Installation -> Latest 2.11 bundle (dynamic)*
*Before selecting this option, check the compatibility of your Spark and Scala versions.
The problem is that Scala is NOT backward compatible, so each Spark module is compiled against a specific Scala library. But when we run from Eclipse, there is one Scala version that was used to compile the Spark dependency jars we add to the build path, and a second Scala version in the Eclipse runtime environment. The two may conflict.
This is a hard reality, although we might wish Scala were backward compatible, or at least that a compiled jar file could be backward compatible.
Hence, the recommendation is to use Maven or a similar tool where the dependency versions can be managed.
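A sketch of what that management looks like with sbt (version numbers are illustrative only): declare the versions once so Spark and Scala stay in lock-step.

// build.sbt -- one place to manage the versions that have to agree
val sparkVersion = "2.2.1"
scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion % "provided",
  "org.apache.spark" %% "spark-sql"  % sparkVersion % "provided"
)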
If you are doing this in the context of Scala within a Jupyter Notebook, you'll get this error. You have to install the Apache Toree kernel:
https://github.com/apache/incubator-toree
and create your notebooks with that kernel.
You also have to start the Jupyter Notebook with:
pyspark