SparkSession version trouble in Scala

I've read this article and copy-pasted the Sort-Merge Join example, but when I try to build the project I get the following error:
object SparkSession is not a member of package org.apache.spark.sql
I've seen many questions about this error, and the answers said the askers were using an old version of Spark. However, my build.sbt specifies Spark 2.1, the same version used in the example on that website.
Here is my build.sbt:
name := "Simple Project"
version := "1.0"
scalaVersion := "2.11.8"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.1.0"
What am I missing?

The Spark SQL dependency is missing:
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.1.0"
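For reference, a minimal sketch of the full build.sbt with both dependencies (versions copied from the question; adjust them to whatever Spark you actually run):
name := "Simple Project"
version := "1.0"
scalaVersion := "2.11.8"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.1.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.1.0"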

Related

Build a universal sbt file in apache spark

Hello, I have an sbt file like this:
name := "Myexample"
version := "1.0"
scalaVersion := "2.11.12"
libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-sql" % "2.4.4",
"org.apache.spark" %% "spark-mllib" % "2.4.4"
)
I want to make a universal sbt file. If another computer has a different Scala or Spark version, can I make the sbt file dynamic, not declaring my own scalaVersion at all, but letting every computer declare its own Scala or Apache Spark version?
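A minimal sketch of one possible approach, assuming the versions are passed in as JVM system properties (e.g. sbt -Dspark.version=2.4.4 -Dscala.version=2.11.12; the property names are placeholders, not a standard convention), with the question's versions as defaults:
// build.sbt: read versions from -D system properties, falling back to defaults
scalaVersion := sys.props.getOrElse("scala.version", "2.11.12")
val sparkVersion = sys.props.getOrElse("spark.version", "2.4.4")
libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-sql" % sparkVersion,
"org.apache.spark" %% "spark-mllib" % sparkVersion
)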

Building a Spark Scala project using Bazel

I want to use Bazel to build a Spark-with-Scala project that has already been built with SBT and executed successfully. This is what my build.sbt looks like:
name := "ScalaWithSpark" version := "0.1" scalaVersion := "2.11.11"
//libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.0"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.0"
Basically, I have gone through all the Bazel concepts, but they are still like scattered dots for me; I need help connecting the dots. Thanks.

Unresolved dependencies: Spark library

I followed this link : http://blog.miz.space/tutorial/2016/08/30/how-to-integrate-spark-intellij-idea-and-scala-install-setup-ubuntu-windows-mac/
When I try to compile my project with Intellij, sbt is complaining about unresolved dependencies
[warn] ==== public: tried
[warn] https://repo1.maven.org/maven2/org/apache/spark/spark-core/2.1.1/spark-core-2.1.1.pom
[warn] Unresolved dependencies path: org.apache.spark:spark-core:2.1.1
My Scala version is 2.12.2 and my sparkVersion is 2.1.1.
Here is what my build.sbt looks like:
name := "test" version := "1.0" scalaVersion := "2.12.2"
val sparkVersion = "2.1.1"
libraryDependencies ++= Seq("org.apache.spark" % "spark-core" & sparkVersion)
thank you
Your last line should be:
libraryDependencies += "org.apache.spark" % "spark-core_2.10" % sparkVersion
Or
libraryDependencies ++= Seq("org.apache.spark" % "spark-core_2.10" % sparkVersion)
Also, as mentioned by #Nonontb and #Cyrelle, it's better not to use scalaVersion := "2.12.2". Please drop down to a Scala version that Spark supports, for better performance and to avoid unexpected errors.
The following line is from the Spark docs:
For the Scala API, Spark 2.1.1 uses Scala 2.11. You will need to use a compatible Scala version (2.11.x).
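Putting that together, a minimal sketch of a build.sbt that should resolve (the exact 2.11.x patch version is an assumption; any 2.11.x should work):
name := "test"
version := "1.0"
scalaVersion := "2.11.8"
val sparkVersion = "2.1.1"
// %% appends the binary suffix automatically, so this resolves spark-core_2.11 version 2.1.1
libraryDependencies ++= Seq("org.apache.spark" %% "spark-core" % sparkVersion)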

IntelliJ Idea 14: cannot resolve symbol spark

I added a Spark dependency that worked in my first project. But when I try to create a new project with Spark, my SBT does not import the external jars of org.apache.spark. Therefore IntelliJ IDEA gives the error that it "cannot resolve symbol".
I already tried making a new project from scratch and using auto-import, but neither works. When I try to compile, I get the message "object apache is not a member of package org". My build.sbt looks like this:
name := "hello"
version := "1.0"
scalaVersion := "2.11.7"
libraryDependencies += "org.apache.spark" % "spark-parent_2.10" % "1.4.1"
I have the impression that something might be wrong with my SBT settings, although they already worked once. And except for the external libraries, everything is the same...
I also tried to import the pom.xml file of my spark dependency but that also doesn't work.
Thank you in advance!
This worked for me:
name := "ProjectName"
version := "0.1"
scalaVersion := "2.11.11"
libraryDependencies ++= Seq(
"org.apache.spark" % "spark-core_2.11" % "2.2.0",
"org.apache.spark" % "spark-sql_2.11" % "2.2.0",
"org.apache.spark" % "spark-mllib_2.10" % "1.1.0"
)
I use
scalaVersion := "2.11.7"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.4.1"
in my build.sbt and it works for me.
I had a similar problem. It seems the reason was that the build.sbt file was specifying the wrong version of scala.
If you run spark-shell, at some point it will print the Scala version used by Spark, e.g.
Using Scala version 2.11.8
Then I edited the line in the build.sbt file to point to that version and it worked.
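For example, a minimal sketch of that edit (the Spark version here is an assumption; keep whatever your build already uses):
scalaVersion := "2.11.8" // match the version reported by spark-shell
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.0"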
Currently, spark-cassandra-connector is compatible with Scala 2.10 and 2.11.
It worked for me when I updated the scala version of my project like below:
ThisBuild / scalaVersion := "2.11.12"
and I updated my dependency like:
libraryDependencies += "com.datastax.spark" %% "spark-cassandra-connector" % "2.4.0",
If you use "%%", sbt will add your project’s binary Scala version to the artifact name.
From sbt run:
sbt> reload
sbt> compile
Your library dependency conflicts with the Scala version you're using; you need to use 2.11 for it to work. The correct dependency would be:
scalaVersion := "2.11.7"
libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "1.4.1"
Note that you need to change spark-parent to spark-core.
name := "SparkLearning"
version := "0.1"
scalaVersion := "2.12.3"
// additional libraries
libraryDependencies += "org.apache.spark" % "spark-streaming_2.10" % "1.4.1"

Build issues for Apache Spark

I have been having an sbt dependency issue when I try to build my Apache Spark project. I have Apache Spark 1.3.1.
My .sbt file is this:
name := "Transaction"
version := "1.0"
scalaVersion := "2.10.4"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.3.1"
resolvers ++= Seq(
"Akka Repository" at "http://repo.akka.io/releases/",
"Spray Repository" at "http://repo.spray.cc/")
And I keep getting this error:
[error] (*:update) sbt.ResolveException: unresolved dependency: org.apache.spark#spark-core_2.10;1.3.1: not found
I have looked all over and this seems to be a persistent issue, but no one has really solved it.
Thanks for your help!
I used
"org.apache.spark" % "spark-core_2.10" % "1.3.1"
instead of
"org.apache.spark" %% "spark-core" % "1.3.1"
and it worked!
EDIT:
However, I was able to get the latter statement to work after explicitly setting my scalaVersion to 2.10, like so:
scalaVersion := "2.10"
Probably because it tries to look for a specific 2.10.4 jar which doesn't exist.
I actually figured it out. You just need to put "provided" after the Spark version. (The "provided" scope keeps the dependency on the compile classpath but assumes the Spark jars will be supplied at runtime, e.g. by spark-submit.)
name := "Transaction Fraud"
version := "1.0"
scalaVersion := "2.10.4"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.3.1" % "provided"
resolvers ++= Seq(
"Akka Repository" at "http://repo.akka.io/releases/",
"Spray Repository" at "http://repo.spray.cc/")