How can I ignore scala library while sbt assembly - scala

I am using sbt to build my scala project.
This is my build.sbt:
name := "My Spark App"
version := "1.0"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.2.0" % "provided"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "1.2.0" % "provided"
I am running sbt assembly to create an assembly jar, but I found a scala directory containing scala library class codes.
Is it possible to take scala library as a provided dependency, since the run-time environment already contains scala?

From docs, this might help
assemblyOption in assembly := (assemblyOption in assembly).value.copy(includeScala = false)

Related

Build a universal sbt file in apache spark

Hello i have a sbt file like this:
name := "Myexample"
version := "1.0"
scalaVersion := "2.11.12"
libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-sql" % "2.4.4",
"org.apache.spark" %% "spark-mllib" % "2.4.4"
)
I want to fix a universal sbt file. If other computer has other scala version or spark version can i make a sbt file dynamic, not declaring my own scalaVersion at all, but declaring in every computer their own scalaversion or apache spark version??

Building Spark scala project using Bazel

I want to build a Spark with Scala project using Bazel which has been built using SBT and executed successfully. This is how my build.sbt looks like
name := "ScalaWithSpark" version := "0.1" scalaVersion := "2.11.11"
//libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.0"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.0"
Basically I have gone through all the concepts of Bazel but still its like dots scattered for me I need help in connecting the dots Tq

Unresolved dependency generating jar with SBT

I'm developing a Spark process in Scala (Eclipse IDE) and runs fine in my local cluster, but when I try to compiled it with SBT that I installed on my pc I got a error (see picture).
My first doubt is why SBT try to compile with scala 2.12 if I explicitly set scalaVersion to 2.11.11 in my build.sbt. I tried installing other SBT versions with the same results, also in other PCs but not works. I need help to fix it.
scala_version(Spark) :2.11.11
sbt_version : 1.0.2
spark: 2.2
build.sbt
name := "Comple"
version := "1.0"
organization := "com.antonio.spark"
scalaVersion := "2.11.11"
libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-core" % "2.2.0" % "provided",
"org.apache.spark" %% "spark-sql" % "2.2.0" % "provided"
)
assembly.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "1.0.2")
Error:
ResolveException: unresolved dependency: sbt_assembly;1.0.2: not found
Try changing your assembly.sbt file to:
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.5")
as stated in the documentation here: https://github.com/sbt/sbt-assembly
I recently used that with spark-core_2.11 version 2.2.0 and it worked.

IntelliJ Idea 14: cannot resolve symbol spark

I made a dependency of Spark which worked in my first project. But when I try to make a new project with Spark, my SBT does not import the external jars of org.apache.spark. Therefore IntelliJ Idea gives the error that it "cannot resolve symbol".
I already tried to make a new project from scratch and use auto-import but none works. When I try to compile I get the messages that "object apache is not a member of package org". My build.sbt looks like this:
name := "hello"
version := "1.0"
scalaVersion := "2.11.7"
libraryDependencies += "org.apache.spark" % "spark-parent_2.10" % "1.4.1"
I have the impression that there might be something wrong with my SBT settings, although it already worked one time. And except for the external libraries everything is the same...
I also tried to import the pom.xml file of my spark dependency but that also doesn't work.
Thank you in advance!
This worked for me->
name := "ProjectName"
version := "0.1"
scalaVersion := "2.11.11"
libraryDependencies ++= Seq(
"org.apache.spark" % "spark-core_2.11" % "2.2.0",
"org.apache.spark" % "spark-sql_2.11" % "2.2.0",
"org.apache.spark" % "spark-mllib_2.10" % "1.1.0"
)
I use
scalaVersion := "2.11.7"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.4.1"
in my build.sbt and it works for me.
I had a similar problem. It seems the reason was that the build.sbt file was specifying the wrong version of scala.
If you run spark-shell it'll say at some point the scala version used by Spark, e.g.
Using Scala version 2.11.8
Then I edited the line in the build.sbt file to point to that version and it worked.
Currently spark-cassandra-connector compatible with Scala 2.10 and 2.11.
It worked for me when I updated the scala version of my project like below:
ThisBuild / scalaVersion := "2.11.12"
and I updated my dependency like:
libraryDependencies += "com.datastax.spark" %% "spark-cassandra-connector" % "2.4.0",
If you use "%%", sbt will add your project’s binary Scala version to the artifact name.
From sbt run:
sbt> reload
sbt> compile
Your library dependecy conflicts with with the scala version you're using, you need to use 2.11 for it to work. The correct dependency would be:
scalaVersion := "2.11.7"
libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "1.4.1"
note that you need to change spark_parent to spark_core
name := "SparkLearning"
version := "0.1"
scalaVersion := "2.12.3"
// additional libraries
libraryDependencies += "org.apache.spark" % "spark-streaming_2.10" % "1.4.1"

Cannot run jar file created from Scala file

This the code that I have written in Scala.
object Main extends App {
println("Hello World from Scala!")
}
This is my build.sbt.
name := "hello-world"
version := "1.0"
scalaVersion := "2.11.5"
mainClass := Some("Main")
This is the command that I have run to create the jar file.
sbt package
My problem is that a jar file named hello-world_2.11-1.0.jar has been created at target/scala-2.11. But I cannot run the file. It is giving me an error saying NoClassDefFoundError.
What am I doing wrong?
It also says what class is not found. Most likely you aren't including scala-library.jar. You can run scala target/scala-2.11/hello-world_2.11-1.0.jar if you have Scala 2.11 available from the command line or java -cp "<path to scala-library.jar>:target/scala-2.11/hello-world_2.11-1.0.jar" Main (use ; instead of : on Windows).
The procedure depicted proves valid up to the way the jar file is executed. From target/scala-2.11 try running it with
scala hello-world_2.11-1.0.jar
Check whether it is runnable also from the project root folder with sbt run.
To run the jar file(containing scala code) with multiple main classes use following approach
scala -cp "<jar-file>.jar;<other-dependencies>.jar" com.xyz.abc.TestApp
This command will take care of including scala-library.jar in dependency and will also identify TestApp as main class if it has a def main(args:Array[String]) method. Please note that multiple jar files should be separated by semi-colon(";")
We can use sbt-assembly to package and run the application.
First, create or add the plugin to project/plugins.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.9")
The sample build.sbt looks like below:
name := "coursera"
version := "0.1"
scalaVersion := "2.12.10"
mainClass := Some("Main")
val sparkVersion = "3.0.0-preview2"
val playVersion="2.8.1"
val jacksonVersion="2.10.1"
libraryDependencies ++= Seq(
"org.scala-lang" % "scala-library" % scalaVersion.toString(),
"org.apache.spark" %% "spark-streaming" % sparkVersion,
"org.apache.spark" %% "spark-core" % sparkVersion,
"org.apache.spark" %% "spark-sql" % sparkVersion,
"com.typesafe.play" %% "play-json" % playVersion,
// https://mvnrepository.com/artifact/org.apache.spark/spark-streaming-kafka-0-10
"org.apache.spark" %% "spark-streaming-kafka-0-10" % sparkVersion,
// https://mvnrepository.com/artifact/org.mongodb/casbah
"org.mongodb" %% "casbah" % "3.1.1" pomOnly(),
// https://mvnrepository.com/artifact/com.typesafe/config
"com.typesafe" % "config" % "1.2.1"
)
assemblyMergeStrategy in assembly := {
case PathList("META-INF", xs # _*) => MergeStrategy.discard
case x => MergeStrategy.first
}
From console, we can run sbt assembly and the jar file gets created in target/scala-2.12/ path.
sbt assembly will create a fat jar. Here is an excerpt from the documentation :
sbt-assembly is a sbt plugin originally ported from codahale's assembly-sbt, which I'm guessing was inspired by Maven's assembly plugin. The goal is simple: Create a fat JAR of your project with all of its dependencies.