How to add dependency files to Scala? - Eclipse

I'm new to Scala and Spark and started writing a simple Apache Spark program in Scala IDE (in Eclipse). I added the dependency jar files to my project as I usually do in my Java projects, but it can't recognize them and gives me the following error message: object apache is not a member of package org. How should I add the dependency jar files?
The jar files I'm adding are the ones that exist under the 'lib' directory where Spark is installed.

For Scala you use sbt as a dependency manager and build tool.
More information on how to set it up is here:
http://www.scala-sbt.org/release/tutorial/Setup.html
Your build file will then look something like this:
name := "Test"
version := "1.0"
scalaVersion := "2.10.4"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "1.3.0"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.3.0"

Related

Spark with IntelliJ or Eclipse

I am trying to set up IntelliJ for Spark with Scala 2.11, but it is very daunting, and after days I have not been able to compile even a simple statement such as spark.read.format, which is not found in the main Spark core and SQL libraries.
I have seen a few posts on the subject, but none of them resolved it. Does anyone have some experience with, perhaps, a working sample program I can start with?
Could it be that it would be easier with Eclipse?
Many thanks in advance for your answers,
EZ
Build the project in IntelliJ with Scala 2.11 and sbt 0.13, then make sure your project/plugins.sbt contains the following:
logLevel := Level.Warn
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.13.0")
Then your build.sbt must contain the following:
scalaVersion := "2.11.8"
val sparkVersion = "2.1.0"
libraryDependencies += "org.apache.spark" %% "spark-core" % sparkVersion %"provided"
libraryDependencies += "org.apache.spark" %% "spark-sql" % sparkVersion %"provided"
Then write your code, open the Terminal tab in IntelliJ, and run sbt assembly. You can ship the resulting jar to a remote cluster, or run it locally from IntelliJ. Let me know how it goes.
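As a starting point, a minimal program using spark.read.format could look like the sketch below (the object name and the data.csv path are placeholders; note that with the % "provided" scope above, the Spark jars are not on the runtime classpath when running from IntelliJ, so you may need to drop "provided" for local runs or submit the assembled jar via spark-submit):
import org.apache.spark.sql.SparkSession

object TestApp {
  def main(args: Array[String]): Unit = {
    // local[*] is only for running from the IDE; on a cluster the master comes from spark-submit
    val spark = SparkSession.builder()
      .appName("TestApp")
      .master("local[*]")
      .getOrCreate()

    // spark.read.format lives in spark-sql, so this compiles only if that dependency resolves
    val df = spark.read.format("csv").option("header", "true").load("data.csv")
    df.show()

    spark.stop()
  }
}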

sryza/spark-timeseries: NoSuchMethodError: scala.runtime.IntRef.create(I)Lscala/runtime/IntRef;

I have a Scala project that I build with sbt. It uses the sryza/spark-timeseries library.
I am trying to run the following simple code:
val tsAirPassengers = new DenseVector(Array(
112.0,118.0,132.0,129.0,121.0,135.0,148.0,148.0,136.0,119.0,104.0,118.0,115.0,126.0,
141.0,135.0,125.0,149.0,170.0,170.0,158.0,133.0,114.0,140.0,145.0,150.0,178.0,163.0,
172.0,178.0,199.0,199.0,184.0,162.0,146.0,166.0,171.0,180.0,193.0,181.0,183.0,218.0,
230.0,242.0,209.0,191.0,172.0,194.0,196.0,196.0,236.0,235.0,229.0,243.0,264.0,272.0,
237.0,211.0,180.0,201.0,204.0,188.0,235.0,227.0,234.0,264.0,302.0,293.0,259.0,229.0,
203.0,229.0,242.0,233.0,267.0,269.0,270.0,315.0,364.0,347.0,312.0,274.0,237.0,278.0,
284.0,277.0,317.0,313.0,318.0,374.0,413.0,405.0,355.0,306.0,271.0,306.0,315.0,301.0,
356.0,348.0,355.0,422.0,465.0,467.0,404.0,347.0,305.0,336.0,340.0,318.0,362.0,348.0,
363.0,435.0,491.0,505.0,404.0,359.0,310.0,337.0,360.0,342.0,406.0,396.0,420.0,472.0,
548.0,559.0,463.0,407.0,362.0,405.0,417.0,391.0,419.0,461.0,472.0,535.0,622.0,606.0,
508.0,461.0,390.0,432.0
))
val period = 12
val model = HoltWinters.fitModel(tsAirPassengers, period, "additive", "BOBYQA")
It builds fine, but when I try to run it, I get this error:
Exception in thread "main" java.lang.NoSuchMethodError: scala.runtime.IntRef.create(I)Lscala/runtime/IntRef;
at com.cloudera.sparkts.models.HoltWintersModel.convolve(HoltWinters.scala:252)
at com.cloudera.sparkts.models.HoltWintersModel.initHoltWinters(HoltWinters.scala:277)
at com.cloudera.sparkts.models.HoltWintersModel.getHoltWintersComponents(HoltWinters.scala:190)
.
.
.
The error occurs on this line:
val model = HoltWinters.fitModel(tsAirPassengers, period, "additive", "BOBYQA")
My build.sbt includes:
name := "acme-project"
version := "0.0.1"
scalaVersion := "2.10.5"
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-hive" % "1.6.0",
  "net.liftweb" %% "lift-json" % "2.5+",
  "com.github.seratch" %% "awscala" % "0.3.+",
  "org.apache.spark" % "spark-mllib_2.10" % "1.6.2"
)
I have placed sparkts-0.4.0-SNAPSHOT.jar in the lib folder of my project. (I would have preferred to add a libraryDependency, but spark-ts does not appear to be on Maven Central.)
What is causing this run-time error?
The library requires Scala 2.11, not 2.10, and Spark 2.0, not 1.6.2, as you can see from
<scala.minor.version>2.11</scala.minor.version>
<scala.complete.version>${scala.minor.version}.8</scala.complete.version>
<spark.version>2.0.0</spark.version>
in its pom.xml. You can try changing these and see if it still compiles, look for an older version of sparkts that is compatible with your Scala and Spark versions, or update your project's Scala and Spark versions (in that case, don't forget to change spark-mllib_2.10 as well).
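If you take the upgrade route, the relevant parts of build.sbt would end up looking roughly like this (a sketch only; the lift-json and awscala versions are left as in the question and may themselves need bumping to get Scala 2.11 builds):
scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
  "org.apache.spark"   %% "spark-hive"  % "2.0.0",
  "net.liftweb"        %% "lift-json"   % "2.5+",
  "com.github.seratch" %% "awscala"     % "0.3.+",
  "org.apache.spark"   %% "spark-mllib" % "2.0.0"  // %% replaces the hard-coded _2.10 suffix
)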
Also, if you put the jar into the lib folder, you have to put its dependencies there as well (and their dependencies, and so on), or declare them in libraryDependencies. Instead, publish sparkts into your local repository using mvn install (IIRC) and add it to libraryDependencies, which will let sbt resolve its dependencies for you.
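For example, after building spark-timeseries locally with mvn install, something along these lines should let sbt pick it up together with its transitive dependencies (the group/artifact coordinates are taken from the sparkts pom and may differ for your snapshot):
resolvers += Resolver.mavenLocal

libraryDependencies += "com.cloudera.sparkts" % "sparkts" % "0.4.0-SNAPSHOT"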

How to add Java dependencies to a Scala project's sbt file

I have a Spark Streaming Scala project which uses the Apache NiFi receiver. The project runs fine under Eclipse/Scala IDE, and now I want to package it for deployment.
When I add it as
libraryDependencies += "org.apache.nifi" %% "nifi-spark-receiver" % "0.3.0"
sbt assumes it's a Scala library and tries to resolve it.
How do I add the NiFi receiver and all its dependencies to the project's sbt file?
Also, is it possible to point dependencies to local directories instead of having sbt try to resolve them?
Thanks in advance.
Here is my sbt file contents:
name := "NiFi Spark Test"
version := "1.0"
scalaVersion := "2.10.5"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.5.2" % "provided"
libraryDependencies += "org.apache.nifi" %% "nifi-spark-receiver" % "0.3.0"
libraryDependencies += "org.apache.nifi" %% "nifi-spark-receiver" % "0.3.0"
The double %% adds the Scala version as a suffix to the Maven artifact name. It is required because different Scala compiler versions produce incompatible bytecode. If you want to use a Java library from Maven, you should use a single % character:
libraryDependencies += "org.apache.nifi" % "nifi-spark-receiver" % "0.3.0"
I also found that I can put libraries the project depends on into the lib folder and they will be picked up during assembly.

Cannot run jar file created from Scala file

This is the code that I have written in Scala.
object Main extends App {
  println("Hello World from Scala!")
}
This is my build.sbt.
name := "hello-world"
version := "1.0"
scalaVersion := "2.11.5"
mainClass := Some("Main")
This is the command that I have run to create the jar file.
sbt package
My problem is that a jar file named hello-world_2.11-1.0.jar has been created at target/scala-2.11. But I cannot run the file. It is giving me an error saying NoClassDefFoundError.
What am I doing wrong?
It also says what class is not found. Most likely you aren't including scala-library.jar. You can run scala target/scala-2.11/hello-world_2.11-1.0.jar if you have Scala 2.11 available from the command line or java -cp "<path to scala-library.jar>:target/scala-2.11/hello-world_2.11-1.0.jar" Main (use ; instead of : on Windows).
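For example, on Linux or macOS the two options would look roughly like this (the scala-library.jar path below assumes sbt's Ivy cache; adjust it to wherever your Scala 2.11 installation keeps the jar):
scala target/scala-2.11/hello-world_2.11-1.0.jar

java -cp ~/.ivy2/cache/org.scala-lang/scala-library/jars/scala-library-2.11.5.jar:target/scala-2.11/hello-world_2.11-1.0.jar Main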
The procedure you describe is valid right up to the way the jar file is executed. From target/scala-2.11, try running it with
scala hello-world_2.11-1.0.jar
Check whether it is runnable also from the project root folder with sbt run.
To run a jar file (containing Scala code) with multiple main classes, use the following approach:
scala -cp "<jar-file>.jar;<other-dependencies>.jar" com.xyz.abc.TestApp
This command takes care of including scala-library.jar on the classpath and identifies TestApp as the main class as long as it has a def main(args: Array[String]) method. Note that multiple jar files are separated by a semicolon (";") on Windows, as shown above, and by a colon (":") elsewhere.
We can use sbt-assembly to package and run the application.
First, create or add the plugin to project/plugins.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.9")
A sample build.sbt looks like this:
name := "coursera"
version := "0.1"
scalaVersion := "2.12.10"
mainClass := Some("Main")
val sparkVersion = "3.0.0-preview2"
val playVersion="2.8.1"
val jacksonVersion="2.10.1"
libraryDependencies ++= Seq(
  "org.scala-lang" % "scala-library" % scalaVersion.value,
  "org.apache.spark" %% "spark-streaming" % sparkVersion,
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-sql" % sparkVersion,
  "com.typesafe.play" %% "play-json" % playVersion,
  // https://mvnrepository.com/artifact/org.apache.spark/spark-streaming-kafka-0-10
  "org.apache.spark" %% "spark-streaming-kafka-0-10" % sparkVersion,
  // https://mvnrepository.com/artifact/org.mongodb/casbah
  "org.mongodb" %% "casbah" % "3.1.1" pomOnly(),
  // https://mvnrepository.com/artifact/com.typesafe/config
  "com.typesafe" % "config" % "1.2.1"
)
assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case x => MergeStrategy.first
}
From the console, we can run sbt assembly, and the jar file gets created under the target/scala-2.12/ path.
sbt assembly will create a fat jar. Here is an excerpt from the documentation:
sbt-assembly is a sbt plugin originally ported from codahale's assembly-sbt, which I'm guessing was inspired by Maven's assembly plugin. The goal is simple: Create a fat JAR of your project with all of its dependencies.
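Once the assembly finishes, the fat jar can be run directly with java (the jar name below assumes the name and version from the build.sbt above; a Spark job would more typically be handed to spark-submit with the same jar):
java -jar target/scala-2.12/coursera-assembly-0.1.jar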

Can't install Scaladoc with SBT and Intellij

I am new to Scala and am currently trying to set up IntelliJ IDEA 13.1 with the Scala plugin. It has support for SBT. I have simply followed the basic tutorial for creating a new SBT project here: http://confluence.jetbrains.com/display/IntelliJIDEA/Getting+Started+with+SBT
Currently my build.sbt file is:
name := "scalasandpit"
version := "1.0"
scalaVersion := "2.10"
libraryDependencies += "org.scalatest" % "scalatest_2.10" % "2.1.0" % "test"
autoAPIMappings := true
This pulls down various jar binaries, but no sources and no javadoc. I wondered if there is a way to have both sources and javadoc work with IntelliJ and SBT. I think I'm missing something.
There seem to be two issues: getting sbt to pull down sources and docs, and then getting IDEA to show them to you. To solve the former, see the sbt documentation -- about halfway down there's a section called "Download Sources" which tells you what to add to your build.sbt:
libraryDependencies +=
"org.scalatest" % "scalatest_2.10" % "2.1.0" % "test" withSources() withJavadoc()