How to install sbt package for Jupyter notebook with Scala kernel? - scala

I am using a Jupyter notebook with a Scala kernel. How can I install an sbt package and load it in the notebook?
It seems this page is related: https://www.scala-sbt.org/1.x/docs/Scripts.html

Late answer, but you most likely want to use Almond as the Scala kernel. It lets you import dependencies (the same coordinates you would give sbt) via Ivy, e.g.:
import $ivy.`org.scalaz::scalaz-core:7.2.27`, scalaz._, Scalaz._

Related

EMR Notebook Scala kernel import graphframes library

Running spark-shell --packages "graphframes:graphframes:0.7.0-spark2.4-s_2.11" in the bash shell works and I can successfully import graphframes 0.7, but when I try to use it in a scala jupyter notebook like this:
import scala.sys.process._
"spark-shell --packages \"graphframes:graphframes:0.7.0-spark2.4-s_2.11\""!
import org.graphframes._
it gives the error message:
<console>:53: error: object graphframes is not a member of package org
import org.graphframes._
Which from what I can tell means that it runs the bash command, but then still cannot find the retrieved package.
I am doing this on an EMR Notebook running a spark scala kernel.
Do I have to set some sort of spark library path in the jupyter environment?
That simply cannot work. Your code just attempts to start a new, independent Spark shell. Furthermore, Spark packages have to be loaded when the SparkContext is initialized for the first time.
You should either add (assuming these are correct versions)
spark.jars.packages graphframes:graphframes:0.7.0-spark2.4-s_2.11
to your Spark configuration files, or set the equivalent on your SparkConf / SparkSession.Builder.config before the SparkSession is initialized.
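A minimal sketch of the SparkSession.Builder route (package coordinates taken from the question; note that on EMR Notebooks a session may already have been created for you, in which case the setting belongs in the cluster or notebook Spark configuration instead):

```scala
import org.apache.spark.sql.SparkSession

// Ask Spark to resolve the package at startup -- the programmatic
// equivalent of spark-shell --packages graphframes:graphframes:0.7.0-spark2.4-s_2.11
val spark = SparkSession.builder()
  .appName("graphframes-example")
  .config("spark.jars.packages", "graphframes:graphframes:0.7.0-spark2.4-s_2.11")
  .getOrCreate()

// Only after the session exists with that config can the package be imported
import org.graphframes._
```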

How to integrate Jupyter notebook scala kernel with apache spark?

I have installed Scala kernel based on this doc: https://github.com/jupyter-scala/jupyter-scala
Kernel is there:
$ jupyter kernelspec list
Available kernels:
python3 /usr/local/homebrew/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/ipykernel/resources
scala /Users/bobyfarell/Library/Jupyter/kernels/scala
When I try to use Spark in the notebook I get this:
val sparkHome = "/opt/spark-2.3.0-bin-hadoop2.7"
val scalaVersion = scala.util.Properties.versionNumberString
import org.apache.spark.ml.Pipeline
Compilation Failed
Main.scala:57: object apache is not a member of package org
; import org.apache.spark.ml.Pipeline
^
I tried:
Setting SPARK_HOME and CLASSPATH to the location of $SPARK_HOME/jars
Setting -cp option pointing to $SPARK_HOME/jars in kernel.json
Setting classpath.add call before imports
None of these helped. Please note I don't want to use Toree, I want to use standalone spark and Scala kernel with Jupyter. A similar issue is reported here too: https://github.com/jupyter-scala/jupyter-scala/issues/63
It doesn't look like you are following the jupyter-scala directions for using Spark. You have to load spark into the kernel using the special imports.
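With jupyter-scala's successor, Almond, the usual pattern is to pull Spark in through Ivy imports and then build the session with the notebook-aware helper from almond-spark. A sketch, where the version numbers are illustrative assumptions:

```scala
// Notebook cells -- the $ivy magic is specific to the Almond/Ammonite kernel
import $ivy.`org.apache.spark::spark-sql:2.4.0`
import $ivy.`sh.almond::almond-spark:0.6.0`

import org.apache.spark.sql._

// NotebookSparkSession comes from almond-spark and wires Spark's
// progress reporting into the notebook; a plain SparkSession.builder()
// should also work once the $ivy imports have resolved
val spark = NotebookSparkSession.builder()
  .master("local[*]")
  .getOrCreate()

import org.apache.spark.ml.Pipeline  // now resolves
```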

How do I add Akka library to jupyter-scala so I can use it in the notebook?

I want to use Akka inside jupyter notebook running jupyter-scala kernel. How do I do this?
Use classpath.add() to include the Akka library you want to use, then, as with any Scala program, import the classes as needed. classpath.add() takes one or more arguments that resemble sbt library dependencies. For example, if you want to use the akka-actor library, enter the following in your notebook:
classpath.add("com.typesafe.akka" %% "akka-actor" % "2.5.4")
import akka.actor._
If the classpath method doesn't work, you can add the library using Ivy:
import $ivy.`com.typesafe.akka::akka-actor:2.5.4`
Not sure if this is the best way, but it works.

How to put org.apache in Scala import path?

I have Spark 1.6.0 installed and would like to import it in Scala 2.11. I can use spark-shell, which has org.apache on its classpath. How do I put it on the classpath of my system installation of Scala?

creating and using standalone scalaz jar without sbt

I've downloaded scalaz snapshot from repository (version 6.0.4).
I want to create standalone jar file and put it into my scala lib directory to use scalaz without sbt.
I have the Scala package from scala-lang.org, stored in /opt/scala.
As far I did:
go to the untarred scalaz directory
run sbt from the scalaz project
compile the scalaz project
make a package (with the package command)
sbt makes a jar: full/target/scala-2.9.1/scalaz-full_2.9.1-6.0.4-SNAPSHOT.jar
it also produces another jar: full/lib/sxr_2.9.0-0.2.7.jar
I moved both jars to /opt/scala/lib
After this I start the Scala REPL and I can't import scalaz. I tried import scalaz._, Scalaz._, org.scalaz._, scalaz-core._ and none of them work.
After typing import scalaz, the REPL code completion suggests: scalaz_2.9.1-6.0.4-SNAPSHOT.
But import scalaz_2.9.1-6.0.4-SNAPSHOT._ doesn't work either.
Any idea?
You can download the scalaz distribution and extract scalaz-core_2.9.1-6.0.3.jar from it, or download scalaz-core directly.
Then launch the REPL with scala -cp scalaz-core_2.9.1-6.0.3.jar, and import scalaz._ works as expected.
If you want to use the jar produced by sbt, you can find it in core/target/scala-2.9.1/scalaz-core_2.9.1-6.0.4-SNAPSHOT.jar (you will also find source and javadoc packages in the same directory). Just put this file on your classpath (using scala -cp, for example) and you will be able to import scalaz._
I think I know the problem.
scalaz-full_2.9.1-6.0.4-SNAPSHOT.jar is not a jar of class files; it is just a zip of the scalaz project, so it does not contain a package-like directory tree (e.g. the directory names contain '.').
So to use it, we need to unpack scalaz-full_2.9.1-6.0.4-SNAPSHOT.jar and copy the desired jars (e.g. scalaz-core_2.9.1-6.0.4-SNAPSHOT.jar, scalaz-http_2.9.1-6.0.4-SNAPSHOT.jar, ...) to the lib directory.