How to put org.apache in Scala import path?

I have Spark 1.6.0 installed and would like to import it in Scala 2.11. I can use spark-shell, which has org.apache on its classpath. How do I put it on the classpath of my system installation of Scala?

Related

Why am I not able to find spark.implicits._ in the Scala REPL / shell?

I am on macOS. In the plain Scala REPL:
scala> import spark.implicits._
              ^
error: not found: value spark
Why does this happen?
$ scala
Welcome to Scala 2.13.2 (OpenJDK 64-Bit Server VM, Java 13.0.2).
Type in expressions for evaluation. Or try :help.

$ java -version
openjdk version "14.0.1" 2020-04-14
OpenJDK Runtime Environment (build 14.0.1+7)
OpenJDK 64-Bit Server VM (build 14.0.1+7, mixed mode, sharing)
How can I solve this problem? If I try the same import in spark-shell:
scala> import spark.implicits._
it works fine.
If you're running the plain Scala REPL, you have to add the jar to the classpath by executing:
:require <path-to-jar>
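For example, with Spark 1.6's assembly jar (the path below is an assumption; adjust it to your installation, and note the jar must be built for the same Scala binary version as your REPL):
scala> :require /opt/spark-1.6.0/lib/spark-assembly-1.6.0-hadoop2.6.0.jar
scala> import org.apache.spark.SparkContext
import org.apache.spark.SparkContext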
An import in Scala is a mechanism for referring to entities such as packages, classes, objects, instances, fields and methods by shorter names.
import spark.implicits._ is provided by Spark: in Spark 2.x and later, spark is the SparkSession value that spark-shell creates for you, and the implicits it exposes come from the Spark libraries.
spark-shell is simply a Spark-aware wrapper around the Scala REPL; the equivalent session object is also created for the Java and Python interfaces (the Python shell being known as pyspark).
The standalone scala binary, on its own, has no notion of where that import comes from: it is not part of the Scala distribution's binaries, APIs, classes, or libraries.
If you use tools like sbt or IntelliJ, you declare Spark as a dependency, which allows that import to be found and resolved. A sketch of the missing pieces follows.
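For example, with a matching Spark build on the classpath (say, via the sbt dependency libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.8"; version illustrative, Spark 2.x or later), you can recreate what spark-shell sets up for you:
import org.apache.spark.sql.SparkSession

// spark-shell creates a SparkSession named `spark` for you;
// in a plain REPL or program you must build the session yourself.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("repl-session")
  .getOrCreate()

// Now the import resolves, because `spark` is a value in scope.
import spark.implicits._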

I am getting an error in my Scala program while integrating Spark Streaming with Kafka

I am trying to add some imports, but when I add
import org.apache.spark.streaming.kafka.KafkaUtils
it shows the error below:
object kafka is not a member of package org.apache.spark.streaming
I am working in Eclipse with:
Scala IDE 4.7
Scala version 2.11.11
spark-2.3.0-bin-hadoop2.7 jar files
Kafka 2.11 jars
spark-streaming-kafka-0-10_2.11-2.3.0 jar
If you are using the spark-streaming-kafka-0-10_2.11-2.3.0 jar, then KafkaUtils lives in the org.apache.spark.streaming.kafka010 package.
So import
import org.apache.spark.streaming.kafka010.KafkaUtils
and not
import org.apache.spark.streaming.kafka.KafkaUtils
Hope this helps!
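For reference, a minimal sketch of using KafkaUtils from the kafka010 package with these versions; the broker address, topic, and group id below are placeholders:
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

object KafkaStreamSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("kafka-sketch").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(5))

    // Placeholder Kafka connection settings.
    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "localhost:9092",
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "example-group"
    )

    // createDirectStream lives in kafka010, not in the old kafka package.
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("events"), kafkaParams))

    stream.map(_.value).print()
    ssc.start()
    ssc.awaitTermination()
  }
}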

How to integrate the Jupyter notebook Scala kernel with Apache Spark?

I have installed the Scala kernel based on this doc: https://github.com/jupyter-scala/jupyter-scala
The kernel is there:
$ jupyter kernelspec list
Available kernels:
python3 /usr/local/homebrew/Cellar/python3/3.6.4_2/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/ipykernel/resources
scala /Users/bobyfarell/Library/Jupyter/kernels/scala
When I try to use Spark in the notebook I get this:
val sparkHome = "/opt/spark-2.3.0-bin-hadoop2.7"
val scalaVersion = scala.util.Properties.versionNumberString
import org.apache.spark.ml.Pipeline
Compilation Failed
Main.scala:57: object apache is not a member of package org
; import org.apache.spark.ml.Pipeline
^
I tried:
Setting SPARK_HOME and CLASSPATH to the location of $SPARK_HOME/jars
Setting -cp option pointing to $SPARK_HOME/jars in kernel.json
Setting classpath.add call before imports
None of these helped. Please note I don't want to use Toree, I want to use standalone spark and Scala kernel with Jupyter. A similar issue is reported here too: https://github.com/jupyter-scala/jupyter-scala/issues/63
It doesn't look like you are following the jupyter-scala directions for using Spark: you have to load Spark into the kernel using its special dependency imports.
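The jupyter-scala kernel is Ammonite-based, so Spark has to be fetched with Ammonite's dependency-loading magic imports inside a cell rather than picked up from SPARK_HOME. A minimal sketch (the version is an assumption; the project's README describes the full Spark setup):
// In a notebook cell: resolve Spark from Maven Central.
import $ivy.`org.apache.spark::spark-sql:2.3.0`

// Once the artifacts are resolved, regular imports work:
import org.apache.spark.ml.Pipeline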

Eclipse says "apache is not a member of package org"

I am using Scala on Eclipse Luna and trying to connect to Cassandra. My code shows the error object apache is not a member of package org on the following line:
import org.apache.spark.SparkConf
I already imported the Scala and Spark libraries into the project. Does someone know how I can make my program import the Spark libraries?

creating and using standalone scalaz jar without sbt

I've downloaded the scalaz snapshot from the repository (version 6.0.4).
I want to create a standalone jar file and put it into my Scala lib directory, so I can use scalaz without sbt.
I have the Scala distribution from scala-lang.org, stored in /opt/scala.
So far I did the following:
went to the untarred scalaz directory
ran sbt from the scalaz project
compiled the scalaz project
made a package (with the package command)
sbt produced a jar: full/target/scala-2.9.1/scalaz-full_2.9.1-6.0.4-SNAPSHOT.jar
it also produced another jar: full/lib/sxr_2.9.0-0.2.7.jar
I moved both jars to /opt/scala/lib.
After this I tried the Scala REPL, and I can't import scalaz. I tried import scalaz._, Scalaz._, org.scalaz._ and scalaz-core._, and none of them work.
After typing import scalaz, the REPL's code completion suggests scalaz_2.9.1-6.0.4-SNAPSHOT.
But import scalaz_2.9.1-6.0.4-SNAPSHOT._ doesn't work either.
Any idea?
You can download scalaz and extract the jar that contains scalaz-core_2.9.1-6.0.3.jar, or download scalaz-core directly.
Then you can use scala -cp scalaz-core_2.9.1-6.0.3.jar to launch the REPL, and finally import scalaz._ as expected.
If you want to use the jar produced by sbt, you can find it in core/target/scala-2.9.1/scalaz-core_2.9.1-6.0.4-SNAPSHOT.jar (you will also find the source and javadoc packages in the same directory). Just put this file on your classpath (using scala -cp, for example) and you will be able to import scalaz._.
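For example (paths illustrative), launching the REPL with the core jar on the classpath:
$ scala -cp scalaz-core_2.9.1-6.0.4-SNAPSHOT.jar
scala> import scalaz._
import scalaz._
scala> import Scalaz._
import Scalaz._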
I think I know the problem.
scalaz-full_2.9.1-6.0.4-SNAPSHOT.jar is not a jar of compiled class packages; it's just a zip of the scalaz project, so it does not contain a package-like directory tree (e.g. some of its directory names contain '.').
So to use it we need to unpack scalaz-full_2.9.1-6.0.4-SNAPSHOT.jar and copy the desired jars (e.g. scalaz-core_2.9.1-6.0.4-SNAPSHOT.jar, scalaz-http_2.9.1-6.0.4-SNAPSHOT.jar, ...) to the lib directory.
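A minimal sketch of that unpack-and-copy step, assuming the nested jars sit at the top level of the archive (the internal layout of the full jar may differ):
$ unzip scalaz-full_2.9.1-6.0.4-SNAPSHOT.jar -d scalaz-full
$ cp scalaz-full/scalaz-core_2.9.1-6.0.4-SNAPSHOT.jar /opt/scala/lib/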