Locate Scala installation for Spark - scala

I'm using EMR and can launch spark-shell, but I want to run the Scala REPL. Currently, when I type the scala command in the shell, it says:
-bash: scala: command not found
How can I locate and run the Scala REPL, given that Spark is already installed and configured?

As @ernest_k already said, EMR doesn't come with a Scala installation.
That said, I used the steps from this blog to install Scala on EMR:
# perform customary yum update
sudo yum update
# download Scala installer RPM
wget https://downloads.lightbend.com/scala/2.13.0-M5/scala-2.13.0-M5.rpm
# install Scala RPM
sudo yum install scala-2.13.0-M5.rpm
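Before picking an RPM, it can help to know which Scala version the installed Spark was built against, so the standalone REPL matches it. The following is a minimal sketch, assuming Spark keeps its bundled scala-library jar under a jars directory inside the Spark home (the layout used by Spark 2.x and later). The function name spark_scala_version is made up for illustration, and the /usr/lib/spark path in the usage note is an assumption about EMR.

```shell
# Print the Scala version bundled with a Spark installation by reading the
# scala-library jar name under <spark_home>/jars.
# Hypothetical helper; the jars/ layout is an assumption (Spark 2.x and later).
spark_scala_version() {
  local spark_home="${1:-$SPARK_HOME}"
  local jar
  for jar in "$spark_home"/jars/scala-library-*.jar; do
    [ -e "$jar" ] || continue          # glob did not match anything
    jar="${jar##*/scala-library-}"     # strip directory and file name prefix
    printf '%s\n' "${jar%.jar}"        # strip the .jar extension
    return 0
  done
  echo "no scala-library jar found under $spark_home/jars" >&2
  return 1
}
```

On EMR this would be called as spark_scala_version /usr/lib/spark, if Spark lives there on your cluster.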

Related

sbt installation does not set up scala; can't run scala files

I followed instructions on this site because I didn't want to work with intellij.
With this installation, scala is not available as a command. How would I go about running .scala files?
Installing sbt doesn't give you the scala REPL. If sbt is on your PATH, you can run sbt console to get a REPL and try out simple Scala expressions.
Otherwise you need to install scala separately.
The easy way to install scala and sbt is to use sdkman. Follow steps here.
curl -s "https://get.sdkman.io" | bash
source "$HOME/.sdkman/bin/sdkman-init.sh"
sdk install sbt
sdk install scala
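After the install, you may want to confirm that both commands actually resolve in your current shell. A small sketch, using only the standard command -v builtin; the helper name check_tools is made up for illustration.

```shell
# Report where each given command resolves on the current PATH.
# Returns non-zero if at least one command is missing.
check_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool: $(command -v "$tool")"
    else
      echo "$tool: not found"
      missing=1
    fi
  done
  return "$missing"
}
```

Run it as check_tools sbt scala. If either shows "not found" right after installing, open a new terminal or source the sdkman init script again.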

sudo apt-get install scala not found on ubuntu

I am completely stumped as to why this simple install is not working.
I am running Ubuntu 16.04 on VirtualBox on Windows, and when I run
sudo apt-get install scala
I get the E: Unable to locate package scala error.
I ran sudo apt-get update beforehand, and sudo apt-get install default-jdk worked
perfectly fine.
Does anyone have any idea why my ubuntu can't find the scala package?
There is a way to apt-get Scala, but you have to add the bintray repository to your sources file.
deb https://dl.bintray.com/sbt/debian / # access to Scala deb packages
I recommend against installing Scala with apt install scala, as the versions can be out of date (on Ubuntu 18.04 I got v2.11, while v2.13 is available). Instead go to the Scala webpage and install from there. If you want to be able to run
$ scala
and get the REPL, one option is to look for the .deb package on the install page and use that.
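To see whether the packaged version is older than the one you want, a version comparison helps. This is a sketch, assuming GNU sort with the -V (version sort) option, which Ubuntu ships; the helper name version_lt is made up for illustration.

```shell
# Succeed when version $1 sorts strictly before version $2.
# Relies on GNU sort -V (version sort).
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}
```

For example, version_lt 2.11.12 2.13.0 succeeds, which is the situation described above. The version apt would install appears on the Candidate line of apt-cache policy scala.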

How to build Spark from the sources from the Download Spark page?

I tried to install and build Spark 2.0.0 on Ubuntu VM with Ubuntu 16.04 as follows:
Install Java
sudo apt-add-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
Install Scala
Go to their Downloads tab on their site: scala-lang.org/download/all.html
I used Scala 2.11.8.
sudo mkdir /usr/local/src/scala
sudo tar -xvf scala-2.11.8.tgz -C /usr/local/src/scala/
Modify the .bashrc file and include the path for scala:
export SCALA_HOME=/usr/local/src/scala/scala-2.11.8
export PATH=$SCALA_HOME/bin:$PATH
then type:
. .bashrc
Install git
sudo apt-get install git
Download and build spark
Go to: http://spark.apache.org/downloads.html
Download Spark 2.0.0 (Build from Source - for standalone mode).
tar -xvf spark-2.0.0.tgz
cd into the Spark folder (that has been extracted).
now type:
./build/sbt assembly
After it's done installing, I get the message:
[success] Total time: 1940 s, completed...
followed by date and time...
Run Spark shell
bin/spark-shell
That's when all hell breaks loose and I start getting the error. I go into the assembly folder to look for a folder called target. But there's no such folder there. The only things visible in assembly are: pom.xml, README, and src.
I looked it up online for quite a while and I haven't been able to find a single concrete solution that would help solve the error. Can someone please provide explicit step-by-step instructions as to how to go about solving this ?!? It's driving me nuts now... (T.T)
Screenshot of the error:
For some reason, Scala 2.11.8 does not work well for the build, but if I switch to Scala 2.10.6 it builds properly. I guess the reason I needed Scala in the first place was to get access to sbt to build Spark. Once it's built, I go to the Spark folder and type:
build/sbt package
This builds the missing JAR files using Scala 2.11. Kind of weird, but that's how it works (judging by the logs).
Once Spark builds again, type bin/spark-shell (from the Spark folder) and you'll have access to the Spark shell.
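To confirm the build produced what spark-shell needs before launching it, a quick check can help. This is a sketch under an assumption: that the launch scripts of a source build read their jars from assembly/target/scala-<version>/jars, which matches the missing target folder described in the question. The helper name spark_build_jars_dir is made up for illustration.

```shell
# Print the jars directory of a Spark source build, if one exists.
# Assumed layout: <spark_dir>/assembly/target/scala-<version>/jars
spark_build_jars_dir() {
  local spark_dir="${1:-.}" dir
  for dir in "$spark_dir"/assembly/target/scala-*/jars; do
    if [ -d "$dir" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
  done
  echo "no built jars under $spark_dir/assembly/target; try build/sbt package" >&2
  return 1
}
```

Run it from the Spark folder as spark_build_jars_dir . and start bin/spark-shell only once it prints a directory.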
Type sbt package in the Spark directory, not in the build directory.
If your goal is really to build your custom Spark package from the sources you've downloaded from http://spark.apache.org/downloads.html, you should do the following instead:
./build/mvn -Phadoop-2.7,yarn,mesos,hive,hive-thriftserver -DskipTests clean install
You may want to read the official document Building Spark.
NB You don't have to install Scala and git packages to build Spark so you could have skipped "2. Install Scala" and "3. Install git" steps.

How do I install Scala in Jupyter IPython Notebook?

Here are a few links that I went to, and I did exactly what they said. I don't know what I'm doing wrong.
https://github.com/alexarchambault/jupyter-scala
https://github.com/ipython/ipython/wiki/IPython-kernels-for-other-languages
https://github.com/apache/incubator-toree
http://jcrudy.github.io/blog/html/2013/12/08/introduction_to_iscala.html
None of this is working. It may be some way that my node is configured. I just don't know. Please help.
I tried the following with Jupyterhub notebook and it works seamlessly:
# Step 1: Install spylon kernel
pip install spylon-kernel
# Step 2: create a kernel spec
python -m spylon_kernel install
# Step 3: start jupyter notebook
jupyter notebook
PS: to list all installed kernels, you can run the following command:
jupyter kernelspec list
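If you want to script that check, the listing can be filtered for a specific kernel. A minimal sketch: the helper name has_kernel is made up, and the kernel name spylon-kernel is an assumption about how the spec is registered, so check it against your own listing.

```shell
# Read `jupyter kernelspec list` output on stdin and succeed if a kernel
# with the given name appears in the first column.
has_kernel() {
  awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}
```

Usage: jupyter kernelspec list | has_kernel spylon-kernel && echo "kernel is registered"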
You can use the information given here.
Ensure you have IPython 3 installed. ipython --version should return a
value >= 3.0. If it's not the case, a quick way of setting it up
consists in installing the Anaconda Python distribution, and then
running
$ pip install --upgrade "ipython[all]"
ipython --version should then return a value >= 3.0.
Download the Jupyter Scala binaries for Scala 2.10 (txz or zip) or
Scala 2.11 (txz or zip), and unpack them in a safe place. Then run
once the jupyter-scala program (or jupyter-scala.bat on Windows) it
contains. That will set-up the Jupyter Scala kernel for the current
user.
Check that Jupyter/IPython knows about Jupyter Scala by running
$ jupyter kernelspec list
This should print, among others, a line like
scala211
(or scala210 depending on the Scala version you chose).
Then run either IPython console with
$ ipython console --kernel scala211
and start using the Jupyter Scala kernel straightaway, or run Jupyter
Notebook with
$ jupyter notebook
and create Scala 2.11 notebooks by choosing Scala 2.11 in the dropdown
in the upper right of the Jupyter Notebook start page.
Note: Since IPython has now been replaced by Jupyter, we replaced ipython in the above commands with jupyter.
I've just run:
conda create --name base2 --clone base to create an env just like base.
conda activate base2 to move to the new env.
conda install -c conda-forge spylon-kernel.
python -m spylon_kernel install --user to create a kernel spec for Jupyter Notebook.
jupyter-notebook
...and works just fine.
I'm using:
Anaconda 4.7.12
Jupyter-notebook 6.0.1
Ubuntu 18.04
ipykernel 5.1.3
ipython 7.9.0
ipython_genutils 0.2.0
jupyter_client 5.3.4
jupyter_core 4.6.0
traitlets 4.3.3
from def suma(a: Int) = a + 3
I can't add a comment to Heapify's answer, but his solution worked for JupyterLab on Windows without problems.
I cut and pasted his code into an Anaconda Powershell prompt
pip install spylon-kernel
python -m spylon_kernel install
jupyter notebook
Then I refreshed my Anaconda launcher, and the spylon project option was available.
The answer for Linux can be found here.
Install Scala. Add these lines to ~/.bashrc
export SCALA_HOME=/usr/local/share/scala
export PATH=$PATH:$SCALA_HOME/bin
Follow these instructions from the
GitHub site:
Download and unpack pre-packaged binaries Scala 2.11. Unpack each
downloaded archive(s), and, from a console, go to the bin
sub-directory of the directory it contains. Then run the following to
set-up the corresponding Scala kernel:
./jove-scala --kernel-spec
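For the ~/.bashrc step above, one optional refinement is to add the Scala bin directory only when it is not already on PATH, so that re-sourcing the file does not pile up duplicates. This is a sketch, not part of the original instructions; the helper name path_prepend is made up, and it puts the directory at the front of PATH rather than the end.

```shell
# Prepend a directory to PATH only if it is not already listed,
# so re-sourcing ~/.bashrc does not grow PATH with duplicates.
path_prepend() {
  case ":$PATH:" in
    *":$1:"*) ;;                 # already present: nothing to do
    *) PATH="$1:$PATH" ;;
  esac
}

export SCALA_HOME=/usr/local/share/scala   # same location as in the lines above
path_prepend "$SCALA_HOME/bin"
export PATH
```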
Make sure Spark is installed locally and that SPARK_HOME is set and exported in your .profile or environment file.
If not, you might get stuck with the following message:
"Intitializing Scala interpreter ..."
without any result.
On Mac, I needed only 3 commands to add Scala and run it with Spark (which I already had installed) in my Jupyter notebook:
pip install spylon-kernel
python -m spylon_kernel install
ipython notebook
Once you run them in your terminal, you'll have spylon-kernel in your notebook, which you can use as a Scala notebook.
spylon-kernel hasn't seen an update in years. These days it's much better to use almond.

After installing Scala using MacPorts, scala command is not found

I am running Snow Leopard and installed MacPorts. I then installed the latest (as of this writing) Scala version as:
$ sudo port install scala29
What to do after this? When I try to execute the Scala interpreter, I get:
-bash: scala: command not found
I'm using MacPorts 2.1.2 and things seem to have changed a bit again.
$ sudo port select --list scala
Shows
Available versions for scala:
none (active)
scala2.9
Command suggested by nezda does not work properly:
$ sudo port select --set scala2.9
gives error
Error: The 'set' command expects two arguments: <group>, <version>
But following helps
$ sudo port select --set scala scala2.9
Activates Scala 2.9
Selecting 'scala2.9' for 'scala' succeeded. 'scala2.9' is now active.
Checking scala again
$ sudo port select --list scala
Available versions for scala:
none
scala2.9 (active)
And I can run Scala now.
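If you want to check the selection from a script, the listing can be filtered for the entry marked active. A minimal sketch that relies only on the output format shown above; the helper name active_port_selection is made up for illustration.

```shell
# Read `port select --list <group>` output on stdin and print the entry
# marked "(active)".
active_port_selection() {
  awk '/\(active\)/ { print $1 }'
}
```

Usage: port select --list scala | active_port_selection. If it prints none, no Scala version has been selected yet.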
This seems to have changed. On Lion + MacPorts 2.1.1, I had to do the following:
Verify this shows the version:
sudo port select --list scala
Mine showed:
Available versions for scala:
none (active)
scala29
If it is not selected, you can use this command to select it:
sudo port select --set scala scala29
Open a new terminal (ensuring $PATH is up to date) and verify scala is now 2.9.x.
Okay, so I actually had to search for this, since the Scala install has changed since the last time I did it. The executables should have been linked from /opt/local/bin. To use them without prefixing the folders, do this:
cd /opt/local/bin
sudo scala_select scala29
Now you should be able to run the scala command from any directory.
As of January 2013 this answer is outdated; Arnost Valicek's answer is known to work.
I think it's:
sudo port select --set scala scala29