spark-notebook: command not found - scala

I want to set up Spark Notebook on my laptop, following the instructions at http://spark-notebook.io. I ran the command bin/spark-notebook and I'm getting:
-bash: bin/spark-notebook: command not found
How can I resolve this? I want to run spark-notebook with standalone Spark and Scala.

You can download
spark-notebook-0.7.0-pre2-scala-2.10.5-spark-1.6.3-hadoop-2.7.2-with-parquet.tgz
and set the path in your .bashrc.
Example:
$ sudo gedit ~/.bashrc
export SPARK_HOME=/your/path/
export PATH=$PATH:$SPARK_HOME/bin
Then start your notebook with the following command:
$ spark-notebook
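The error above usually just means the shell cannot resolve the executable from the current directory or PATH. A tiny self-contained sketch of the PATH mechanics, using a stand-in script rather than the real notebook launcher (the directory name here is hypothetical):

```shell
# Create a stand-in for the extracted notebook directory.
SPARK_HOME="$HOME/spark-notebook-demo"        # hypothetical install dir
mkdir -p "$SPARK_HOME/bin"
printf '#!/bin/sh\necho notebook started\n' > "$SPARK_HOME/bin/spark-notebook"
chmod +x "$SPARK_HOME/bin/spark-notebook"

# Once the bin directory is on PATH, the bare command name resolves.
export PATH="$PATH:$SPARK_HOME/bin"
command -v spark-notebook    # prints the resolved path
spark-notebook               # prints: notebook started
```

The same mechanics apply to the real launcher: as long as $SPARK_HOME/bin is on PATH, spark-notebook can be started from any directory.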

Related

Bun Not Found After Running Installation Script

I ran the installation script by pasting this code:
$ curl https://bun.sh/install | bash
However, when I try to get the version of bun, it says it could not find it:
$ bun --version
Command 'bun' not found, did you mean:
command 'ben' from deb ben (0.9.0ubuntu2)
command 'bus' from deb atm-tools (1:2.5.1-4)
command 'zun' from deb python3-zunclient (4.0.0-0ubuntu1)
Try: sudo apt install <deb name>
I had the same issue running Bun v0.1.5 on Windows 10 under WSL2 (Ubuntu 22.04).
The solution (with a little more detail in case anyone needs it) is below:
The bun executable is installed under "/home/username/.bun". You need to add that directory to your $PATH so the shell can find it when you type bun commands such as "bun --help".
The bun setup process does not add this path, so you need to do it yourself.
There are two ways to do this:
(1) Manual method
Type in the terminal:
export BUN_INSTALL="/home/YOUR_USERNAME/.bun"
export PATH="$BUN_INSTALL/bin:$PATH"
Replace YOUR_USERNAME with your real username (you can find the current username by typing 'whoami' in the terminal).
Note: This process will have to be REPEATED for every new shell you open.
(2) Automatic method
Edit the .bashrc file:
nano ~/.bashrc
At the end of this file add:
BUN_INSTALL="/home/YOUR_USERNAME/.bun"
PATH="$BUN_INSTALL/bin:$PATH"
Replace YOUR_USERNAME with your real username (you can find the current username by typing 'whoami' in the terminal).
(Remember to save your changes with Ctrl-O.)
Note: You will NEED TO OPEN A NEW SHELL for this to work, OR type 'source ~/.bashrc' to use it in the current terminal.
You should now be able to run bun commands in any new shell.
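The same edit can be made without opening an editor by appending the two lines from a terminal; a sketch, assuming the default ~/.bun install location (note the single quotes, which keep the text literal inside .bashrc):

```shell
# Set the variables for the current session...
export BUN_INSTALL="$HOME/.bun"
export PATH="$BUN_INSTALL/bin:$PATH"
# ...and persist them for future shells.
echo 'export BUN_INSTALL="$HOME/.bun"' >> ~/.bashrc
echo 'export PATH="$BUN_INSTALL/bin:$PATH"' >> ~/.bashrc
```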
The installation script prints a message at the end telling you how to add bun to your PATH manually. Here is that output:
Manually add the directory to your $HOME/.bashrc (or similar)
BUN_INSTALL="/home/sno2/.bun"
PATH="$BUN_INSTALL/bin:$PATH"
I advise re-running the installation command, copying the environment variables it prints, and adding them to your PATH.
export BUN_INSTALL="/Users/manendra/.bun"
export PATH="$BUN_INSTALL/bin:$PATH"
Add these to your .bashrc or .zshrc, or run the export commands directly to use them in the current session.
Note: change the username in place of (manendra) in "/Users/manendra/.bun".
Manually add the directory to ~/.bashrc (or similar):
export BUN_INSTALL="$HOME/.bun"
export PATH="$BUN_INSTALL/bin:$PATH"
The installer's last message is:
To get started, run
exec /bin/zsh
bun --help

set PYSPARK_SUBMIT_ARGS="--name" "PySparkShell" "pyspark-shell" && jupyter notebook

I'm looking to install PySpark on my Windows 10 machine and have been unable to correctly specify the PYSPARK_SUBMIT_ARGS argument.
This is the error I'm seeing when I run the "pyspark" command from gitbash:
$ pyspark
set PYSPARK_SUBMIT_ARGS="--name" "PySparkShell" "pyspark-shell" && jupyter notebook
I've uninstalled all versions of Java, except version 8. Within my .bashrc file, my path is currently specified as:
export JAVA_HOME="C:\PROGRA~2\Java\jre1.8.0_261"
export PYSPARK_SUBMIT_ARGS="--master local[*] pyspark-shell"
export PYSPARK_PYTHON=python3
export PYSPARK_DRIVER_PYTHON_OPTS="notebook"
export PYSPARK_DRIVER_PYTHON="jupyter"
export SPARK_HOME="C:/spark/spark-2.4.7-bin-hadoop2.7"
export PATH=$SPARK_HOME/bin:$PATH
And JAVA_HOME is specified within my env variables and set in Path as well.
I would really appreciate any additional troubleshooting techniques!
Thank you so much!!!
Try running it from the Windows Command Prompt; there are issues with Git Bash.
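Before switching shells, it can help to confirm which variables Git Bash actually exports, since a child process like pyspark only sees exported variables. A sketch (the export line just seeds an example value from the question):

```shell
# Export one of the variables from the question, then list every
# Spark/Java-related variable this shell would pass to child processes.
export SPARK_HOME="C:/spark/spark-2.4.7-bin-hadoop2.7"
env | grep -E 'SPARK|PYSPARK|JAVA'
```

If your .bashrc exports are missing from this output, the file was never sourced by the shell you are using.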

Running ./pyspark fails to find local directories

After installing Spark I am trying to run PySpark from the installation folder:
opt/spark/bin/pyspark
But I get the following errors:
opt/spark/bin/pyspark: line 24: /opt/spark/bin/load-spark-env.sh: No such file or directory
opt/spark/bin/pyspark: line 68: /opt/spark/bin/spark-submit: No such file or directory
opt/spark/bin/pyspark: line 68: exec: /opt/spark/bin/spark-submit: cannot execute: No such file or directory
Why is this happening when I can see these items in their respective directories? I'm also trying to get PySpark to run standalone as a command, but I'd imagine that I must solve the former problem first.
I am running this on macOS.
This error indicates that SPARK_HOME is not set. Try this:
export SPARK_HOME=/opt/spark
pyspark
FYI, on macOS it is strongly recommended to install software using a package manager such as Homebrew (https://brew.sh).
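To avoid retyping the export in every new session, the same lines can be appended to the shell profile; a sketch assuming Spark really is under /opt/spark as above (modern macOS defaults to zsh, so ~/.zshrc; use ~/.bashrc for bash):

```shell
echo 'export SPARK_HOME=/opt/spark' >> ~/.zshrc
echo 'export PATH="$SPARK_HOME/bin:$PATH"' >> ~/.zshrc
# Open a new terminal (or source the file) for the change to take effect.
```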
This is the configuration:
export SPARK_HOME=<YOUR-PATH>/spark-2.4.4-bin-hadoop2.7
export PYTHONPATH=$SPARK_HOME/python:$PYTHONPATH
export PYTHONPATH=$SPARK_HOME/python/lib/py4j-0.10.7-src.zip:$PYTHONPATH
And if you are planning to use a notebook as well:
export PYSPARK_DRIVER_PYTHON="jupyter"
export PYSPARK_DRIVER_PYTHON_OPTS="notebook"
export PYSPARK_PYTHON=python3
export PATH=$SPARK_HOME:$PATH:~/.local/bin:$JAVA_HOME/bin:$JAVA_HOME/jre/bin

Executing terminal commands in Jupyter notebook

I am trying to run the following in Jupyter notebook (with Python 2 if it makes a difference):
!head xyz.txt
and I get the following error:
'head' is not recognized as an internal or external command, operable
program or batch file.
Is there anything I need to import to be able to do this?
Might be useful for others.
Use ! followed by the terminal command you want to execute. E.g.,
! pip install some_package
installs some_package.
An easier way to invoke a terminal from Jupyter notebooks is the %%bash magic function, which lets you use the cell as a terminal:
%%bash
head xyz.txt
pip install keras
git add model.h5.dvc data.dvc metrics.json
git commit -m "Second model, trained with 2000 images"
For Windows it would be %%cmd.
Write it at the beginning of the cell, like this:
%%cmd
where python
myprogram "blabla" -x -y -z
You can start the cell with the %%bash cell magic before the rest of your code. There is an example in this blog post, together with a list of some of the most useful magics.
Make sure you run your command in a Linux shell, because there is no such command on Windows.
Another option nowadays is the Jupyter kernel for Bash.
I had the same issue. Solved by running
!bash -c "head xyz.txt"

mahout: command not found

I am following the Mahout cookbook, and while running this command:
mahout seqdirectory -i $WORK_DIR/original -o $WORK_DIR/sequencesfiles
I am getting an error:
mahout: command not found
I am a beginner with Mahout. Kindly help in solving this error.
The reason for this is that the Mahout executable is not in your PATH. I find the easiest way to solve this to be the following:
set the MAHOUT_HOME environment variable. E.g: from the terminal, type export MAHOUT_HOME=<your Mahout home dir>. This is required in other use cases and it's generally a good idea to set it.
Now you can execute the command as $MAHOUT_HOME/bin/mahout ..., but we can do better.
Add the Mahout executable's dir to your PATH, e.g: export PATH=$PATH:$MAHOUT_HOME/bin. Now the mahout executable will be available from the command line as simply mahout.
And lastly, to avoid having to remember to do this at the beginning of every terminal session, update your .bashrc file to load these settings automatically. E.g: vi ~/.bashrc to enter vi and start editing your .bashrc file.
type "i" to enter "insert mode"
add the export MAHOUT_HOME=<your Mahout home dir> line
add the export PATH=$PATH:$MAHOUT_HOME/bin line
hit the Esc key to exit "insert mode"
type "ZZ" to save and quit vi
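If you would rather not use vi, the same two lines can be appended with a heredoc; /opt/mahout below is a hypothetical install directory, so substitute your own:

```shell
# Append both exports to .bashrc in one step; the quoted 'EOF' keeps
# $PATH and $MAHOUT_HOME literal so they expand when the file is sourced.
cat >> ~/.bashrc <<'EOF'
export MAHOUT_HOME=/opt/mahout
export PATH=$PATH:$MAHOUT_HOME/bin
EOF
```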
Your mahout executable is not on the PATH. Try running mahout from the bin directory using the following syntax:
./mahout seqdirectory -i $WORK_DIR/original -o $WORK_DIR/sequencesfiles