What controls the paths in Jupyter?

I have repeatedly tried to delete everything Jupyter-related from my system. Nevertheless, even after freshly installing Jupyter in a virtualenv, when I type jupyter --paths I get:
config:
/Users/$user/.jupyter
/Users/$user/env/etc/jupyter
/usr/local/etc/jupyter
/etc/jupyter
data:
/Users/$user/Library/Jupyter
/Users/$user/env/share/jupyter
/usr/local/share/jupyter
/usr/share/jupyter
runtime:
/Users/$user/Library/Jupyter/runtime
Where is this configured, and how do I reset or remove these paths? Where would this hidden config file be?
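For what it's worth, these paths are not read from one hidden file: jupyter_core computes them at runtime from platform defaults, the prefix of the active environment (hence the env/ entries), and a few environment variables. A minimal sketch of overriding them, assuming a POSIX shell:

export JUPYTER_CONFIG_DIR=~/custom/jupyter-config    # replaces ~/.jupyter as the user config dir
export JUPYTER_DATA_DIR=~/custom/jupyter-data        # replaces ~/Library/Jupyter as the user data dir
export JUPYTER_RUNTIME_DIR=~/custom/jupyter-runtime  # replaces the default runtime dir
jupyter --paths                                      # confirm the new search order

The /usr/local and /etc entries are system-wide defaults that are always searched; they are harmless unless files actually exist there.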

Related

Why Does the VS Code Jupyter Extension Keep Timing-out Trying to Find a Kernel That Exists?

I need to set up virtual environments for each language that I use. To do this, I'm running the Ubuntu 20.04 LTS Windows Subsystem for Linux (WSL) on Windows 10. Within WSL, I'm using Anaconda, installed in /usr/local/Anaconda, to create conda virtual environments for each language (e.g. one environment contains all my Python stuff, another contains my R stuff, etc.).
Since WSL doesn't come with a GUI, I'm using Visual Studio Code's (VSCode) Jupyter Notebook extension to run Jupyter notebooks and see plots/graphics. So far, I have managed to easily create conda environments for Python (with ipython and ipykernel) and R (with IRkernel) and run their code in a notebook via the extension. Each time I set up an environment, the extension easily finds the kernel, connects to it, and runs the code.
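For reference, the Python-side setup follows the usual ipykernel registration pattern; a sketch with illustrative names (py-env is a placeholder, not from the original post):

conda create -n py-env python ipykernel
conda activate py-env
# writes a kernelspec under ~/.local/share/jupyter/kernels that the extension can discover
python -m ipykernel install --user --name py-env --display-name "Python (py-env)"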
However, I've not been able to set up an environment for Julia. I followed the documentation on the Julia website for installing the kernel, which is successfully found by the extension. But when I try running a cell, the extension says it is trying to connect to the kernel, only for it to time out and fail.
Here are the steps I have taken so far:
Create a clean conda environment (conda create -n Julia && conda activate Julia)
Install the latest version of Julia (conda install -c conda-forge julia)
Install the latest version of Jupyter (conda install -c conda-forge jupyter)
Install the Julia kernel with the built-in Julia package manager (using Pkg; Pkg.add("IJulia"))
Build the IJulia package (using Pkg; Pkg.build("IJulia"))
Confirm the presence of the Julia kernel (jupyter kernelspec list), which indeed lists it
Reload the VSCode connection to WSL (Ctrl + Shift + P; >Reload Window)
Shut down WSL via CMD (wsl --shutdown) for changes to take effect and reconnect
After I restart VSCode and WSL, the extension shows an option to use the Julia kernel installed in my conda environment: Julia 1.7.2 (~/.conda/envs/Julia/bin/julia). But when I create a cell and run code in a notebook, the extension shows a popup saying that it is connecting to the kernel, and after some time an error message appears:
Failed to start the Kernel.
Unable to start Kernel `Julia 1.7.2` due to connection timeout.
View Jupyter log for further details
I can also see the kernel spec JSON file in ~/.local/share/jupyter/kernels/julia-1.7/kernel.json
{
"display_name": "Julia 1.7.2",
"argv": [
"/home/USER/.conda/envs/Julia/bin/julia",
"-i",
"--color=yes",
"--project=#.",
"/home/USER/.conda/envs/Julia/share/julia/packages/IJulia/AQu2H/src/kernel.jl",
"{connection_file}"
],
"language": "julia",
"env": {},
"interrupt_mode": "signal"
}
The log file starts showing problems here:
info 17:50:48.378: Process Execution: cwd: ~
cwd: ~
warn 17:50:48.893: StdErr from Kernel Process ERROR:
warn 17:50:49.138: StdErr from Kernel Process LoadError:
warn 17:50:49.795: StdErr from Kernel Process ArgumentError: Package IJulia not found in current path:
- Run `import Pkg; Pkg.add("IJulia")` to install the IJulia package.
The extension says it cannot find the IJulia kernel. This perplexes me because I can see the kernel spec in my home directory, the jupyter binary I installed from conda says that it's there, and the Jupyter Notebook extension can see the kernel. I have no explanation for why the extension can see the kernel and match up the kernelspec, but not be able to connect to it. Any help would be greatly appreciated!
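One way to narrow this down, as a hedged sketch reusing the paths from the kernel.json above: the ArgumentError in the log comes from the Julia process itself, so it can be reproduced outside VS Code by loading IJulia under the same project flag the kernelspec passes:

# should raise the same ArgumentError if the project resolved by --project=@. lacks IJulia
~/.conda/envs/Julia/bin/julia --project=@. -e 'using IJulia'
# compare against the default environment, where Pkg.add("IJulia") installed it
~/.conda/envs/Julia/bin/julia -e 'using IJulia; println("ok")'

If the first command fails while the second succeeds, the kernel's --project flag is pointing Julia at a project that does not list IJulia as a dependency.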

Jupyter Notebook PySpark Kernel referencing lowered pip version from host machine site-packages

I am using a Jupyter notebook provided by an AWS managed service called EMR Studio. My understanding of how these notebooks work is that they are hosted on EC2 instances that I provision as part of my EMR cluster, with the PySpark kernel specifically running on the task nodes.
Currently, when I run the command sc.list_packages() I see that pip is at version 9.0.1, whereas if I SSH onto the master node and run pip list I see that pip is at version 20.2.2. I have issues running the command sc.install_pypi_package() due to the older pip version in the notebook.
In a notebook cell, if I run import pip and then evaluate pip, I see that the module is located at:
<module 'pip' from '/mnt1/yarn/usercache/<LIVY_IMPERSONATION_ROLE>/appcache/application_1652110228490_0001/container_1652110228490_0001_01_000001/tmp/1652113783466-0/lib/python3.7/site-packages/pip/__init__.py'>
I assume this is most likely inside a virtualenv of some sort, running as an application on the task node? I am unsure of this, and I have no concrete evidence of how the virtualenv is provisioned, if there is one.
If I run sc.uninstall_package('pip') and then sc.list_packages(), I see pip at version 20.2.2, which is what I am looking to start off with. The module path is the same as previously mentioned.
How can I get pip 20.2.2 in the virtualenv instead of pip 9.0.1?
If I import a package like numpy, I see that the module is located somewhere different from where pip is. Any reason for this?
<module 'numpy' from '/usr/local/lib64/python3.7/site-packages/numpy/__init__.py'>
As for pip 9.0.1, the only reference I can find at the moment is in /lib/python2.7/site-packages/virtualenv_support/pip-9.0.1-py2.py3-none-any.whl. One directory above this I see a file called virtualenv-15.1.0-py2.7.egg-info which, if I cat it, states that it upgrades to pip 9.0.1. I have tried removing the pip 9.0.1 wheel file and replacing it with a pip 20.2.2 wheel, but that caused issues with the PySpark kernel provisioning properly. There is also a virtualenv.py file which references __version__ = "15.1.0".
I was able to find a solution for updating pip, setuptools, and wheel in the virtualenv that PySpark uses.
I first had to determine how pip 9 was being sourced. After SSHing to my EMR master node, I changed to the root directory (cd /) and ran sudo find . -name "pip*" to search recursively for pip files.
In my scenario there is a pip 9 wheel located at:
./usr/lib/python2.7/site-packages/virtualenv_support/pip-9.0.1-py2.py3-none-any.whl
By searching around a bit more in /usr/lib/python2.7/site-packages, I found a virtualenv.py that is being invoked to create the virtualenv; this is explained a bit more below.
Within the PySpark notebook session, using %%info shows that the virtualenv is created from this file path (thanks Parag):
'spark.pyspark.virtualenv.bin.path': '/usr/bin/virtualenv'
Running cat /usr/bin/virtualenv shows that virtualenv is invoked by the following script:
#!/usr/bin/python
import virtualenv
virtualenv.main()
This version of python in /usr/bin is python2.7. At the terminal I ran the following commands in sequence:
/usr/bin/python
import virtualenv
virtualenv
This outputs:
<module 'virtualenv' from '/usr/lib/python2.7/site-packages/virtualenv.py'>
I have sometimes seen a virtualenv.pyc file being used here (located in /usr/lib/python2.7/site-packages/), but other users suggest that .pyc files can safely be deleted.
On the EMR master node I ran the command /usr/bin/virtualenv, which shows some flags that can be used. First I used /usr/bin/virtualenv --verbose ./myVE, which shows that pip 9.0.1 is packaged into the virtualenv I created. If I run /usr/bin/virtualenv --verbose --download ./myVE2, the output shows that updated versions of pip, setuptools, and wheel are downloaded from Artifactory (our private PyPI mirror) into the virtualenv. There is an /etc/pip.conf that we use to set the index-url and trusted host so that Artifactory is used instead of PyPI.
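For reference, a minimal /etc/pip.conf of that shape, written from a shell; the hostname and repository path below are placeholders, not the real mirror:

sudo tee /etc/pip.conf <<'EOF'
[global]
index-url = https://artifactory.example.com/api/pypi/pypi-virtual/simple
trusted-host = artifactory.example.com
EOF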
At this point it seems that the EMR cluster's virtualenv.py, by default, does not download updated wheels from Artifactory/PyPI and instead uses the wheel files located in /usr/lib/python2.7/site-packages/virtualenv_support/*.whl.
Running cat /usr/lib/python2.7/site-packages/virtualenv.py shows that this version of virtualenv is 15.1.0, which is very outdated (a 2016 release).
Reading more into virtualenv.py shows that the main() function has a block of code as follows:
parser.add_option(
    "--download",
    dest="download",
    action="store_true",
    help="Download preinstalled packages from PyPI.",
)
I compared this virtualenv.py file on my EMR master node to the official release of virtualenv==15.1.0 from PyPI (https://pypi.org/project/virtualenv/15.1.0/). I downloaded the tar.gz file and unzipped it on my local machine; there is a virtualenv.py file in the unzipped folder. Comparing the two files with diff, only a couple of lines differ. The main difference is that the parser.add_option call shown above has default=True, in the official virtualenv.py, which the EMR cluster's copy lacks:
parser.add_option(
    "--download",
    dest="download",
    action="store_true",
    default=True,
    help="Download preinstalled packages from PyPI.",
)
From here, I copied the EMR cluster's virtualenv.py and updated the line of code to set default=True,. I then used this updated virtualenv.py in an EMR bootstrap script so that the file is replaced on all node types (master/core/task).
The bootstrap script does the following:
sudo rm /usr/lib/python2.7/site-packages/virtualenv.pyc
sudo rm /usr/lib/python2.7/site-packages/virtualenv.py
sudo aws s3 cp <UPDATED_VIRTUALENV_S3_PATH> /usr/lib/python2.7/site-packages/
Ensure that the file copied from S3 ends up named exactly virtualenv.py, to avoid any issues from the filename not being kept the same.
Now when I start up a PySpark kernel, the spark.pyspark.virtualenv.bin.path invokes the updated virtualenv.py, and I can confirm that pip is at a much higher version (20+), which is what I was looking to achieve.
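A quick way to sanity-check the patched default on any node, as a sketch reusing the paths above (/tmp/checkVE is a throwaway name):

/usr/bin/virtualenv --verbose /tmp/checkVE   # with default=True this should now fetch fresh wheels
/tmp/checkVE/bin/pip --version               # expect a 20.x version rather than 9.0.1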

Conda virtual environment for IPython

I'm fairly new to the Python scene. My problem is that when I launch a Jupyter notebook from an Anaconda PowerShell with my DataScience virtual environment activated, the notebook does not have my virtual environment in its PATH, and therefore cannot find some packages (like plotly and progress). The same is true when I launch VS Code from Anaconda Navigator with DataScience activated. When I run import plotly in an interactive window, I get ModuleNotFoundError: No module named 'plotly'. But when I run the same line in the terminal within VS Code, it runs without error.
So I have run the following commands in various shell/terminal sessions:
import sys
print(sys.path)
In a VS Code terminal I get:
['', 'C:\\Users\\adiad\\Anaconda3\\envs\\DataScience\\python37.zip', 'C:\\Users\\adiad\\Anaconda3\\envs\\DataScience\\DLLs', 'C:\\Users\\adiad\\Anaconda3\\envs\\DataScience\\lib', 'C:\\Users\\adiad\\Anaconda3\\envs\\DataScience', 'C:\\Users\\adiad\\Anaconda3\\envs\\DataScience\\lib\\site-packages']
In an interactive window in VS Code I get:
['C:\\Users\\adiad\\AppData\\Local\\Temp\\04e2b30c-4fc3-4aa9-9567-3aba17081a73', 'C:\\Users\\adiad\\Anaconda3\\python37.zip', 'C:\\Users\\adiad\\Anaconda3\\DLLs', 'C:\\Users\\adiad\\Anaconda3\\lib', 'C:\\Users\\adiad\\Anaconda3', '', 'C:\\Users\\adiad\\Anaconda3\\lib\\site-packages', 'C:\\Users\\adiad\\Anaconda3\\lib\\site-packages\\win32', 'C:\\Users\\adiad\\Anaconda3\\lib\\site-packages\\win32\\lib', 'C:\\Users\\adiad\\Anaconda3\\lib\\site-packages\\Pythonwin', 'C:\\Users\\adiad\\Anaconda3\\lib\\site-packages\\IPython\\extensions', 'C:\\Users\\adiad\\.ipython']
In a jupyter notebook running in my browser I get:
['C:\\Users\\adiad\\Anaconda3\\envs\\test', 'C:\\Users\\adiad\\Anaconda3\\python37.zip', 'C:\\Users\\adiad\\Anaconda3\\DLLs', 'C:\\Users\\adiad\\Anaconda3\\lib', 'C:\\Users\\adiad\\Anaconda3', '', 'C:\\Users\\adiad\\Anaconda3\\lib\\site-packages', 'C:\\Users\\adiad\\Anaconda3\\lib\\site-packages\\win32', 'C:\\Users\\adiad\\Anaconda3\\lib\\site-packages\\win32\\lib', 'C:\\Users\\adiad\\Anaconda3\\lib\\site-packages\\Pythonwin', 'C:\\Users\\adiad\\Anaconda3\\lib\\site-packages\\IPython\\extensions', 'C:\\Users\\adiad\\.ipython']
The IPython sessions don't appear to reference my virtual environment. So my question is: what do I need to do to make IPython run with the same environment as my terminal?
I found the following SO question which seems to answer my question, but I find it hard to believe that everyone is following this practice.
How to start an ipython shell(not notebook) within a conda or virtualenv
Here's my configuration:
conda version : 4.7.12
conda-build version : 3.18.8
python version : 3.7.3.final.0
virtual packages :
base environment : C:\Users\adiad\Anaconda3 (writable)
channel URLs : https://conda.anaconda.org/conda-forge/win-64
https://conda.anaconda.org/conda-forge/noarch
https://repo.anaconda.com/pkgs/main/win-64
https://repo.anaconda.com/pkgs/main/noarch
https://repo.anaconda.com/pkgs/r/win-64
https://repo.anaconda.com/pkgs/r/noarch
https://repo.anaconda.com/pkgs/msys2/win-64
https://repo.anaconda.com/pkgs/msys2/noarch
package cache : C:\Users\adiad\Anaconda3\pkgs
C:\Users\adiad\.conda\pkgs
C:\Users\adiad\AppData\Local\conda\conda\pkgs
envs directories : C:\Users\adiad\Anaconda3\envs
C:\Users\adiad\.conda\envs
C:\Users\adiad\AppData\Local\conda\conda\envs
platform : win-64
user-agent : conda/4.7.12 requests/2.22.0 CPython/3.7.3 Windows/10 Windows/10.0.18362
After doing further digging, my problem ought to be filed under "knowing enough to be dangerous." It was ultimately caused by the fact that the jupyter package hadn't yet been installed in my new environment. So whenever I attempted to launch an IPython session of some kind, either in VS Code or in a browser, the application would look in my environment, see that the IPython packages weren't installed, and then fall back to the "nearest" equivalent conda environment, which was the base environment. Hence most of the packages would load, but not all.
The fix to my problem was:
conda install jupyter
Another simple fix:
Launch a CMD.exe prompt from Anaconda Navigator
Install Jupyter: conda install jupyter
Then install the missing package: conda install plotly
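After installing, a quick sanity check from the activated environment that notebooks will now pick up the right interpreter (the expected path is inferred from the env name above):

conda activate DataScience
python -c "import sys; print(sys.executable)"   # expect a path ending in \envs\DataScience\python.exe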

NBExtension does not validate in Jupyter Notebook

I am running Jupyter 4.1.0, which I downloaded with Anaconda, on a Mac.
I have tried installing front-end extensions for Jupyter Notebook but cannot get the packages to validate. For example, after I installed calico-document-tools with:
jupyter nbextension install https://github.com/Calysto/notebook-extensions/archive/master.zip
When I try to validate it, I get the following.
$ jupyter nbextension enable calico-document-tools
Unrecognized JSON config file version, assuming version 1
Enabling notebook extension calico-document-tools...
- Validating: problems found:
- require? X calico-document-tools
What does this error message mean?
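A hedged reading of that output: the "require? X" line is the notebook's validator reporting that the require path recorded for the extension does not resolve to a JavaScript file under any nbextensions directory, often because the archive unpacked into a subdirectory. Two commands help check what actually landed on disk:

jupyter nbextension list                    # enabled extensions and their validation status
ls "$(jupyter --data-dir)/nbextensions"     # the directory the zip was actually unpacked into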

How do I install Scala in Jupyter IPython Notebook?

Here are a few links that I went to, and I did exactly what they said. I don't know what I'm doing wrong.
https://github.com/alexarchambault/jupyter-scala
https://github.com/ipython/ipython/wiki/IPython-kernels-for-other-languages
https://github.com/apache/incubator-toree
http://jcrudy.github.io/blog/html/2013/12/08/introduction_to_iscala.html
None of this is working. It may be something about the way my machine is configured. I just don't know. Please help.
I tried the following with a JupyterHub notebook and it works seamlessly:
# Step 1: Install spylon kernel
pip install spylon-kernel
# Step 2: create a kernel spec
python -m spylon_kernel install
# Step 3: start jupyter notebook
jupyter notebook
PS: to list all installed kernels, you can run the following command:
jupyter kernelspec list
You can use the information given here.
Ensure you have IPython 3 installed. ipython --version should return a value >= 3.0. If it's not the case, a quick way of setting it up is to install the Anaconda Python distribution and then run
$ pip install --upgrade "ipython[all]"
ipython --version should then return a value >= 3.0.
Download the Jupyter Scala binaries for Scala 2.10 (txz or zip) or Scala 2.11 (txz or zip), and unpack them in a safe place. Then run the jupyter-scala program (or jupyter-scala.bat on Windows) it contains once. That will set up the Jupyter Scala kernel for the current user.
Check that Jupyter/IPython knows about Jupyter Scala by running
$ jupyter kernelspec list
This should print, among others, a line like
scala211
(or scala210 depending on the Scala version you chose).
Then either run the IPython console with
$ ipython console --kernel scala211
and start using the Jupyter Scala kernel straight away, or run Jupyter Notebook with
$ jupyter notebook
and create Scala 2.11 notebooks by choosing Scala 2.11 in the dropdown in the upper right of the Jupyter Notebook start page.
Note: Since IPython has now been replaced by Jupyter, we replaced ipython in the above commands with jupyter.
I've just run:
conda create --name base2 --clone base to create an env just like base.
conda activate base2 to move to the new env.
conda install -c conda-forge spylon-kernel.
python -m spylon_kernel install --user to create a kernel spec for Jupyter Notebook.
jupyter-notebook
...and it works just fine.
I'm using:
Anaconda 4.7.12
Jupyter-notebook 6.0.1
Ubuntu 18.04
ipykernel 5.1.3
ipython 7.9.0
ipython_genutils 0.2.0
jupyter_client 5.3.4
jupyter_core 4.6.0
traitlets 4.3.3
Tested with a simple Scala definition in a cell: def suma(a: Int) = a + 3
I can't add a comment to Heapify's answer, but his solution worked for JupyterLab on Windows without problems.
I cut and pasted his code into an Anaconda PowerShell prompt:
pip install spylon-kernel
python -m spylon_kernel install
jupyter notebook
Then I refreshed my Anaconda launcher and the spylon project option was available.
The answer for Linux can be found here.
Install Scala. Add these lines to ~/.bashrc:
export SCALA_HOME=/usr/local/share/scala
export PATH=$PATH:$SCALA_HOME/bin:$PATH
Follow these instructions from the GitHub site:
Download and unpack the pre-packaged binaries for Scala 2.11. Unpack each downloaded archive, and, from a console, go to the bin sub-directory of the directory it contains. Then run the following to set up the corresponding Scala kernel:
./jove-scala --kernel-spec
Make sure Spark is installed locally and that SPARK_HOME is exported in your .profile/environment file.
If not, you might get stuck with the following message:
"Intitializing Scala interpreter ..."
without any result.
On a Mac, I needed only three commands to add Scala and run it with Spark (which I had already installed) in my Jupyter notebook:
pip install spylon-kernel
python -m spylon_kernel install
ipython notebook
Once you run them in your terminal, you'll have spylon-kernel available in your notebook, which can be used as a Scala notebook.
spylon-kernel hasn't seen an update in years. These days it's much better to use almond.
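A minimal almond setup sketch, assuming the coursier CLI (cs) is already on your PATH; the flags follow almond's quick-start and may change between releases:

cs launch almond -- --install   # registers an almond Scala kernel for the current user
jupyter kernelspec list         # the new kernel should show up here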