Use notebook with slurm - visual-studio-code

I want to be able to connect to our institution's cluster using VS Code Remote SSH with the server running on a compute node instead of the login node. The preferred workflow is to SSH into the login node, use a command to allocate a job and spin up an interactive shell on a compute node, and then run the Jupyter notebook kernel on that node.
I launch jupyter-lab through this bash script:
#!/bin/bash
#SBATCH -J jupyterTest
#SBATCH -N 1
#SBATCH --mem=16GB
#SBATCH --time=5:00:00
#SBATCH --output=/home/adufour/work/jupyter.log
# Load necessary modules
module purge
source /home/adufour/.bashrc
conda activate singlecell
# Go to the folder you want to run Jupyter in
cd ~/work
# Start the notebook
jupyter lab --ip=0.0.0.0 --port=8888
Which gives me the following log:
[I 2021-02-16 16:14:29.363 ServerApp] jupyterlab | extension was successfully linked.
[I 2021-02-16 16:14:31.551 ServerApp] nbclassic | extension was successfully linked.
[I 2021-02-16 16:14:31.690 LabApp] JupyterLab extension loaded from /work/adufour/anaconda3/envs/singlecell/lib/python3.9/site-packages/jupyterlab
[I 2021-02-16 16:14:31.691 LabApp] JupyterLab application directory is /work/adufour/anaconda3/envs/singlecell/share/jupyter/lab
[I 2021-02-16 16:14:31.698 ServerApp] jupyterlab | extension was successfully loaded.
[I 2021-02-16 16:14:31.714 ServerApp] nbclassic | extension was successfully loaded.
[I 2021-02-16 16:14:31.719 ServerApp] Serving notebooks from local directory: /work/adufour
[I 2021-02-16 16:14:31.719 ServerApp] Jupyter Server 1.3.0 is running at:
[I 2021-02-16 16:14:31.719 ServerApp] http://node126:8888/lab?token=6a1178ff82901e42559423100b13ba12105935432fb0efdd
[I 2021-02-16 16:14:31.719 ServerApp] or http://127.0.0.1:8888/lab?token=6a1178ff82901e42559423100b13ba12105935432fb0efdd
[I 2021-02-16 16:14:31.719 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 2021-02-16 16:14:31.744 ServerApp]
To access the server, open this file in a browser:
file:///home/adufour/.local/share/jupyter/runtime/jpserver-108627-open.html
Or copy and paste one of these URLs:
http://node126:8888/lab?token=6a1178ff82901e42559423100b13ba12105935432fb0efdd
or http://127.0.0.1:8888/lab?token=6a1178ff82901e42559423100b13ba12105935432fb0efdd
Unable to connect to VS Code server.
Error in request
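The log above shows the server listening on node126:8888, which is normally only reachable from inside the cluster. One possible way to reach it from a local machine is an SSH port forward through the login node. The sketch below is a rough illustration, not a confirmed solution: it pulls the compute-node URL out of the Slurm log and builds the matching `ssh -L` command. The login host name `login.cluster.example.org` and the helper names are hypothetical, and it assumes the cluster allows port forwarding from the login node to compute nodes.

```python
import re
from typing import List, NamedTuple, Optional


class JupyterEndpoint(NamedTuple):
    host: str
    port: int
    token: str


# Matches URLs such as http://node126:8888/lab?token=<hex> in the server log.
URL_RE = re.compile(
    r"http://(?P<host>[\w.-]+):(?P<port>\d+)/lab\?token=(?P<token>[0-9a-f]+)"
)


def find_endpoint(log_text: str) -> Optional[JupyterEndpoint]:
    """Return the first non-loopback Jupyter URL found in the log, if any."""
    for m in URL_RE.finditer(log_text):
        if m["host"] not in ("127.0.0.1", "localhost"):
            return JupyterEndpoint(m["host"], int(m["port"]), m["token"])
    return None


def tunnel_command(ep: JupyterEndpoint, login_host: str,
                   local_port: int = 8888) -> List[str]:
    """Build an ssh command forwarding local_port to the compute node."""
    # -N: do not run a remote command, only forward the port.
    return ["ssh", "-N", "-L", f"{local_port}:{ep.host}:{ep.port}", login_host]


if __name__ == "__main__":
    sample = (
        "[I 2021-02-16 16:14:31.719 ServerApp] "
        "http://node126:8888/lab?token="
        "6a1178ff82901e42559423100b13ba12105935432fb0efdd"
    )
    ep = find_endpoint(sample)
    if ep is not None:
        # Hypothetical login host; replace with the real login node.
        print(" ".join(tunnel_command(ep, "login.cluster.example.org")))
        print(f"http://127.0.0.1:8888/lab?token={ep.token}")
```

With the tunnel running, the printed local URL can be opened in a browser or given to any client that accepts the URL of an existing Jupyter server.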

Related

Jupyter error when starting Kernel. Cell not executing just shows *

I use a custom kernel manager, which launches kernels as Docker containers.
I recently updated my system, but I can't find any Jupyter version change that would explain this:
Tornado 6.2
JupyterLab Version 3.2.9
Custom Kernels
When starting a new Kernel I get:
[I 2022-07-28 10:17:20.946 ServerApp] Kernel started: b2039b53-2ce0-48f1-b327-2b0393a53ff5
[W 2022-07-28 10:17:20.952 ServerApp] No session ID specified
NOTE: When using the `ipython kernel` entry point, Ctrl-C will not work.
To exit, you will have to explicitly quit this process, by either sending
"quit" from a client, or using Ctrl-\ in UNIX-like environments.
To read more about this, see https://github.com/ipython/ipython/issues/2049
To connect another client to this kernel, use:
--existing /connection-spec
[W 2022-07-28 10:18:20.954 ServerApp] Timeout waiting for kernel_info reply from b2039b53-2ce0-48f1-b327-2b0393a53ff5
[W 2022-07-28 10:18:25.473 ServerApp] Nudge: attempt 10 on kernel b2039b53-2ce0-48f1-b327-2b0393a53ff5
[W 2022-07-28 10:18:30.486 ServerApp] Nudge: attempt 20 on kernel b2039b53-2ce0-48f1-b327-2b0393a53ff5
[W 2022-07-28 10:18:35.499 ServerApp] Nudge: attempt 30 on kernel b2039b53-2ce0-48f1-b327-2b0393a53ff5
[W 2022-07-28 10:18:40.517 ServerApp] Nudge: attempt 40 on kernel b2039b53-2ce0-48f1-b327-2b0393a53ff5
[W 2022-07-28 10:18:45.533 ServerApp] Nudge: attempt 50 on kernel b2039b53-2ce0-48f1-b327-2b0393a53ff5
[W 2022-07-28 10:18:50.548 ServerApp] Nudge: attempt 60 on kernel b2039b53-2ce0-48f1-b327-2b0393a53ff5
[W 2022-07-28 10:18:55.562 ServerApp] Nudge: attempt 70 on kernel b2039b53-2ce0-48f1-b327-2b0393a53ff5
[W 2022-07-28 10:19:00.576 ServerApp] Nudge: attempt 80 on kernel b2039b53-2ce0-48f1-b327-2b0393a53ff5
[W 2022-07-28 10:19:05.592 ServerApp] Nudge: attempt 90 on kernel b2039b53-2ce0-48f1-b327-2b0393a53ff5
[W 2022-07-28 10:19:10.607 ServerApp] Nudge: attempt 100 on kernel b2039b53-2ce0-48f1-b327-2b0393a53ff5
[W 2022-07-28 10:19:15.624 ServerApp] Nudge: attempt 110 on kernel b2039b53-2ce0-48f1-b327-2b0393a53ff5
[W 2022-07-28 10:19:20.637 ServerApp] Nudge: attempt 120 on kernel b2039b53-2ce0-48f1-b327-2b0393a53ff5
[E 2022-07-28 10:19:20.960 ServerApp] Uncaught exception GET /api/kernels/b2039b53-2ce0-48f1-b327-2b0393a53ff5/channels (127.0.0.1)
HTTPServerRequest(protocol='http', host='127.0.0.1:8080', method='GET', uri='/api/kernels/b2039b53-2ce0-48f1-b327-2b0393a53ff5/channels', version='HTTP/1.1', remote_ip='127.0.0.1')
Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/tornado/websocket.py", line 944, in _accept_connection
await open_result
File "/opt/conda/lib/python3.7/asyncio/tasks.py", line 318, in __wakeup
future.result()
concurrent.futures._base.TimeoutError: Timeout
In the working case, I see the WebSocket open successfully, followed by:
[I 2022-07-28 10:23:49.698 ServerApp] Kernel started: 84d2de07-7ed5-46ba-8f85-c2bb583be71d
[W 2022-07-28 10:23:49.704 ServerApp] No session ID specified
NOTE: When using the `ipython kernel` entry point, Ctrl-C will not work.
To exit, you will have to explicitly quit this process, by either sending
"quit" from a client, or using Ctrl-\ in UNIX-like environments.
To read more about this, see https://github.com/ipython/ipython/issues/2049
To connect another client to this kernel, use:
--existing /connection-spec
[W 2022-07-28 10:23:53.058 ServerApp] No channel specified, assuming shell: {'header': {'msg_id': '6101c9180e5f11ed969a42010a800005', 'username': 'test', 'session': '6101cc4c0e5f11ed969a42010a800005', 'data': '2022-07-28T10:23:53.054348', 'msg_type': 'execute_request', 'version': '5.0'}, 'parent_header': {'msg_id': '6101c9180e5f11ed969a42010a800005', 'username': 'test', 'session': '6101cc4c0e5f11ed969a42010a800005', 'data': '2022-07-28T10:23:53.054348', 'msg_type': 'execute_request', 'version': '5.0'}, 'metadata': {}, 'content': {'code': 'print(123)', 'silent': False}}
Any suggestions?
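A "Timeout waiting for kernel_info reply" means the server never received an answer on the kernel's channels. When kernels run in containers, one thing worth ruling out is whether the ports listed in the kernel's connection file are reachable from the Jupyter server at all. The sketch below is only a diagnostic aid under assumptions: it assumes the kernel uses TCP transport and that the connection file uses the standard Jupyter keys (`ip`, `shell_port`, `iopub_port`, `stdin_port`, `control_port`, `hb_port`). The function names are hypothetical, and a reachable port does not prove the kernel is healthy.

```python
import json
import socket
from typing import Dict

# Port keys found in a standard Jupyter kernel connection file.
PORT_KEYS = ("shell_port", "iopub_port", "stdin_port", "control_port", "hb_port")


def load_connection_file(path: str) -> dict:
    """Read a kernel connection file (JSON) from disk."""
    with open(path) as f:
        return json.load(f)


def check_kernel_ports(conn_info: dict, timeout: float = 2.0) -> Dict[str, bool]:
    """Try a plain TCP connect to each kernel port; True means it accepted."""
    ip = conn_info.get("ip", "127.0.0.1")
    results: Dict[str, bool] = {}
    for key in PORT_KEYS:
        port = conn_info.get(key)
        if port is None:
            continue  # key missing from this connection file
        try:
            with socket.create_connection((ip, port), timeout=timeout):
                results[key] = True
        except OSError:
            # Covers refused connections, timeouts, and unreachable hosts.
            results[key] = False
    return results


if __name__ == "__main__":
    import sys

    if len(sys.argv) == 2:
        # Pass the path of the connection file used by the kernel.
        for name, ok in check_kernel_ports(load_connection_file(sys.argv[1])).items():
            print(f"{name}: {'reachable' if ok else 'NOT reachable'}")
```

If every port is reported as not reachable from the server's side, the container's network or port mapping is a more likely suspect than Jupyter itself.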

Opening JupyterHub URL on Kubernetes is very slow (fetching JS, icons)

I deployed JupyterHub on Kubernetes using Helm, and I can log in with the ID 'admin'.
However, when I first log in, the URL either doesn't respond or responds only after 30 to 50 seconds; it seems to fail to fetch a JavaScript file or an icon.
When I refresh the page, it works.
Is there a problem with the network in my Kubernetes cluster?
I'm using a GlusterFS StorageClass for dynamic provisioning.
This is the config file I used when installing JupyterHub with Helm:
proxy:
  secretToken: "34999170ac41826f956ee1a757b53ff91ce6efabc3dfe24fcee863955efcc6b9"
The pod's log looks like this (for user qqqqq):
[I 2020-12-23 05:22:21.664 SingleUserNotebookApp extension:158] JupyterLab extension loaded from /opt/conda/lib/python3.7/site-packages/jupyterlab
[I 2020-12-23 05:22:21.665 SingleUserNotebookApp extension:159] JupyterLab application directory is /opt/conda/share/jupyter/lab
[I 2020-12-23 05:22:22.015 SingleUserNotebookApp singleuser:561] Starting jupyterhub-singleuser server version 1.1.0
[I 2020-12-23 05:22:22.022 SingleUserNotebookApp notebookapp:1924] Serving notebooks from local directory: /home/jovyan
[I 2020-12-23 05:22:22.022 SingleUserNotebookApp notebookapp:1924] The Jupyter Notebook is running at:
[I 2020-12-23 05:22:22.022 SingleUserNotebookApp notebookapp:1924] http://jupyter-qqqqq:8888/user/qqqqq/
[I 2020-12-23 05:22:22.022 SingleUserNotebookApp notebookapp:1925] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[I 2020-12-23 05:22:22.038 SingleUserNotebookApp singleuser:542] Updating Hub with activity every 300 seconds
[I 2020-12-23 05:22:25.096 SingleUserNotebookApp log:174] 302 GET /user/qqqqq/ -> /user/qqqqq/tree? (#10.233.79.154) 0.93ms
[I 2020-12-23 05:22:25.165 SingleUserNotebookApp log:174] 302 GET /user/qqqqq/ -> /user/qqqqq/tree? (#10.233.93.0) 0.76ms
[I 2020-12-23 05:22:25.185 SingleUserNotebookApp log:174] 302 GET /user/qqqqq/tree? -> /hub/api/oauth2/authorize?client_id=jupyterhub-user-qqqqq&redirect_uri=%2Fuser%2Fqqqqq%2Foauth_callback&response_type=code&state=[secret] (#10.233.93.0) 2.31ms
[I 2020-12-23 05:22:25.561 SingleUserNotebookApp auth:981] Logged-in user {'kind': 'user', 'name': 'qqqqq', 'admin': False, 'groups': [], 'server': '/user/qqqqq/', 'pending': None, 'created': '2020-12-23T05:22:16.257525Z', 'last_activity': '2020-12-23T05:22:25.524384Z', 'servers': None}
[I 2020-12-23 05:22:25.562 SingleUserNotebookApp log:174] 302 GET /user/qqqqq/oauth_callback?code=[secret]&state=[secret] -> /user/qqqqq/tree? (#10.233.93.0) 250.52ms
[I 2020-12-23 05:22:25.654 SingleUserNotebookApp log:174] 200 GET /user/qqqqq/tree? (qqqqq#10.233.93.0) 71.92ms
I'm getting stuck at GET /user/qqqqq/tree? (the last line above).
Thanks for any advice!
Try looking at the events of the hub pod and of the user pods. The delay must come from allocating the PVC, pulling the image (only the first time, if it is not already present on the machine), or setting up the route in JupyterHub. Also check whether you've added any postHook in the user image config.
I don't know why, but after I changed the Kubernetes version from 1.16 to 1.17, it works fine.

PySpark kernel doesn't start on JupyterHub

The JupyterHub PySpark kernel used to work very well, but now it either will not start, or it shows a "Kernel connected" status (never going to idle) and will not run any code in cells. It is using the local process spawner with PAM auth on CentOS. In the JupyterHub logs we see these messages:
[I 2020-03-10 15:13:08.644 SingleUserNotebookApp restarter:110] KernelRestarter: restarting kernel (4/5), new random ports
Assertion failed: rc == 0 (src/socket_poller.cpp:41)
[W 2020-03-10 15:13:11.666 SingleUserNotebookApp restarter:100] KernelRestarter: restart failed
[W 2020-03-10 15:13:11.666 SingleUserNotebookApp kernelmanager:127] Kernel 0148c77d-143a-4721-90e7-0d9a41a878c4 died, removing from map.
Any thoughts?
This was resolved by reinstalling sparkmagic in the base conda environment:
(base) [root#]# pip install sparkmagic

Script hangs only when started via SystemD

I am trying to start the Kafka Connect component alongside Cloudera and want to wrap the Kafka Connect startup script in a systemd service file such that it can start on boot once the Cloudera services start.
For some strange reason, if I start this script outside of systemd it works just fine, but when I start it via systemctl start kafka-connect it hangs at the last of the log lines shown below, after the unit file.
[Unit]
Description=Kafka Connect - Distributed
After=network.target cloudera-scm-agent
[Service]
Type=forking
ExecStart=/bin/bash -c "/app/cloudera/parcels/KAFKA/lib/kafka/bin/connect-distributed.sh -daemon /app/cloudera/kafka-connect/connect-distributed.properties"
SuccessExitStatus=143
[Install]
WantedBy=multi-user.target
2019-07-18 13:40:27,962 INFO (main) [ConnectDistributed(main:69)] Scanning for plugin classes. This might take a moment ...
2019-07-18 13:40:27,992 INFO (main) [DelegatingClassLoader(registerPlugin:184)] Loading plugin from: /app/cloudera/kafka-connect/plugins/splunk-kafka-connect.jar
2019-07-18 13:40:27,993 DEBUG (main) [DelegatingClassLoader(registerPlugin:191)] Loading plugin urls: [file:/app/cloudera/kafka-connect/plugins/splunk-kafka-connect.jar]
2019-07-18 13:40:29,206 DEBUG (main) [VersionUtils(getVersionFromProperties:63)] found git version string=v1.1.0 in version.properties file
2019-07-18 13:40:29,219 INFO (main) [DelegatingClassLoader(scanUrlsAndAddPlugins:207)] Registered loader: PluginClassLoader{pluginLocation=file:/app/cloudera/kafka-connect/plugins/splunk-kafka-connect.jar}
2019-07-18 13:40:29,220 INFO (main) [DelegatingClassLoader(addPlugins:136)] Added plugin 'com.splunk.kafka.connect.SplunkSinkConnector'
2019-07-18 13:40:29,220 INFO (main) [DelegatingClassLoader(addPlugins:136)] Added plugin 'org.apache.kafka.connect.storage.StringConverter'
My next thought was to try something a bit simpler and just create an init.d script at /etc/init.d/kafka-connect. This works fine, because it is nothing more than a shell script wrapper, until I source the /etc/init.d/functions file, which causes it to be started via systemd again.
Two main questions:
1. What does systemctl do differently from a regular shell script that would cause this shell script (which in turn launches a Java process) to hang at that exact same step every time?
2. If my /etc/init.d/kafka-connect script works, is it okay to use on RHEL7? If so, is there a way to load that init.d script on boot of the server?
Thanks in advance!

SciRuby not plotting the bars

I don't see the expected output for my SciRuby code. I expected to see a bar plot,
but I am getting output like the one in the screenshot below.
Code:
require 'nyaplot'
include Nyaplot
plot = Plot.new
bar = plot.add(:bar, [:a, :b, :c], [3,4,5])
plot
Logs:
➜ Ruby iruby notebook
[TerminalIPythonApp] WARNING | Subcommand `ipython notebook` is deprecated and will be removed in future versions.
[TerminalIPythonApp] WARNING | You likely want to use `jupyter notebook`... continue in 5 sec. Press Ctrl-C to quit now.
[I 17:41:40.578 NotebookApp] Serving notebooks from local directory: /home/abhimanyuaryan/Public/RoR/Ruby
[I 17:41:40.578 NotebookApp] 0 active kernels
[I 17:41:40.578 NotebookApp] The Jupyter Notebook is running at: http://localhost:8888/
[I 17:41:40.578 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
Gtk-Message: Failed to load module "pantheon-filechooser-module"
(firefox:19417): Gtk-WARNING **: Unable to locate theme engine in module_path: "pixmap",
(firefox:19417): Gtk-WARNING **: Unable to locate theme engine in module_path: "pixmap",
(firefox:19417): Gtk-WARNING **: Unable to locate theme engine in module_path: "pixmap",
(firefox:19417): Gtk-WARNING **: Unable to locate theme engine in module_path: "pixmap",
(firefox:19417): Gtk-WARNING **: Unable to locate theme engine in module_path: "pixmap",
Gtk-Message: Failed to load module "canberra-gtk-module"
[I 17:41:55.360 NotebookApp] 302 GET / (127.0.0.1) 0.57ms
[I 17:42:00.302 NotebookApp] Creating new notebook in
[I 17:42:02.225 NotebookApp] Kernel started: b4ef1b18-a8ac-40ea-a833-fb520b24d52f
[I 17:44:02.269 NotebookApp] Saving file at /Untitled1.ipynb
[I 17:46:02.257 NotebookApp] Saving file at /Untitled1.ipynb
[I 17:48:02.257 NotebookApp] Saving file at /Untitled1.ipynb
Running these two commands fixed the issue for me:
$ gem install nyaplot
$ gem install gnuplotrb
Thanks to John Wood for the reply.