Opening the JupyterHub URL on Kubernetes is very slow (fetching JS, icons) - kubernetes

I deployed JupyterHub on Kubernetes using Helm, and I can log in with the ID 'admin'.
However, the first time I log in, the URL either doesn't respond or responds only after 30~50 seconds. It seems to fail to fetch the JavaScript files or icons.
When I refresh the page, it works.
Is there a problem with the network in my Kubernetes cluster?
I'm using a GlusterFS StorageClass for dynamic provisioning.
This is the config file I used when installing JupyterHub with Helm:
proxy:
  secretToken: "34999170ac41826f956ee1a757b53ff91ce6efabc3dfe24fcee863955efcc6b9"
The pod's log looks like this (for user qqqqq):
[I 2020-12-23 05:22:21.664 SingleUserNotebookApp extension:158] JupyterLab extension loaded from /opt/conda/lib/python3.7/site-packages/jupyterlab
[I 2020-12-23 05:22:21.665 SingleUserNotebookApp extension:159] JupyterLab application directory is /opt/conda/share/jupyter/lab
[I 2020-12-23 05:22:22.015 SingleUserNotebookApp singleuser:561] Starting jupyterhub-singleuser server version 1.1.0
[I 2020-12-23 05:22:22.022 SingleUserNotebookApp notebookapp:1924] Serving notebooks from local directory: /home/jovyan
[I 2020-12-23 05:22:22.022 SingleUserNotebookApp notebookapp:1924] The Jupyter Notebook is running at:
[I 2020-12-23 05:22:22.022 SingleUserNotebookApp notebookapp:1924] http://jupyter-qqqqq:8888/user/qqqqq/
[I 2020-12-23 05:22:22.022 SingleUserNotebookApp notebookapp:1925] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[I 2020-12-23 05:22:22.038 SingleUserNotebookApp singleuser:542] Updating Hub with activity every 300 seconds
[I 2020-12-23 05:22:25.096 SingleUserNotebookApp log:174] 302 GET /user/qqqqq/ -> /user/qqqqq/tree? (#10.233.79.154) 0.93ms
[I 2020-12-23 05:22:25.165 SingleUserNotebookApp log:174] 302 GET /user/qqqqq/ -> /user/qqqqq/tree? (#10.233.93.0) 0.76ms
[I 2020-12-23 05:22:25.185 SingleUserNotebookApp log:174] 302 GET /user/qqqqq/tree? -> /hub/api/oauth2/authorize?client_id=jupyterhub-user-qqqqq&redirect_uri=%2Fuser%2Fqqqqq%2Foauth_callback&response_type=code&state=[secret] (#10.233.93.0) 2.31ms
[I 2020-12-23 05:22:25.561 SingleUserNotebookApp auth:981] Logged-in user {'kind': 'user', 'name': 'qqqqq', 'admin': False, 'groups': [], 'server': '/user/qqqqq/', 'pending': None, 'created': '2020-12-23T05:22:16.257525Z', 'last_activity': '2020-12-23T05:22:25.524384Z', 'servers': None}
[I 2020-12-23 05:22:25.562 SingleUserNotebookApp log:174] 302 GET /user/qqqqq/oauth_callback?code=[secret]&state=[secret] -> /user/qqqqq/tree? (#10.233.93.0) 250.52ms
[I 2020-12-23 05:22:25.654 SingleUserNotebookApp log:174] 200 GET /user/qqqqq/tree? (qqqqq#10.233.93.0) 71.92ms
The request gets stuck at GET /user/qqqqq/tree?.
Thanks for any advice!

Try looking at the events for the hub pod and for the user pods. The time is probably being spent allocating the PVC, pulling the image (only the first time, if it is not already present on the node), or setting up the route in JupyterHub. Also check whether you've added any post-start hook in the user image config.
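To narrow down where the time goes, one option is to measure the pauses between consecutive timestamps in the pod log. This is a minimal sketch, assuming the `[I 2020-12-23 05:22:21.664 ...]` prefix shown in the question; the function name `largest_gaps` is hypothetical and not part of JupyterHub.

```python
import re
from datetime import datetime

# Matches the "[I 2020-12-23 05:22:21.664 " prefix used in the log above.
_STAMP = re.compile(r"^\[[A-Z] (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) ")


def largest_gaps(lines, top=3):
    """Return the `top` largest pauses as (seconds, line_before, line_after)."""
    stamped = []
    for line in lines:
        match = _STAMP.match(line)
        if match:  # lines without a timestamp prefix are skipped
            when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
            stamped.append((when, line.rstrip()))
    gaps = []
    for (t0, before), (t1, after) in zip(stamped, stamped[1:]):
        gaps.append(((t1 - t0).total_seconds(), before, after))
    gaps.sort(key=lambda gap: gap[0], reverse=True)
    return gaps[:top]
```

Passing the lines of a saved pod log to `largest_gaps` lists the biggest pauses first. If no large gap appears in the single-user log, the delay is more likely outside that pod (proxy, hub, or network), which is where the pod events would be the next thing to check.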

I don't know why, but after I upgraded Kubernetes from 1.16 to 1.17, it works fine.

Related

Jupyter error when starting Kernel. Cell not executing just shows *

I use a custom kernel manager that launches kernels as Docker containers.
I recently updated my system, but I can't identify which Jupyter-related version change might be causing this:
Tornado 6.2
JupyterLab Version 3.2.9
Custom Kernels
When starting a new Kernel I get:
[I 2022-07-28 10:17:20.946 ServerApp] Kernel started: b2039b53-2ce0-48f1-b327-2b0393a53ff5
[W 2022-07-28 10:17:20.952 ServerApp] No session ID specified
NOTE: When using the `ipython kernel` entry point, Ctrl-C will not work.
To exit, you will have to explicitly quit this process, by either sending
"quit" from a client, or using Ctrl-\ in UNIX-like environments.
To read more about this, see https://github.com/ipython/ipython/issues/2049
To connect another client to this kernel, use:
--existing /connection-spec
[W 2022-07-28 10:18:20.954 ServerApp] Timeout waiting for kernel_info reply from b2039b53-2ce0-48f1-b327-2b0393a53ff5
[W 2022-07-28 10:18:25.473 ServerApp] Nudge: attempt 10 on kernel b2039b53-2ce0-48f1-b327-2b0393a53ff5
[W 2022-07-28 10:18:30.486 ServerApp] Nudge: attempt 20 on kernel b2039b53-2ce0-48f1-b327-2b0393a53ff5
[W 2022-07-28 10:18:35.499 ServerApp] Nudge: attempt 30 on kernel b2039b53-2ce0-48f1-b327-2b0393a53ff5
[W 2022-07-28 10:18:40.517 ServerApp] Nudge: attempt 40 on kernel b2039b53-2ce0-48f1-b327-2b0393a53ff5
[W 2022-07-28 10:18:45.533 ServerApp] Nudge: attempt 50 on kernel b2039b53-2ce0-48f1-b327-2b0393a53ff5
[W 2022-07-28 10:18:50.548 ServerApp] Nudge: attempt 60 on kernel b2039b53-2ce0-48f1-b327-2b0393a53ff5
[W 2022-07-28 10:18:55.562 ServerApp] Nudge: attempt 70 on kernel b2039b53-2ce0-48f1-b327-2b0393a53ff5
[W 2022-07-28 10:19:00.576 ServerApp] Nudge: attempt 80 on kernel b2039b53-2ce0-48f1-b327-2b0393a53ff5
[W 2022-07-28 10:19:05.592 ServerApp] Nudge: attempt 90 on kernel b2039b53-2ce0-48f1-b327-2b0393a53ff5
[W 2022-07-28 10:19:10.607 ServerApp] Nudge: attempt 100 on kernel b2039b53-2ce0-48f1-b327-2b0393a53ff5
[W 2022-07-28 10:19:15.624 ServerApp] Nudge: attempt 110 on kernel b2039b53-2ce0-48f1-b327-2b0393a53ff5
[W 2022-07-28 10:19:20.637 ServerApp] Nudge: attempt 120 on kernel b2039b53-2ce0-48f1-b327-2b0393a53ff5
[E 2022-07-28 10:19:20.960 ServerApp] Uncaught exception GET /api/kernels/b2039b53-2ce0-48f1-b327-2b0393a53ff5/channels (127.0.0.1)
HTTPServerRequest(protocol='http', host='127.0.0.1:8080', method='GET', uri='/api/kernels/b2039b53-2ce0-48f1-b327-2b0393a53ff5/channels', version='HTTP/1.1', remote_ip='127.0.0.1')
Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/tornado/websocket.py", line 944, in _accept_connection
await open_result
File "/opt/conda/lib/python3.7/asyncio/tasks.py", line 318, in __wakeup
future.result()
concurrent.futures._base.TimeoutError: Timeout
In the working case, the WebSocket opens successfully and I see:
[I 2022-07-28 10:23:49.698 ServerApp] Kernel started: 84d2de07-7ed5-46ba-8f85-c2bb583be71d
[W 2022-07-28 10:23:49.704 ServerApp] No session ID specified
NOTE: When using the `ipython kernel` entry point, Ctrl-C will not work.
To exit, you will have to explicitly quit this process, by either sending
"quit" from a client, or using Ctrl-\ in UNIX-like environments.
To read more about this, see https://github.com/ipython/ipython/issues/2049
To connect another client to this kernel, use:
--existing /connection-spec
[W 2022-07-28 10:23:53.058 ServerApp] No channel specified, assuming shell: {'header': {'msg_id': '6101c9180e5f11ed969a42010a800005', 'username': 'test', 'session': '6101cc4c0e5f11ed969a42010a800005', 'data': '2022-07-28T10:23:53.054348', 'msg_type': 'execute_request', 'version': '5.0'}, 'parent_header': {'msg_id': '6101c9180e5f11ed969a42010a800005', 'username': 'test', 'session': '6101cc4c0e5f11ed969a42010a800005', 'data': '2022-07-28T10:23:53.054348', 'msg_type': 'execute_request', 'version': '5.0'}, 'metadata': {}, 'content': {'code': 'print(123)', 'silent': False}}
Any suggestions?
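One thing that may be worth ruling out (an assumption, not something the logs above confirm) is that the server cannot reach the kernel's ports inside the Docker container, since a kernel_info timeout is what you would see in that case. A minimal sketch that tries a TCP connection to each port listed in a kernel connection file; it assumes the `tcp` transport, and `check_kernel_ports` is a hypothetical helper name:

```python
import json
import socket

# Port keys defined in a Jupyter kernel connection file.
_PORT_KEYS = ("shell_port", "iopub_port", "stdin_port", "control_port", "hb_port")


def check_kernel_ports(connection_file, timeout=2.0):
    """Map each port key in the connection file to True (reachable) or False."""
    with open(connection_file) as handle:
        info = json.load(handle)
    results = {}
    for name in _PORT_KEYS:
        port = info.get(name)
        if port is None:
            continue  # key not present in this connection file
        try:
            with socket.create_connection((info["ip"], port), timeout=timeout):
                results[name] = True
        except OSError:
            results[name] = False
    return results
```

Run from the machine where the Jupyter server runs, `check_kernel_ports("/connection-spec")` would use the path printed in the log above. A successful connect only shows that something is listening on the port; it does not prove the kernel is answering on it.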

Use notebook with slurm

I want to be able to connect to our institution's cluster using VS Code Remote SSH with the server running on a compute node instead of the login node. The preferred workflow is to SSH into the login node, use a command to allocate a job and spin up an interactive shell on a compute node, and then run the Jupyter notebook kernel on that node.
I launch JupyterLab through this bash script:
#!/bin/bash
#SBATCH -J jupyterTest
#SBATCH -N 1
#SBATCH --mem=16GB
#SBATCH --time=5:00:00
#SBATCH --output=/home/adufour/work/jupyter.log
# Load necessary modules
module purge
source /home/adufour/.bashrc
conda activate singlecell
# Go to the folder you want to run Jupyter in
cd ~/work
# Start the notebook
jupyter lab --ip=0.0.0.0 --port=8888
Which gives me the following log:
[I 2021-02-16 16:14:29.363 ServerApp] jupyterlab | extension was successfully linked.
[I 2021-02-16 16:14:31.551 ServerApp] nbclassic | extension was successfully linked.
[I 2021-02-16 16:14:31.690 LabApp] JupyterLab extension loaded from /work/adufour/anaconda3/envs/singlecell/lib/python3.9/site-packages/jupyterlab
[I 2021-02-16 16:14:31.691 LabApp] JupyterLab application directory is /work/adufour/anaconda3/envs/singlecell/share/jupyter/lab
[I 2021-02-16 16:14:31.698 ServerApp] jupyterlab | extension was successfully loaded.
[I 2021-02-16 16:14:31.714 ServerApp] nbclassic | extension was successfully loaded.
[I 2021-02-16 16:14:31.719 ServerApp] Serving notebooks from local directory: /work/adufour
[I 2021-02-16 16:14:31.719 ServerApp] Jupyter Server 1.3.0 is running at:
[I 2021-02-16 16:14:31.719 ServerApp] http://node126:8888/lab?token=6a1178ff82901e42559423100b13ba12105935432fb0efdd
[I 2021-02-16 16:14:31.719 ServerApp] or http://127.0.0.1:8888/lab?token=6a1178ff82901e42559423100b13ba12105935432fb0efdd
[I 2021-02-16 16:14:31.719 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 2021-02-16 16:14:31.744 ServerApp]
To access the server, open this file in a browser:
file:///home/adufour/.local/share/jupyter/runtime/jpserver-108627-open.html
Or copy and paste one of these URLs:
http://node126:8888/lab?token=6a1178ff82901e42559423100b13ba12105935432fb0efdd
or http://127.0.0.1:8888/lab?token=6a1178ff82901e42559423100b13ba12105935432fb0efdd
Unable to connect to VS Code server.
Error in request
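A common way to reach a Jupyter server that runs on a compute node is an SSH tunnel through the login node. The sketch below only builds the `ssh -L` command and the matching local URL from the log text above; whether this resolves the VS Code error is not confirmed here. The login host `user@login.example.org` and the function name `tunnel_command` are hypothetical placeholders.

```python
import re
import shlex

# Matches the "http://node126:8888/lab?token=..." lines printed by Jupyter Server.
_URL = re.compile(r"http://(?P<host>[^:/\s]+):(?P<port>\d+)/lab\?token=(?P<token>\w+)")


def tunnel_command(log_text, login_host, local_port=8888):
    """Return (ssh command, local URL) for the first non-loopback URL in the log."""
    for match in _URL.finditer(log_text):
        host = match.group("host")
        if host == "127.0.0.1":
            continue  # the loopback URL is only valid on the compute node itself
        forward = f"{local_port}:{host}:{match.group('port')}"
        command = shlex.join(["ssh", "-N", "-L", forward, login_host])
        local_url = f"http://127.0.0.1:{local_port}/lab?token={match.group('token')}"
        return command, local_url
    raise ValueError("no compute-node URL found in the log")
```

With the log above and the placeholder login host, this produces `ssh -N -L 8888:node126:8888 user@login.example.org`. Once that tunnel is running on the workstation, the returned local URL can be used as an existing Jupyter server address.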

PySpark kernel doesn't start on JupyterHub

The JupyterHub PySpark kernel used to work very well, but now it either will not start or shows the status "Kernel connected" (it never goes to idle) and will not run any code in cells. It is using the LocalProcessSpawner with PAM authentication on CentOS. In the JupyterHub logs we see these messages:
[I 2020-03-10 15:13:08.644 SingleUserNotebookApp restarter:110] KernelRestarter: restarting kernel (4/5), new random ports
Assertion failed: rc == 0 (src/socket_poller.cpp:41)
[W 2020-03-10 15:13:11.666 SingleUserNotebookApp restarter:100] KernelRestarter: restart failed
[W 2020-03-10 15:13:11.666 SingleUserNotebookApp kernelmanager:127] Kernel 0148c77d-143a-4721-90e7-0d9a41a878c4 died, removing from map.
Any thoughts?
This was resolved by reinstalling sparkmagic in the base conda environment:
(base) [root#]# pip install sparkmagic
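Before or after reinstalling, it can help to confirm that the interpreter behind the kernel actually sees the package. A minimal sketch using only the standard library; `is_installed` is a hypothetical helper name, and it is meant for top-level module names such as `sparkmagic`:

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if the running interpreter can locate `module_name`."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Shows which Python is being checked, then whether it can find sparkmagic.
    print("interpreter:", sys.executable)
    print("sparkmagic installed:", is_installed("sparkmagic"))
```

Running this with the base environment's Python checks the same environment that `pip install sparkmagic` targets in the command above. If the interpreter path printed is not the one the kernel uses, the package may have been installed into a different environment.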

SciRuby not plotting the bars

I don't see any plot output for my SciRuby code. I expected to see a bar chart,
but I am getting output like the one in the screenshot below.
Code:
require 'nyaplot'
include Nyaplot
plot = Plot.new
bar = plot.add(:bar, [:a, :b, :c], [3,4,5])
plot
Logs:
➜ Ruby iruby notebook
[TerminalIPythonApp] WARNING | Subcommand `ipython notebook` is deprecated and will be removed in future versions.
[TerminalIPythonApp] WARNING | You likely want to use `jupyter notebook`... continue in 5 sec. Press Ctrl-C to quit now.
[I 17:41:40.578 NotebookApp] Serving notebooks from local directory: /home/abhimanyuaryan/Public/RoR/Ruby
[I 17:41:40.578 NotebookApp] 0 active kernels
[I 17:41:40.578 NotebookApp] The Jupyter Notebook is running at: http://localhost:8888/
[I 17:41:40.578 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
Gtk-Message: Failed to load module "pantheon-filechooser-module"
(firefox:19417): Gtk-WARNING **: Unable to locate theme engine in module_path: "pixmap",
(firefox:19417): Gtk-WARNING **: Unable to locate theme engine in module_path: "pixmap",
(firefox:19417): Gtk-WARNING **: Unable to locate theme engine in module_path: "pixmap",
(firefox:19417): Gtk-WARNING **: Unable to locate theme engine in module_path: "pixmap",
(firefox:19417): Gtk-WARNING **: Unable to locate theme engine in module_path: "pixmap",
Gtk-Message: Failed to load module "canberra-gtk-module"
[I 17:41:55.360 NotebookApp] 302 GET / (127.0.0.1) 0.57ms
[I 17:42:00.302 NotebookApp] Creating new notebook in
[I 17:42:02.225 NotebookApp] Kernel started: b4ef1b18-a8ac-40ea-a833-fb520b24d52f
[I 17:44:02.269 NotebookApp] Saving file at /Untitled1.ipynb
[I 17:46:02.257 NotebookApp] Saving file at /Untitled1.ipynb
[I 17:48:02.257 NotebookApp] Saving file at /Untitled1.ipynb
Installing these gems fixed the issue for me:
$ gem install nyaplot
$ gem install gnuplotrb
Thanks to John Wood for the reply.

Code deploy reports: "Deployment Failed: No hosts succeeded", while deploying from S3 .zip revision to EC2 instance

I'm trying to set up an automated CI workflow from Bitbucket to an AWS EC2 instance, using Jenkins hosted on a separate EC2 instance.
I created and configured everything needed (IAM roles, AWS client and CodeDeploy agent) as the following article describes:
https://pranavpshah.wordpress.com/configure-aws-codedeploy/
By the way, all the instances are based on Ubuntu and running inside a private VPC, and I'm deploying a Node.js application.
So far, I can successfully create a .zip build in the S3 bucket every time I push to the Bitbucket repo. But in the CodeDeploy dashboard, I get the error message "Deployment Failed No hosts succeeded."
The status stays "In progress" for more than 5 minutes every time I start the process.
When the deployment finished with the status "Failed", I checked the /var/log/aws/codedeploy-agent/codedeploy-agent.log file, and here is what I got:
2015-12-04 17:17:36 INFO [codedeploy-agent(28199)]: Stopping master 27971
2015-12-04 17:17:36 INFO [codedeploy-agent(27971)]: master 27971: Received TERM - stopping children and shutting down
2015-12-04 17:17:36 INFO [codedeploy-agent(27975)]: InstanceAgent::Plugins::CodeDeployPlugin::CommandPoller of master 27971: Received TERM - setting internal shutting down flag and possibly finishing last run
2015-12-04 17:17:55 INFO [codedeploy-agent(27975)]: [Aws::CodeDeployCommand::Client 200 60.113784 0 retries] poll_host_command(host_identifier:"arn:aws:ec2:us-west-2:219450671821:instance/i-348913ed")
2015-12-04 17:17:56 INFO [codedeploy-agent(27975)]: InstanceAgent::Plugins::CodeDeployPlugin::CommandPoller of master 27971: shutting down
2015-12-04 17:17:57 INFO [codedeploy-agent(28219)]: master 28219: Spawned child 1/1
2015-12-04 17:17:57 DEBUG [codedeploy-agent(28223)]: Registering Plugins: ["codedeploy"].
2015-12-04 17:17:57 DEBUG [codedeploy-agent(28223)]: Loading plugin codedeploy from /opt/codedeploy-agent/lib/instance_agent/plugins/codedeploy/register_plugin
2015-12-04 17:17:57 DEBUG [codedeploy-agent(28223)]: Registered Plugins: #<Set: {InstanceAgent::Plugins::CodeDeployPlugin::CommandPoller}>.
2015-12-04 17:17:57 INFO [codedeploy-agent(28223)]: On Premises config file does not exist or not readable
2015-12-04 17:17:57 DEBUG [codedeploy-agent(28223)]: InstanceAgent::Plugins::CodeDeployPlugin::CommandPoller: Configuring deploy control client: Region = "us-west-2"
2015-12-04 17:17:57 DEBUG [codedeploy-agent(28223)]: InstanceAgent::Plugins::CodeDeployPlugin::CommandPoller: Deploy control endpoint override = nil
2015-12-04 17:17:57 DEBUG [codedeploy-agent(28223)]: InstanceAgent::Plugins::CodeDeployPlugin::CommandPoller: Initializing Host Agent: Host Identifier = arn:aws:ec2:us-west-2:219450671821:instance/i-348913ed
2015-12-04 17:17:57 DEBUG [codedeploy-agent(28223)]: InstanceAgent::Plugins::CodeDeployPlugin::CommandPoller: Validating CodeDeploy Plugin Configuration
2015-12-04 17:17:57 DEBUG [codedeploy-agent(28223)]: InstanceAgent::Plugins::CodeDeployPlugin::CodeDeployControlCertVerifier: Actual certificate subject is '/C=US/ST=Washington/L=Seattle/O=Amazon.com, Inc./CN=codedeploy-commands.us-west-2.amazonaws.com'
2015-12-04 17:17:57 DEBUG [codedeploy-agent(28223)]: InstanceAgent::Plugins::CodeDeployPlugin::CodeDeployControlCertVerifier: Actual certificate subject is '/C=US/ST=Washington/L=Seattle/O=Amazon.com, Inc./CN=codedeploy-commands.us-west-2.amazonaws.com'
2015-12-04 17:17:57 DEBUG [codedeploy-agent(28223)]: InstanceAgent::Plugins::CodeDeployPlugin::CodeDeployControlCertVerifier: Actual certificate subject is '/C=US/ST=Washington/L=Seattle/O=Amazon.com, Inc./CN=codedeploy-commands.us-west-2.amazonaws.com'
2015-12-04 17:17:57 DEBUG [codedeploy-agent(28223)]: InstanceAgent::Plugins::CodeDeployPlugin::CommandPoller: CodeDeploy Plugin Configuration is valid
2015-12-04 17:17:57 DEBUG [codedeploy-agent(28223)]: InstanceAgent::Plugins::CodeDeployPlugin::CommandPoller: Calling PollHostCommand:
2015-12-04 17:17:58 INFO [codedeploy-agent(28219)]: Started master 28219 with 1 children
2015-12-04 17:18:58 INFO [codedeploy-agent(28223)]: [Aws::CodeDeployCommand::Client 200 60.534255 0 retries] poll_host_command(host_identifier:"arn:aws:ec2:us-west-2:219450671821:instance/i-348913ed")
Am I missing something in the configuration?
Any help please?
I just checked the detailed information for deployment ID "d-FEPBDKJMC". It seems the instance ID is "arn:aws:ec2:us-west-2:219450671821:instance/i-d796060e" rather than "arn:aws:ec2:us-west-2:219450671821:instance/i-348913ed", which is the one in the host agent log that you pasted. So you should probably check the log on the right instance.
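A quick way to confirm that an agent log belongs to the instance a deployment targeted is to extract the instance IDs from the log and compare. A minimal sketch, assuming the `host_identifier` ARN format seen in the log above; both function names are hypothetical:

```python
import re

# Matches ARNs such as arn:aws:ec2:us-west-2:219450671821:instance/i-348913ed
_INSTANCE = re.compile(r"arn:aws:ec2:[\w-]+:\d+:instance/(i-[0-9a-f]+)")


def instance_ids(log_text):
    """Return the set of EC2 instance IDs mentioned in a codedeploy-agent log."""
    return set(_INSTANCE.findall(log_text))


def log_matches_deployment(log_text, deployed_instance_id):
    """True if the log mentions the instance the deployment targeted."""
    return deployed_instance_id in instance_ids(log_text)
```

For the pasted log, `instance_ids` yields only `i-348913ed`, so `log_matches_deployment(log_text, "i-d796060e")` is false, which matches the mismatch described above.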
All the lifecycle events are skipped for the deployment, and I suspect the host agent is not pulling commands at all. Since you mentioned that the instance is in a VPC, please make sure the CodeDeploy and S3 endpoints are whitelisted (the agent needs to connect to these endpoints to do deployments). The CodeDeploy FAQ also covers working with a VPC; see its Security section: https://aws.amazon.com/codedeploy/faqs/.
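To test outbound access from the instance, a small script can attempt a TCP connection on port 443 to each endpoint. This is a sketch under assumptions: the command endpoint name is taken from the certificate subject in the agent log, the regional S3 hostname is the standard `s3.<region>.amazonaws.com` form, and both function names are hypothetical. Your deployment may need other hostnames as well.

```python
import socket


def deployment_endpoints(region):
    """Hostnames to test: the command endpoint seen in the log, plus regional S3."""
    return [
        f"codedeploy-commands.{region}.amazonaws.com",
        f"s3.{region}.amazonaws.com",
    ]


def can_connect(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers DNS failures, refusals and timeouts
        return False
```

On the target instance, looping over `deployment_endpoints("us-west-2")` and calling `can_connect(host)` for each shows whether the VPC's routing and security groups allow the outbound connections. A successful TCP connect does not verify IAM permissions, only network reachability.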
Are you sure the instance has a role that allows access to S3? I missed that step and did not have the role attached to my instance.
See http://docs.aws.amazon.com/codedeploy/latest/userguide/how-to-create-iam-instance-profile.html#getting-started-create-ec2-role-console