When I run jupyter (notebook or qtconsole versions) or ipython, I can see that anywhere between 4 and 22 threads are in use (as seen in top). This is just on startup, before doing anything. How do I limit the number of threads used?
I have tried setting:
export MKL_NUM_THREADS=1
export NUMEXPR_NUM_THREADS=1
export OMP_NUM_THREADS=1
but this only limits the number of threads used by numpy, and this seems to be a separate issue.
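As a quick sanity check, this is a small diagnostic I can run inside the kernel itself to confirm what it actually sees; it assumes psutil is installed, and the variable names are the ones exported above:
import os
import psutil
# Print the threading-related environment variables as the kernel sees them.
for var in ("MKL_NUM_THREADS", "NUMEXPR_NUM_THREADS", "OMP_NUM_THREADS"):
    print(var, "=", os.environ.get(var))
# Count the OS-level threads of this kernel process.
print("OS threads in this process:", psutil.Process().num_threads())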
I am working in Google Vertex AI, which has a two-disk system of a boot disk and a data disk, the latter of which is mounted to /home/jupyter. I am trying to expose python venv environments with kernelspec files, and then keep those environments exposed across repeated stop-start cycles. All of the default locations for kernelspec files are on the boot disk, which is ephemeral and recreated each time the VM is started (i.e., the exposed kernels vaporize each time the VM is stopped). Conceptually, I want to use a VM start-up script to add a persistent data disk path to the JUPYTER_PATH variable, since, according to the documentation, "Jupyter uses a search path to find installable data files, such as kernelspecs and notebook extensions." During interactive testing in the Terminal, I have not found this to be true. I have also tried setting the data directory variable, but it does not help.
export JUPYTER_PATH=/home/jupyter/envs
export JUPYTER_DATA_DIR=/home/jupyter/envs
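For what it is worth, here is a rough way to check the effective search path from inside Python (a sketch assuming the jupyter_core and jupyter_client APIs below); note that kernelspecs are expected under a kernels/ subdirectory of each data path, i.e. /home/jupyter/envs/kernels/<name>/kernel.json in my case:
# Run in the same shell session that exported the variables above.
from jupyter_core.paths import jupyter_path
from jupyter_client.kernelspec import KernelSpecManager
# Directories that will actually be searched for kernelspecs.
print(jupyter_path("kernels"))
# Mapping of kernel name -> kernelspec directory that Jupyter can currently see.
print(KernelSpecManager().find_kernel_specs())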
I have a beginner's understanding of jupyter and of the important ramifications of using two-disk systems. Could someone please help me understand:
(1) Why is Jupyter failing to search for kernelspec files on the JUPYTER_PATH or in the JUPYTER_DATA_DIR?
(2) If I am mistaken about how the search paths work, what is the best strategy for maintaining virtual environment exposure when Jupyter is installed on an ephemeral boot disk? (Note, I am aware of nb_conda_kernels, which I am specifically avoiding)
A related post focused on the start-up script can be found at this url. Here I am more interested in the general Jupyter + two-disk use case.
I use the latest version of VS Code (1.69.2) and connect remotely to my cloud VM. After one or two days, I found that the CPU usage is very high. The detail of the process is:
my-user 18954 17082 0 12:10 ? 00:00:04 /home/my-user/work/.vscode-server/bin/3b889b090b5ad5793f524b5d1d39fda662b96a2a/node /home/my-user/work/.vscode-server/bin/3b889b090b5ad5793f524b5d1d39fda662b96a2a/out/bootstrap-fork --type=extensionHost --transformURIs --useHostProxy=false
There are 8 node processes in total, and each one uses more than 50% CPU.
My questions are:
What are these processes?
Why is the CPU usage so high?
When I close all the connected windows and then reconnect to my remote VM, these processes are still there. Why don't they exit automatically?
Is this a bug in VS Code 1.69.2?
VS Code uses Node; I suppose these are the processes used for autocomplete that scan your files, but they never finish. I have to kill them manually, and one node process uses 100% of a CPU.
I solved the issue by removing the .vscode-server folder created in my home folder. You have to do this from a remote shell (not from the VS Code terminal).
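For example, from that remote shell the folder can be removed with a couple of lines of Python (a sketch; the path assumes the default location in the home directory, and the server files are re-downloaded on the next connection):
# Delete the cached VS Code server installation in the home directory.
import shutil
from pathlib import Path
shutil.rmtree(Path.home() / ".vscode-server", ignore_errors=True)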
I solved the issue by removing the Settings Sync plugin.
The bash file is available here:
https://github.com/bigbluebutton/bbb-install
I asked this question several days ago and finally got a definitive answer when I had some time earlier today to install Ubuntu in a VM. Basically, it would seem that WSL does not behave correctly when attempting to set the CPU scheduling of a program. On WSL, when I ran my code both with and without sudo, the output was the same: all threads printed seemingly at random. However, on my Ubuntu VM, running the same code without sudo had the same effect, while running it with sudo achieved the original goal of the code: it starved the other two threads, allowing only the thread with the highest affinity (in this case 40) to print.
I was hoping someone here knew the WSL very intimately and could help me achieve the intended behavior on it.
Is it possible to run a notebook server with kernels scheduled as processes on a remote cluster (SSH or PBS), with a common directory on NFS?
For example, I have three servers with GPUs and would like to run a notebook on one of them, but I would rather not start more than one notebook server. It would be ideal to have the notebook server on a fourth machine which would in some way schedule kernels, automatically or manually.
I did some trials with making a cluster with one engine. Using %%px in each cell is almost a solution, but one cannot use introspection, and the notebook code in fact becomes dependent on the cluster configuration, which is not very good.
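Roughly, my trial looked like the sketch below; it assumes ipyparallel with a single engine started on the GPU machine (e.g. with ipcluster start -n 1 --profile=gpu, where the profile name is just an example):
# Connect to the running cluster and enable the %px / %%px magics.
import ipyparallel as ipp
rc = ipp.Client(profile="gpu")   # example profile name
view = rc[:]                     # DirectView over the remote engine(s)
view.activate()                  # registers %px / %%px in this notebook
# A later notebook cell can then start with %%px so its body runs on the engine:
# %%px
# import socket
# print(socket.gethostname())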
This is not possible with the notebook at this time. The notebook cannot use a kernel that it did not start.
You could possibly write a new KernelManager that starts kernels remotely distributed across your machines and plug that into the notebook server, but you cannot attach an Engine or other existing kernel to the notebook server.
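A very rough sketch of that direction (not a working implementation): it assumes jupyter_client's IOLoopKernelManager and its _launch_kernel hook, a hypothetical host name, and a home directory shared over NFS so the connection file exists on both machines; it deliberately ignores the hard part, namely making the kernel's ZMQ ports reachable from the notebook server (for example via SSH tunnels).
# Launch each kernel on a remote host by wrapping the kernel command in ssh.
from jupyter_client.ioloop import IOLoopKernelManager

REMOTE_HOST = "gpu-node-1"  # hypothetical hostname

class SSHKernelManager(IOLoopKernelManager):
    def _launch_kernel(self, kernel_cmd, **kw):
        # Run the usual kernel command on the remote machine instead of locally.
        return super()._launch_kernel(["ssh", REMOTE_HOST] + kernel_cmd, **kw)

# Plug it into the notebook server, e.g. in jupyter_notebook_config.py:
# c.MultiKernelManager.kernel_manager_class = "mymodule.SSHKernelManager"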