Set slaves for jupyterq - jupyter

I am running q in Jupyter. To enable multiple slave threads, the q process needs to be started with the -s parameter.
How can I launch a kernel in jupyterq with slaves enabled?
I tried passing -s 20 into argv here in runkernel.py, but the kernel wouldn't start at all:
os.execvp('/bin/qlib/3.6.x86_64/q', ['/bin/qlib/3.6.x86_64/q', 'jupyterq_kernel.q', '-cds'] + argv)

You can do this as follows:
Find the location of your kernel.json file for qpk by running
$ jupyter kernelspec list
qpk /Users/anaconda3/share/jupyter/kernels/qpk
Open the kernel.json file and modify the following line as outlined to have the notebook initialise 20 slave threads on server startup:
"env": {"JUPYTERQ_SERVERARGS":"","MPLBACKEND":"Agg"}
Changed to
"env": {"JUPYTERQ_SERVERARGS":"-s 20","MPLBACKEND":"Agg"}
For reference, instructions for passing command-line arguments to the jupyterq server can be found here:
https://code.kx.com/v2/ml/jupyterq/notebooks/#server-command-line-arguments


How to minimize Eclipse PyDev Console output / tracing

I have multiple installations of Eclipse (2021-12) + PyDev (9.3.0.202203051235), all using IronPython (2.7) and all running on Windows 10. They all run the scripts as expected, but one installation produces much more verbose console output when debugging, almost as if a tracing option were enabled. I've tried reinstalling, deleting workspaces, deleting '.metadata' folders, etc. All the project settings seem identical.
Any ideas how to minimize the console output? Something in the registry?
Expected Console output:
pydev debugger: starting (pid: 15312)
Actual Console output:
1.99s - Using GEVENT_SUPPORT: False
0.00s - Using GEVENT_SHOW_PAUSED_GREENLETS: False
0.00s - pydevd __file__: C:\\Eclipse-2021-12-R\plugins\org.python.pydev.core_9.3.0.202203051235\pysrc\pydevd.py
0.11s - Initial arguments: (['C:\\Eclipse-2021-12-R\\plugins\\org.python.pydev.core_9.3.0.202203051235\\pysrc\\pydevd.py', '--multiprocess', '--protocol-http', '--print-in-debugger-startup', '--vm_type', 'python', '--client', '127.0.0.1', '--port', '60413', '--file', 'C:\\Test.py'],)
0.00s - Current pid: 8884
pydev debugger: starting (pid: 8884)
Those messages should only appear if you add an environment variable asking for them to be shown, i.e. something like:
PYDEVD_DEBUG=1
PYDEV_DEBUG=1
Maybe you have such an environment variable set in your launch configuration, your interpreter configuration or elsewhere on your system?
You may want to check the os.environ of the running program to see what's set there.
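A quick way to do that is to inspect os.environ from inside the running program; a minimal sketch using the variable names mentioned above:
import os

# Print any PyDev debug flags that happen to be set for this process.
for name in ("PYDEVD_DEBUG", "PYDEV_DEBUG"):
    print(name, "=", os.environ.get(name))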

Can run yarn command from one PowerShell window, but not another. What could possibly explain the difference?

SUCCESS
I start PowerShell from the File explorer: C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe. I change directory to where I want to run the yarn command. I run yarn build which starts the script rollup --config --environment NODE_ENV:production. The application builds successfully.
FAIL
I start PowerShell from the start menu: C:\Users\Me\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Windows PowerShell\Windows PowerShell.lnk. This shortcut targets the executable above: %SystemRoot%\system32\WindowsPowerShell\v1.0\powershell.exe. I change directory to where I want to run the yarn command. I run yarn build. The build errors.
src/renderer/renderer.tsx → app/build...
[!] Error: Unexpected token (Note that you need plugins to import files that are not JavaScript)
src\renderer\renderer.tsx (6:9)
4:
5: document.addEventListener("DOMContentLoaded", () =>
6: render(<App />, document.getElementById("root"))
^
7: );
Error: Unexpected token (Note that you need plugins to import files that are not JavaScript)
at error (C:\Users\Me\Dev\graphics\svgconverter\node_modules\rollup\dist\shared\rollup.js:217:30)
at Module.error (C:\Users\Me\Dev\graphics\svgconverter\node_modules\rollup\dist\shared\rollup.js:15145:16)
at tryParse (C:\Users\Me\Dev\graphics\svgconverter\node_modules\rollup\dist\shared\rollup.js:15034:23)
at Module.setSource (C:\Users\Me\Dev\graphics\svgconverter\node_modules\rollup\dist\shared\rollup.js:15436:30)
at ModuleLoader.addModuleSource (C:\Users\Me\Dev\graphics\svgconverter\node_modules\rollup\dist\shared\rollup.js:17434:20)
at ModuleLoader.fetchModule (C:\Users\Me\Dev\graphics\svgconverter\node_modules\rollup\dist\shared\rollup.js:17495:9)
at async Promise.all (index 0)
at async Promise.all (index 0)
error Command failed with exit code 1.
LOOKING FOR DIFFERENCES
I have checked the file properties of the shortcut. It does not open the target as administrator. The owner is EUR/{Me}. System, Me and Administrators all have every permission checked except for special permissions. I think this should not lead to any difference in behavior compared to starting PowerShell without the shortcut?
I have printed the environment variables in both shells using dir env: | Format-Table -Wrap and they are identical.
So now I am scratching my head. What could possibly explain the difference?

Jupyter ImportError: No module named py4j.protocol despite py4j is installed

I have read some posts about the error I am seeing when importing pyspark; some suggest installing py4j, which I have already done, yet I am still seeing the error.
I am using a conda environment; here are the steps:
1. create a yml file and include the needed packages (including the py4j)
2. create a env based on the yml
3. create a kernel pointing to the env
4. start the kernel in Jupyter
5. running `import pyspark` throws error: ImportError: No module named py4j.protocol
The issue was resolved by adding an env section to kernel.json and explicitly specifying the following variables:
"env": {
"HADOOP_CONF_DIR": "/etc/spark2/conf/yarn-conf",
"PYSPARK_PYTHON":"/opt/cloudera/parcels/Anaconda/bin/python",
"SPARK_HOME": "/opt/cloudera/parcels/SPARK2",
"PYTHONPATH": "/opt/cloudera/parcels/SPARK2/lib/spark2/python/lib/py4j-0.10.7-src.zip:/opt/cloudera/parcels/SPARK2/lib/spark2/python/",
"PYTHONSTARTUP": "/opt/cloudera/parcels/SPARK2/lib/spark2/python/pyspark/shell.py",
"PYSPARK_SUBMIT_ARGS": " --master yarn --deploy-mode client pyspark-shell"
}
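To confirm that the kernel actually picked up these variables, a small check from a notebook cell might look like this (the paths mirror the example above and are site-specific):
import os
import sys

# These should echo the values from the kernel.json env block.
print(os.environ.get("SPARK_HOME"))
print(os.environ.get("PYTHONPATH"))

# The py4j zip listed in PYTHONPATH should now be on sys.path,
# so importing pyspark should succeed.
print([p for p in sys.path if "py4j" in p])
import pyspark
print(pyspark.__version__)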

setting max_old_space_size parameter in SailsJS

I am using Sails and frequently get the error FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory. The suggestion on Stack Overflow is to use something like the following: node --max_old_space_size=2000 server.js.
How do I set this in Sails?
To set the Node.js heap size when starting your Sails app, use:
# node --max_old_space_size=4096 app.js
As mentioned in the comments, the same is true when starting your app using forever or pm2 libraries.
For pm2:
# pm2 start app.js --node-args="--max-old-space-size=4096"
For forever:
# forever start -c "node --max_old_space_size=512" -o out.log -e error.log app.js
For the development build you could change the "lift" command in package.json
From this line:
"lift": "sails lift"
To the example below:
"lift": "node --max_old_space_size=8000 ./node_modules/.bin/sails lift",
And then run "npm run lift" as usual

How to import modules in IPython Clusters

I am trying to import some of my personal modules into my IPython Clusters. I am using Anaconda on Windows Vista 64-bit.
from IPython.parallel import Client
rc = Client()
dview = rc[:]
with dview.sync_imports():
    import lib.rf
It is giving me this error:
No module named 'lib.rf'
I can import the module in the rest of my IPython notebook, as I have this .bat file to start ipython notebook:
cd C:\Users\Jon\workspace\bf
set PYTHONPATH=%PYTHONPATH%;C:\Users\Jon\workspace\bf
C:\Anaconda\envs\p33\scripts\ipython notebook
I am using similar code to start my IPython clusters:
cd C:\Users\Jon\workspace\bf
set PYTHONPATH=%PYTHONPATH%;C:\Users\Jon\workspace\bf
C:\Anaconda\envs\p33\Scripts\ipcluster start --n=7
Why is this not working?
More info:
If I print out sys.path, I get a list that contains C:\Users\Jon\workspace\bf
If I print out the paths of my clusters, I get the same list:
%px sys.path
['',
'',
'',
'C:\\Anaconda\\envs\\p33\\lib\\site-packages\\distribute-0.6.28-py3.3.egg',
'C:\\Anaconda\\envs\\p33\\lib\\site-packages\\pykalman-0.9.5-py3.3.egg',
'C:\\Anaconda\\envs\\p33\\lib\\site-packages\\patsy-0.2.1-py3.3.egg',
'C:\\Anaconda\\envs\\p33\\lib\\site-packages\\joblib-0.8.3_r1-py3.3.egg',
'C:\\Users\\Jon\\workspace\\bf',
'C:\\Users\\Jon\\workspace\\bf\\my_numba',
'C:\\Anaconda\\envs\\p33\\python33.zip',
'C:\\Anaconda\\envs\\p33\\DLLs',
'C:\\Anaconda\\envs\\p33\\lib',
'C:\\Anaconda\\envs\\p33',
'C:\\Anaconda\\envs\\p33\\lib\\site-packages',
'C:\\Anaconda\\envs\\p33\\lib\\site-packages\\Sphinx-1.2.3-py3.3.egg',
'C:\\Anaconda\\envs\\p33\\lib\\site-packages\\win32',
'C:\\Anaconda\\envs\\p33\\lib\\site-packages\\win32\\lib',
'C:\\Anaconda\\envs\\p33\\lib\\site-packages\\Pythonwin',
'C:\\Anaconda\\envs\\p33\\lib\\site-packages\\runipy-0.1.1-py3.3.egg',
'C:\\Anaconda\\envs\\p33\\lib\\site-packages\\setuptools-7.0-py3.3.egg',
'C:\\Anaconda\\envs\\p33\\lib\\site-packages\\IPython\\extensions']
Further analysis:
%px lib.__path__
Out[0:11]: _NamespacePath(['C:\\Anaconda\\envs\\p33\\lib\\site-packages\\win32\\lib'])
lib.__path__
Out[57]: ['.\\lib']
Looks like the ipcluster and notebook are looking at lib in different places. I have tried renaming lib to mylib. It has not helped.
It seems that with dview.sync_imports() is being run somewhere other than your IPython Notebook environment and is therefore relying on a different PYTHONPATH. It is definitely not being run on one of the cluster engines, so you wouldn't expect it to pick up the PYTHONPATH you set for the cluster.
I'm thinking you'll need to have that directory in your PYTHONPATH (not your PATH) for the calling Python environment, because that is the location from which you are importing the modules.
The impact of the bit about setting the PYTHONPATH in the DOS shell from which you invoke ipcluster isn't clear to me. I can see that one might expect this to let the engines know about your directory, but I'm wondering if that PYTHONPATH gets initialized to the environment from which you call IPython.parallel.Client.
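As a workaround, you could also add the directory to sys.path on each engine yourself before importing; a minimal sketch, assuming the engines can see C:\Users\Jon\workspace\bf:
from IPython.parallel import Client

rc = Client()
dview = rc[:]

# Make the project directory importable on every engine, then run the
# import both locally and on the engines in one step.
dview.execute(r"import sys; sys.path.insert(0, r'C:\Users\Jon\workspace\bf')")
with dview.sync_imports():
    import lib.rf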