How can I open an IPython notebook without the output?

I have an IPython notebook where I've accidentally dumped a huge output (15 MB) that crashed the notebook. Now when I open the notebook and attempt to delete the troublesome cell, the notebook crashes again—thus preventing me from fixing the problem and restoring the notebook to stability.
The best fix I can think of is manually pasting the input cells to a new notebook, but is there a way to just open the notebook without any outputs?

You can use the CLI to clear the outputs:
jupyter nbconvert --ClearOutputPreprocessor.enabled=True --inplace Notebook.ipynb

There is this nice snippet (that I use as a git commit hook) to strip the output of an ipython notebook:
#!/usr/bin/env python
def strip_output(nb):
    for ws in nb.worksheets:
        for cell in ws.cells:
            if hasattr(cell, "outputs"):
                cell.outputs = []
            if hasattr(cell, "prompt_number"):
                del cell["prompt_number"]

if __name__ == "__main__":
    from sys import stdin, stdout
    from IPython.nbformat.current import read, write

    nb = read(stdin, "ipynb")
    strip_output(nb)
    write(nb, stdout, "ipynb")
    stdout.write("\n")
You can easily make it a bit nicer to use; currently you'd have to call it as
strip_output.py < my_notebook.ipynb > my_notebook_stripped.ipynb
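Since the snippet is meant as a git hook, here is a minimal sketch of registering it as a git clean filter instead, so outputs are stripped as notebooks are staged (this assumes the script is saved as strip_output.py, made executable, and is on your PATH):
git config filter.stripoutput.clean strip_output.py
echo '*.ipynb filter=stripoutput' >> .gitattributes
The filter runs at staging time, so your working copies keep their outputs while the committed notebooks stay clean.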

If you are running Jupyter 4.x, you will get some API deprecation warnings when running filmor's script. Although the script still works, I updated it a bit to remove the warnings.
#!/usr/bin/env python
def strip_output(nb):
    for cell in nb.cells:
        if hasattr(cell, "outputs"):
            cell.outputs = []
        if hasattr(cell, "prompt_number"):
            del cell["prompt_number"]

if __name__ == "__main__":
    from sys import stdin, stdout
    from nbformat import read, write

    nb = read(stdin, 4)
    strip_output(nb)
    write(nb, stdout, 4)
    stdout.write("\n")
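Usage is the same as for the original script:
strip_output.py < my_notebook.ipynb > my_notebook_stripped.ipynb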

As for later versions of Jupyter, there is a Restart Kernel and Clear All Outputs... option that clears the outputs but also removes the variables.

Here is a further modification of Edward Fung's answer that will output the cleaned notebook to a new file rather than rely on stdin and stdout:
from nbformat import read, write

def strip_output(nb):
    for cell in nb.cells:
        if hasattr(cell, "outputs"):
            cell.outputs = []
        if hasattr(cell, "prompt_number"):
            del cell["prompt_number"]

nb = read(open("my_notebook.ipynb"), 4)
strip_output(nb)
write(nb, open("my_notebook_cleaned.ipynb", "w"), 4)

Using --ClearOutputPreprocessor, you can reduce the size of your notebook file by clearing the outputs:
jupyter nbconvert --ClearOutputPreprocessor.enabled=True --inplace sample.ipynb
Note that --clear-output is a broken command, as below:
jupyter nbconvert --clear-output --inplace sample.ipynb
In my case, I tried it while looking into this question and found that it could not remove the output.

I am not able to post a comment, so feel free to edit/move this to Shumaila Ahmed's answer.
I had to use quotes on the file path, as:
jupyter nbconvert --ClearOutputPreprocessor.enabled=True --inplace 'Notebook.ipynb'
Works like a charm on Ubuntu 21.04, thanks!

Related

execute jupyter notebook and keep attached to existing kernel

I know about nbconvert and use it to generate static html or ipynb files with the results output. However, I want to be able to generate a notebook that stays attached to a kernel that I already have running, so that I can do further data exploration after all of the template cells have been run. Is there a way to do that?
Apparently, you can do this through the Python API. I didn't try it myself, but for someone who will be looking for a solution, this PR has an example in the comments:
from nbconvert.preprocessors.execute import executenb, ExecutePreprocessor
from nbformat import read as nbread
from jupyter_client.manager import start_new_kernel
nb = nbread('parsee.ipynb', as_version=4)
kernel_name = nb.metadata.get('kernelspec', {}).get('name', 'python')
km, kc = start_new_kernel(kernel_name=kernel_name)
executenb(nb, kernel=(km, kc))
kc.execute_interactive('a') # a is a variable defined in parsee.ipynb with 'a = 1'
I'm not quite sure about your purpose, but my general solutions are:
To execute the notebook on the command line and watch the execution at the same time:
jupyter nbconvert --debug --allow-errors --stdout --execute test.ipynb
This will execute all cells in debug mode, and carry on even when an exception happens, but I can't see the results until the end of the execution.
To output the results to an HTML file, and then open that file to see them (I find this more convenient):
jupyter nbconvert --execute --allow-errors --stdout test.ipynb >> result.html 2>&1
If you open result.html, all the errors and results will be shown on the page.
I would like to learn other answers/solutions from you all. Thank you.
If I understood correctly you wish to open a Python console, and connect Jupyter notebook to that kernel instance?
Perhaps your solution would be to edit the Jupyter scripts themselves and run the server in a separate thread/background task, implementing some sort of connection between threads so you can work in the Jupyter console? Currently it's impossible, because the main thread is running the server.
This would require some work and I don't have any solution as-is, but I will look into that and maybe edit this answer if I can make it work.
But it seems that the easiest solution is to simply add another cell to the notebook and do whatever you wish to do there. Is there a reason for not doing that?

ipython notebook 11 times slower than python: why?

I am running a script in ipython notebook (with Chrome) and noticed that it's 11 times slower than it is if I run the very same script in Python, using spyder as my IDE.
The script is quite simple: it's just a set of loops and calculations on a pandas dataframe. No output is printed to the screen or written to external files. I expect the code to be slowish because it's not vectorised, and I appreciate IPython may involve some overhead, but 11 times...! Can you think of any reasons why? Any suggestions?
Thanks!
I tested this on my machine, and found that ipython was actually faster.
$ cat ex.py
import time
import numpy as np

now = time.time()  # (seconds)
a = []
for j in range(2):
    for s in range(10):
        a.append(np.random.random())
then = now
print(time.time() - then)
$ python ex.py
0.142902851105
In [1]: %run ex.py
0.06136202812194824
I bet it's the Chrome part of your ipython setup which is causing the slowdown.
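To take the browser out of the equation entirely, you could time the snippet itself with the timeit module, which involves no display machinery; a minimal sketch reusing the loop from ex.py above:
import timeit

# Same loop as ex.py, timed independently of any IDE or browser
setup = "import numpy as np"
stmt = """
a = []
for j in range(2):
    for s in range(10):
        a.append(np.random.random())
"""
print(timeit.timeit(stmt, setup=setup, number=1000))
Running this both from a plain shell and from a notebook cell would show whether the slowdown comes from the code or from the environment around it.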

Simultaneously displaying and capturing standard out in IPython?

I'm interested in implementing a behavior in IPython that would be like a combination of ! and !!. I'm trying to use an IPython terminal as an adjunct to my (Windows) shell. For a long running command (e.g., a build script) I would like to be able to watch the output as it streams by as ! does. I would also like to capture the output of the command into the output history as !! does, but this defers printing anything until all output is available.
Does anyone have any suggestions as to how to implement something like this? I'm guessing that an IPython.utils.io.Tee() object would be useful here, but I don't know enough about IPython to hook this up properly.
Here is a snippet of code I just tried in iPython notebook v2.3, which seems to do what was requested:
import sys
import IPython.utils.io

# Tee writes to the log file and, via channel="stdout", echoes to stdout as well
outputstream = IPython.utils.io.Tee("outputfile.log", "w", channel="stdout")
outputstream.write("Hello worlds!\n")
outputstream.close()

# Read the log file back to confirm the output was captured
logstream = open("outputfile.log", "r")
sys.stdout.write("Read back from log file:\n")
sys.stdout.write(logstream.read())
The log file is created in the same directory as the iPython notebook file, and the output from running this cell is displayed thus:
Hello worlds!
Read back from log file:
Hello worlds!
I haven't tried this in the iPython terminal, but see no reason it wouldn't work as well there.
(Researched and answered as part of the Oxford participation in http://aaronswartzhackathon.org)
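For the original ask (watching a long-running shell command's output live while also keeping a copy of it), here is a minimal sketch in plain Python that does not rely on IPython internals; run_and_capture is a hypothetical helper name, not an IPython API:
import subprocess
import sys

def run_and_capture(cmd):
    # Stream the command's output live (like !) while keeping a copy (like !!)
    proc = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    captured = []
    for line in proc.stdout:
        sys.stdout.write(line)  # display immediately
        captured.append(line)   # keep for later
    proc.wait()
    return "".join(captured)

output = run_and_capture("echo hello && echo world")
In an IPython session you could call this on your build script and the output would both scroll by live and land in the returned string.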

Running an IPython/Jupyter notebook non-interactively

Does anyone know if it is possible to run an IPython/Jupyter notebook non-interactively from the command line and have the resulting .ipynb file saved with the results of the run? If it isn't already possible, how hard would it be to implement with phantomJS, something to turn the kernel on and off, and something to turn the web server on and off?
To be more specific, let's assume I already have a notebook original.ipynb and I want to rerun all cells in that notebook and save the results in a new notebook new.ipynb, but do this with one single command on the command line without requiring interaction either in the browser or to close the kernel or web server, and assuming no kernel or web server is already running.
example command:
$ ipython notebook run original.ipynb --output=new.ipynb
Yes, it is possible, and easy; it will (mostly) be in IPython core for 2.0. I would suggest looking at those examples for now.
[edit]
$ jupyter nbconvert --to notebook --execute original.ipynb --output=new.ipynb
It is now in Jupyter NbConvert. NbConvert comes with a bunch of preprocessors that are disabled by default; two of them (ClearOutputPreprocessor and ExecutePreprocessor) are of interest. You can either enable them in your (local|global) config file(s) via c.<PreprocessorName>.enabled=True (uppercase; that's Python), or on the command line with --ExecutePreprocessor.enabled=True, keeping the rest of the command as usual.
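As a sketch of the config-file route (the exact file name and location depend on your setup; jupyter_nbconvert_config.py in your Jupyter config directory is one common choice):
# jupyter_nbconvert_config.py -- a sketch; adjust to your own config layout
c = get_config()
# Enable whichever preprocessor you need:
c.ExecutePreprocessor.enabled = True        # rerun the notebook during conversion
# c.ClearOutputPreprocessor.enabled = True  # or: strip existing outputs instead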
--ExecutePreprocessor.enabled=True has a convenient --execute alias that can be used on recent versions of NbConvert. It can be combined with --inplace if desired.
For example, to convert to HTML after running the notebook headless:
$ jupyter nbconvert --to=html --execute RunMe.ipynb
And to convert to PDF after stripping the outputs:
$ ipython nbconvert --to=pdf --ClearOutputPreprocessor.enabled=True RunMe.ipynb
This (of course) does work with non-Python kernels by spawning a <insert-your-language-here> kernel, if you set --profile=<your fav profile>. The conversion can take a really long time, as it needs to rerun the notebook. You can do notebook-to-notebook conversion with the --to=notebook option.
There are various other options (timeout, allow errors, ...) that might need to be set/unset depending on the use case. See jupyter nbconvert --help, --help-all, or the nbconvert online documentation for more information.
Until this functionality becomes part of the core, I put together a little command-line app that does just what you want. It's called runipy and you can install it with pip install runipy. The source and readme are on github.
Run and replace original .ipynb file:
jupyter nbconvert --ExecutePreprocessor.timeout=-1 --to notebook --inplace --execute original.ipynb
To cover features such as parallel workers, input parameters, e-mail sending, or S3 input/output... you can install jupyter-runner:
pip install jupyter-runner
Readme on github: https://github.com/omar-masmoudi/jupyter-runner
One more way is to use papermill, which has a command-line interface.
Usage example (you need to specify an output path for the execution results to be stored):
papermill your_notebook.ipynb logs/yourlog.out.ipynb
You can also specify parameters if you wish, with a -p flag for each param:
papermill your_notebook.ipynb logs/yourlog.out.ipynb -p env "prod" -p tests "e2e"
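papermill can also be driven from Python; a minimal sketch using the same (illustrative) paths and parameters as above:
import papermill as pm

# Execute the notebook, injecting the parameters, and write the result
pm.execute_notebook(
    "your_notebook.ipynb",
    "logs/yourlog.out.ipynb",
    parameters={"env": "prod", "tests": "e2e"},
)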
One more papermill-related reply: https://stackoverflow.com/a/55458141/2957102
You can just run the IPython notebook server via the command line:
ipython notebook --pylab inline
This will start the server in non-interactive mode, and all output is printed below the code. You can then save the .ipynb file, which includes code & output.

IPython notebook in zc.buildout not using eggs path

I've built an environment with zc.buildout including the IPython script.
My problem is simple:
If I launch IPython in the console, everything is OK and I get all my eggs in sys.path,
but if I launch IPython notebook, I only get the default system path.
Is there any way to include all my eggs while starting notebook?
Regards,
Thierry
So, I guess somewhere in the notebook startup a process is forked, which means sys.path will get reset and buildout's tricks won't help.
I solved the problem as follows, although it's a bit dirty:
Create an entry point as follows:
setup(...
      entry_points={
          'console_scripts': ['ipython = <yourpackage>.ipython:main']
      })
Put the following in <yourpackage>/ipython.py:
from IPython.frontend.terminal.ipapp import launch_new_instance
import os
import sys

def main():
    # The notebook forks a process, so pass buildout's sys.path along via PYTHONPATH
    os.environ['PYTHONPATH'] = ':'.join(sys.path)
    sys.exit(launch_new_instance())
Now, running bin/ipython notebook will give you the sys.path you expect.