pyLDAvis visualization from gensim not displaying the result in Google Colab

import pyLDAvis.gensim
# Visualize the topics
pyLDAvis.enable_notebook()
vis = pyLDAvis.gensim.prepare(lda_model, corpus, id2word)
vis
The code above displayed the visualization of the LDA model in Google Colab, but after I reopened the notebook it stopped displaying.
I also tried
pyLDAvis.display(vis, template_type='notebook')
but it still does not work.
When I set
pyLDAvis.enable_notebook(local=True)
it does display the result, but without the labels. Any help would be appreciated!

When you install pyLDAvis, make sure to pin the version to 2.1.2 with:
!pip install pyLDAvis==2.1.2
The newer versions don't seem to play well with Colab.

The gensim module was renamed in newer pyLDAvis versions. Use it like this:
import pyLDAvis.gensim_models
vis = pyLDAvis.gensim_models.prepare(lda_model, corpus, id2word)
vis


I’m wanting to find the equivalent of "describe history" for databricks in pyspark. Does such a thing exist?

The title says it all, really. I'm trying to find the latest version of a Delta table, but because I'm testing locally I don't have access to Databricks.
I tried googling but didn't have much luck, I'm afraid.
If you configure SparkSession correctly as described in the documentation, then you can run SQL commands as well. But you can also access history using the Python or Scala APIs (see docs), like this:
from delta.tables import *
deltaTable = DeltaTable.forPath(spark, pathToTable)
fullHistoryDF = deltaTable.history()
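If Spark itself is not available in the local test environment, one possible fallback is to read the table's transaction log directly. The sketch below is not the DeltaTable API. It assumes the standard Delta layout, where each commit is stored as a 20-digit, zero-padded `<version>.json` file under `_delta_log`, and the function name `latest_delta_version` is made up for illustration.

```python
import re
from pathlib import Path

# Commit files in a Delta log are named with a 20-digit, zero-padded version,
# e.g. 00000000000000000003.json. Checkpoints and other files are ignored.
_COMMIT_FILE = re.compile(r"^(\d{20})\.json$")


def latest_delta_version(table_path):
    """Return the highest commit version under <table_path>/_delta_log, or None."""
    log_dir = Path(table_path) / "_delta_log"
    if not log_dir.is_dir():
        return None
    versions = []
    for entry in log_dir.iterdir():
        match = _COMMIT_FILE.match(entry.name)
        if match:
            versions.append(int(match.group(1)))
    return max(versions) if versions else None
```

When a SparkSession is available, `deltaTable.history(1)` returns only the most recent commit, and its `version` column should hold the same number.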

ModuleNotFoundError: No module named 'PIL'

I have the same problem as the one asked about by trevor; however, the answers don't help me at all.
I'm on Windows 10, running PyScripter 3.6.3.0 and Python 3.8.2 (32-bit).
I've searched the Pillow site, and those instructions just result in a different error saying that pip is invalid syntax. The biggest issue I'm finding is that too much of what is on Google and the forums is out of date.
I was under the impression that Pillow already came with Python 3.8.2?
from tkinter import *
from PIL import ImageTk,Image
root = Tk()
root.title('TimeLord Frames')
root.iconbitmap('TBA') # Still need to work on icon.
frame = LabelFrame(root, text="This is my Frame..", padx=5, pady=5)
frame.pack(padx=10, pady=10)
b = Button(frame, text="Click Here")
b.pack()
root.mainloop()
I'm pretty sure I have found the problem. It comes from the line:
root.iconbitmap('TBA') # Still need to work on icon.
Because I haven't resolved an issue with the icon, it is looking at "from PIL import ImageTk,Image" and not connecting the two together. I have tried inserting the location of my icon, but it's not happy with that either. If I comment out both lines referring to images, I can get the program to run.
Cheers
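Pillow is a third-party package and is not bundled with Python 3.8, so `from PIL import ...` fails until Pillow is installed for the interpreter that PyScripter runs. An "invalid syntax" error on `pip` usually means the command was typed at the Python `>>>` prompt rather than in a Windows command prompt. A minimal sketch of a check, with hypothetical helper names `module_available` and `pip_install_command`:

```python
import importlib.util
import sys


def module_available(name):
    """Return True if `name` can be imported by the interpreter running this code."""
    return importlib.util.find_spec(name) is not None


def pip_install_command(package):
    """Build the shell command that installs `package` for this exact interpreter."""
    return f'"{sys.executable}" -m pip install {package}'


if not module_available("PIL"):
    # Pillow is the package name on PyPI; PIL is the module it provides.
    print("PIL not found. From a command prompt (not the Python prompt), run:")
    print(pip_install_command("Pillow"))
```

Running this from inside PyScripter shows which interpreter is actually in use, which matters when more than one Python is installed.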

Where are the high-level charts in Bokeh's latest version (1.0.3)?

I have a quick question:
I see that you have the high-level charts in version 0.11.0:
http://docs.bokeh.org/en/0.11.0/docs/user_guide/charts.html
But I can't find the same topic in the latest version (1.0.3). Did the Bokeh team remove it?
I have been working on adapting histograms to my project but can't find the histograms section in the latest version.
Any ideas?
The bokeh.charts API was deprecated and scheduled to be removed in version 1.0 several years ago. However, even that schedule was also eventually accelerated, due to lack of interest and better alternatives. It has been completely gone from the project since late 2017.
For very high level APIs on top of Bokeh, consider Chartify or Holoviews. Or just create histograms directly with the stable bokeh.plotting API. e.g.
from bokeh.plotting import figure, output_file, show
from bokeh.sampledata.autompg import autompg as df
from numpy import histogram
p = figure(plot_height=300)
hist, edges = histogram(df.hp, density=True, bins=20)
p.quad(top=hist, bottom=0, left=edges[:-1], right=edges[1:], line_color="white")
show(p)

Configure the backend of Ipython to use retina display mode with code

I am using code to configure Jupyter notebooks because I have a repo with plenty of notebooks and want to keep the style consistent across all of them without having to write lengthy settings at the start of each. To do this, I have one method to configure the CSS, one to set up Matplotlib and one to configure IPython.
The reasons I configure my notebooks this way rather than relying on a configuration file as per docs are two:
I am sharing this repo of notebooks publicly and I want all my configs to be visible
I want to keep these configs specific to just this repo I'm creating
As an example, the method to set the CSS looks like
def set_css_style(css_file_path='../styles_files/custom.css'):
    styles = open(css_file_path, "r").read()
    return HTML(styles)
and I call it at the start of each notebook with set_css_style(). Similarly, I have this method to configure the specifics of Ipython:
def config_ipython():
    InteractiveShell.ast_node_interactivity = "all"
Both the above use imports
from IPython.core.display import HTML
from IPython.core.interactiveshell import InteractiveShell
At the moment, as can be seen, the method to configure Ipython only contains the instruction to make it so that when I type the name of variables in multiple lines in a cell I don't need to add a print to make them all be printed.
My question is how to transform the Jupyter magic command to obtain retina-display quality for figures into code. Such command is
%config InlineBackend.figure_format = 'retina'
From the docs of Ipython I can't find how to call this instruction in a method, namely can't find where InlineBackend lives.
I'd just like to add this configuration line to my config_ipython method above, is it possible?
There is a Python API for this:
from IPython.display import set_matplotlib_formats
set_matplotlib_formats('retina')

programmatically add cells to an ipython notebook for report generation

I have seen a few of the talks by iPython developers about how to convert an ipython notebook to a blog post, a pdf, or even to an entire book(~min 43). The PDF-to-X converter interprets the iPython cells which are written in markdown or code and spits out a newly formatted document in one step.
My problem is that I would like to generate a large document where many of the figures and sections are programmatically generated - something like this. For this to work in iPython using the methods above, I would need to be able to write a function that would write other iPython-Code-Blocks. Does this capability exist?
#some pseudocode to give an idea
for variable in list:
    image = make_image(variable)
    write_iPython_Markdown_Cell(variable)
    write_iPython_Image_cell(image)
I think this might be useful so I am wondering if:
generating iPython Cells through iPython is possible
if there is a reason that this is a bad idea and I should stick to a 'classic' solution like a templating library (Jinja).
thanks,
zach cp
EDIT:
As per Thomas' suggestion I posted on the ipython mailing list and got some feedback on the feasibility of this idea. In short, there are some technical difficulties that make this approach less than ideal for the original idea. For a repetitive report where you would like to generate markdown cells and corresponding images/tables, it is more complicated to work through the ipython kernel/browser than to generate a report directly with a templating system like Jinja.
There's a Notebook gist by Fernando Perez here that demonstrates how to programmatically create new cells. Note that you can also pass metadata in, so if you're generating a report and want to turn the notebook into a slideshow, you can easily indicate whether the cell should be a slide, sub-slide, fragment, etc.
You can add any kind of cell, so what you want is straightforward now (though it probably wasn't when the question was asked!). E.g., something like this (untested code) should work:
from IPython.nbformat import current as nbf
nb = nbf.new_notebook()
cells = []
for var in my_list:
    # Assume make_image() saves an image to file and returns the filename
    image_file = make_image(var)
    text = "Variable: %s\n![image](%s)" % (var, image_file)
    cell = nbf.new_text_cell('markdown', text)
    cells.append(cell)
nb['worksheets'].append(nbf.new_worksheet(cells=cells))
with open('my_notebook.ipynb', 'w') as f:
    nbf.write(nb, f, 'ipynb')
I won't judge whether it's a good idea, but if you call get_ipython().set_next_input(s) in the notebook, it will create a new cell with the string s. This is what IPython uses internally for its %load and %recall commands.
Note that the accepted answer by Tal is a little deprecated and getting more deprecated: in ipython v3 you can (/should) import nbformat directly, and after that you need to specify which version of notebook you want to create.
So,
from IPython.nbformat import current as nbf
becomes
from nbformat import current as nbf
becomes
from nbformat import v4 as nbf
However, in this final version, the compatibility breaks because the write method is in the parent module nbformat, where all of the other methods used by Fernando Perez are in the v4 module, although some of them are under different names (e.g. new_text_cell('markdown', source) becomes new_markdown_cell(source)).
Here is an example of the v3 way of doing things: see generate_examples.py for the code and plotstyles.ipynb for the output. IPython 4 is, at time of writing, so new that using the web interface and clicking 'new notebook' still produces a v3 notebook.
Below is the code of the function which will load contents of a file and insert it into the next cell of the notebook:
from IPython.display import display_javascript
def make_cell(s):
    text = s.replace('\n','\\n').replace("\"", "\\\"").replace("'", "\\'")
    text2 = """var t_cell = IPython.notebook.get_selected_cell()
    t_cell.set_text('{}');
    var t_index = IPython.notebook.get_cells().indexOf(t_cell);
    IPython.notebook.to_code(t_index);
    IPython.notebook.get_cell(t_index).render();""".format(text)
    display_javascript(text2, raw=True)
def insert_file(filename):
    with open(filename, 'r') as content_file:
        content = content_file.read()
    make_cell(content)
See details in my blog.
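The chained `replace` calls above do not escape backslashes, so file contents that contain `\` (for example `"\t"` inside a string) can end up altered once JavaScript interprets the text. One possible alternative, sketched here with hypothetical helpers `js_string_literal` and `set_text_js`, is to let `json.dumps` build the JavaScript string literal, since a JSON string is also a valid JavaScript string literal:

```python
import json


def js_string_literal(text):
    """Return `text` as a double-quoted JavaScript string literal (quotes included)."""
    return json.dumps(text)


def set_text_js(text):
    """Build JavaScript that puts `text` into the selected classic-notebook cell."""
    literal = js_string_literal(text)
    return (
        "var t_cell = IPython.notebook.get_selected_cell();\n"
        "t_cell.set_text(" + literal + ");"
    )


# Show the generated JavaScript for a snippet containing quotes and a backslash.
print(set_text_js('print("a\\tb")\nx = 1'))
```

The resulting string could then be passed to `display_javascript(..., raw=True)` in place of the hand-escaped version. As with the original, this relies on the `IPython.notebook` JavaScript object of the classic Notebook interface.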
Using the magics can be another solution. e.g.
get_ipython().run_cell_magic(u'HTML', u'', u'<font color=red>heffffo</font>')
Now that you can programmatically generate HTML in a cell, you can format it any way you wish. Images are of course supported. If you want to repeatedly generate output in multiple cells, just call the above several times, using the string as a placeholder.
P.S. I once had this need and reached this thread. At the time I wanted to render a table (not the ASCII output of lists and tuples). Later I found that pandas.DataFrame is amazingly well suited to the job. It generates HTML-formatted tables automatically.
from IPython.display import display, Javascript
def add_cell(text, type='code', direct='above'):
    text = text.replace('\n','\\n').replace("\"", "\\\"").replace("'", "\\'")
    display(Javascript('''
    var cell = IPython.notebook.insert_cell_{}("{}")
    cell.set_text("{}")
    '''.format(direct, type, text)))
for i in range(3):
    add_cell(f'# heading{i}', 'markdown')
    add_cell(f'code {i}')
The code above adds one markdown heading cell and one code cell on each pass of the loop.
@xingpei Pang's solution is perfect, especially if you want to create customized code for each dataset, for instance when the data has several groups. However, the main issue with the JavaScript code is that if you run it in a trusted notebook, it runs again every time the notebook is loaded.
The solution I came up with is to clear the cell output after execution. The javascript code is stored in the output cell, so by clearing the output the code is gone and nothing is left to be executed in the trusted mode again. By using the code from here, the solution is the code below.
from IPython.display import display, Javascript, clear_output
def add_cell(text, type='code', direct='above'):
    text = text.replace('\n','\\n').replace("\"", "\\\"").replace("'", "\\'")
    display(Javascript('''
    var cell = IPython.notebook.insert_cell_{}("{}")
    cell.set_text("{}")
    '''.format(direct, type, text)))
# create cells
for i in range(3):
    add_cell(f'# heading{i}', 'markdown')
    add_cell(f'code {i}')
# clean the javascript code from the current cell output
for i in range(10):
    clear_output(wait=True)
Note that clear_output() needs to be run several times to make sure the output is cleared.
As a slight update incorporating Tal's answer above, updates from Chris Barnes and a little digging in the nbformat docs, the following worked for me:
import nbformat
from nbformat import v4 as nbf
nb = nbf.new_notebook()
cells = [
    nbf.new_code_cell(f"""print("Doing the thing: {i}")""")
    for i in range(10)
]
nb.cells.extend(cells)
with open('generated_notebook.ipynb', 'w') as f:
    nbformat.write(nb, f)
You can then start up the new artificial notebook and cut-n-paste cells wherever you need them.
This is unlikely to be the best way to do anything, but it's useful as a dirty hack. 🐱‍💻
This worked with the following versions:
Package Version
-------------------- ----------
ipykernel 5.3.0
ipython 7.15.0
jupyter 1.0.0
jupyter-client 6.1.3
jupyter-console 6.1.0
jupyter-core 4.6.3
nbconvert 5.6.1
nbformat 5.0.7
notebook 6.0.3
...
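For environments where nbformat is not installed, a version 4 notebook can also be written as plain JSON, because an .ipynb file is just a JSON document. The sketch below uses only the standard library. The helper names (`markdown_cell`, `code_cell`, `build_notebook`, `write_notebook`) and the output filename are made up for illustration, and nothing here validates the result the way nbformat does.

```python
import json


def markdown_cell(source):
    """Return a minimal v4 markdown cell."""
    return {"cell_type": "markdown", "metadata": {}, "source": source}


def code_cell(source):
    """Return a minimal, not-yet-executed v4 code cell."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": source,
    }


def build_notebook(cells):
    """Wrap cells in the top-level v4 notebook structure."""
    # Minor version 4 is declared because minor version 5 requires an "id" on every cell.
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


def write_notebook(path, cells):
    """Write the cells to `path` as an .ipynb JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(build_notebook(cells), f, indent=1)


if __name__ == "__main__":
    cells = []
    for i in range(3):
        cells.append(markdown_cell(f"# heading{i}"))
        cells.append(code_cell(f'print("Doing the thing: {i}")'))
    write_notebook("generated_by_hand.ipynb", cells)
```

When nbformat is available, its `new_markdown_cell` and `new_code_cell` helpers are the safer choice, since they track the schema for you.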
Using the command line, go to the directory where the myfile.py file is located
and execute (example):
C:\MyDir>pip install p2j
Then execute:
C:\MyDir>p2j myfile.py -t myfile.ipynb
Run in the Jupyter notebook:
!pip install p2j
Then, using the command line, go to the corresponding directory where the file is located and execute:
python p2j <myfile.py> -t <myfile.ipynb>