I have been trying for a long time to use pyRevit forms to open Excel files, but every time I try, a different error appears. The most recent error is the one in the image.
If I try 'from pyrevit import *', the error is:
Exception : System.MissingMemberException: 'module' object has no attribute 'compat'
Does anyone have any idea what I'm doing wrong? I don't know what else to do... Sorry for my ignorance.
Thank you very much in advance!
new error message:
It looks like some links are missing. Have you tried reinstalling .NET Framework or pyrevit?
The problem may be with from pyrevit import * because pyRevit's library folder is not on your Python search path (sys.path). I was able to use pyRevit forms by adding its library and additional packages folders like this:
import sys
sys.path.append(r'C:\Users\<username>\AppData\Roaming\pyRevit-Master\pyrevitlib')
sys.path.append(r'C:\Users\<username>\AppData\Roaming\pyRevit-Master\site-packages')
from pyrevit import forms
Just replace <username> and paste into RevitPythonShell, provided pyRevit is installed in the default location. Other pyrevit modules should work similarly.
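If you would rather not hard-code the user name, the same two folders can be derived from the APPDATA environment variable. This is only a minimal sketch, assuming pyRevit sits in the default pyRevit-Master folder mentioned above; pyrevit_paths and add_to_sys_path are hypothetical helper names, not part of pyRevit.

```python
import os
import sys


def pyrevit_paths(appdata_root, install_dir="pyRevit-Master"):
    """Return the pyrevitlib and site-packages folders under appdata_root."""
    base = os.path.join(appdata_root, install_dir)
    return [os.path.join(base, "pyrevitlib"),
            os.path.join(base, "site-packages")]


def add_to_sys_path(paths):
    """Append each path to sys.path once and return the ones actually added."""
    added = []
    for path in paths:
        if path not in sys.path:
            sys.path.append(path)
            added.append(path)
    return added


# On Windows, APPDATA normally points at C:\Users\<username>\AppData\Roaming.
appdata = os.environ.get("APPDATA", "")
if appdata:
    add_to_sys_path(pyrevit_paths(appdata))
    # from pyrevit import forms  # uncomment inside RevitPythonShell
```

Checking for duplicates before appending keeps sys.path from growing each time the snippet is pasted into the same shell session.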
I am trying to improve the accuracy of passport MRZ reading with Tesseract OCR and PassportEye. I have found a few GitHub repositories containing "*.traineddata" files; they say to move the file into the Tesseract tessdata folder, which I did. Nowhere in the README of these repos does it say how to use it. I believe it is something trivial, but I am very new to Tesseract.
How do I use it with PassportEye in Python? I am completely lost here and have searched a lot. Here is the current code.
import os
from passporteye import read_mrz
pr_path = os.getcwd()
file_path = os.path.join(pr_path,'my_app', 'data')
mrz = read_mrz(file_path + '/test1.jpg')
print(mrz)
This is the .traineddata file I want to test for more accuracy : https://github.com/DoubangoTelecom/tesseractMRZ/blob/master/tessdata_best/mrz.traineddata
I do not want to use bulky openCV. Please help
From looking at the source code, I would say you can't without changing the codebase of PassportEye:
Normally you would pass the language you are using to tesseract via the -l parameter; in your case:
-l mrz
But the PassportEye implementation does not give you that option:
https://github.com/konstantint/PassportEye/blob/929c186c4dfa80a1ac975b5f2b95002ca12889d0/passporteye/util/ocr.py#L48
They pass lang=None; you would need to change that part to lang='mrz':
pytesseract.run_tesseract(input_file_name,
                          output_file_name_base,
                          'txt',
                          lang='mrz',
                          config=config)
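To make the effect of that change concrete, here is a rough sketch of how a language name ends up on the tesseract command line. tesseract_args is a hypothetical helper written for illustration only; it is not PassportEye's or pytesseract's actual code, it just mirrors the -l convention described above.

```python
def tesseract_args(image_path, output_base, lang=None):
    """Build an argument list for the tesseract CLI (illustrative helper)."""
    args = ["tesseract", image_path, output_base]
    if lang is not None:
        # -l <lang> makes tesseract load <lang>.traineddata from its tessdata folder
        args += ["-l", lang]
    return args


# With lang=None no -l flag is emitted, so tesseract falls back to its default language.
print(tesseract_args("test1.jpg", "out"))
# With lang='mrz' the custom mrz.traineddata file is requested.
print(tesseract_args("test1.jpg", "out", lang="mrz"))
```

The printed lists could be passed to subprocess.run to try the model directly on a cropped MRZ image, assuming tesseract is on your PATH.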
I'm trying to use a config file I created, named 'bazaar', with this format (I tried setting both T and F):
load_system_dawg F
load_freq_dawg F
user_words_suffix user-words
I'm using Latin.traineddata as the language and created a Latin.user-words file in the same /tessdata directory
with some words, like:
Monotributista,
Monotributista (with and without comma)
Tesseract without config parameters gave me this, among other words (it is a 5-page text): Nfonotributista,
So I tried the user-words file, hoping it could correct that, using this code:
import pytesseract
from PIL import Image  # needed for Image.open below
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract'
imagen = Image.open("page-1.png")
text = pytesseract.image_to_string(imagen, lang='Latin', config='bazaar')
No errors, but the same result. I cannot find much documentation about what is happening behind the scenes: is it using the config? Is it checking the OCRed words against the dictionary?
Is there anything wrong with my code?
I appreciate any help
Thank you!
Edit: added some characters with bad recognition:
The first one is detected as LIL or LII
The second one is detected as LI
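One way to rule out the config file simply not being found is to pass the same settings explicitly in the config string. The sketch below only builds that string; it assumes the tesseract command-line options --user-words and -c VAR=VALUE, and build_config is a hypothetical helper name. Whether the word list actually changes the result may depend on the Tesseract version and OCR engine in use.

```python
def build_config(user_words_path=None, variables=None):
    """Build a config string to pass as pytesseract's config= argument."""
    parts = []
    if user_words_path:
        # --user-words points tesseract at an explicit word-list file
        parts.append('--user-words "{}"'.format(user_words_path))
    for name, value in sorted((variables or {}).items()):
        # -c NAME=VALUE sets a single tesseract parameter
        parts.append("-c {}={}".format(name, value))
    return " ".join(parts)


# Example path only; adjust it to wherever Latin.user-words is stored.
config = build_config("tessdata/Latin.user-words",
                      {"load_system_dawg": 0, "load_freq_dawg": 0})
print(config)
# text = pytesseract.image_to_string(imagen, lang='Latin', config=config)
```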
I have seen a few of the talks by iPython developers about how to convert an ipython notebook to a blog post, a pdf, or even to an entire book(~min 43). The PDF-to-X converter interprets the iPython cells which are written in markdown or code and spits out a newly formatted document in one step.
My problem is that I would like to generate a large document where many of the figures and sections are programmatically generated - something like this. For this to work in iPython using the methods above, I would need to be able to write a function that would write other iPython-Code-Blocks. Does this capability exist?
#some pseudocode to give an idea
for variable in list:
    image = make_image(variable)
    write_iPython_Markdown_Cell(variable)
    write_iPython_Image_cell(image)
I think this might be useful so I am wondering if:
generating iPython Cells through iPython is possible
if there is a reason that this is a bad idea and I should stick to a 'classic' solution like a templating library (Jinja).
thanks,
zach cp
EDIT:
As per Thomas' suggestion I posted on the ipython mailing list and got some feedback on the feasibility of this idea. In short, there are some technical difficulties that make this approach less than ideal for the original idea. For a repetitive report where you would like to generate markdown cells and corresponding images/tables, it is more complicated to work through the ipython kernel/browser than to generate a report directly with a templating system like Jinja.
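For comparison, the templating route can be sketched in a few lines. This example uses the standard library's string.Template rather than Jinja so that it runs without extra packages; make_image is a hypothetical stand-in that only returns a file name, and the section layout is an assumption made for illustration.

```python
from string import Template

# One markdown section per variable: a heading followed by an image link.
SECTION = Template("## $title\n\n![$title]($image)\n")


def make_image(name):
    """Hypothetical stand-in: a real version would render and save a figure."""
    return "figures/{}.png".format(name)


def build_report(names):
    """Return a markdown report with one generated section per name."""
    sections = [SECTION.substitute(title=name, image=make_image(name))
                for name in names]
    return "# Report\n\n" + "\n".join(sections)


print(build_report(["alpha", "beta"]))
```

The resulting markdown text can be written to a file and converted to HTML or PDF with whatever tool you already use.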
There's a Notebook gist by Fernando Perez here that demonstrates how to programmatically create new cells. Note that you can also pass metadata in, so if you're generating a report and want to turn the notebook into a slideshow, you can easily indicate whether the cell should be a slide, sub-slide, fragment, etc.
You can add any kind of cell, so what you want is straightforward now (though it probably wasn't when the question was asked!). E.g., something like this (untested code) should work:
from IPython.nbformat import current as nbf
nb = nbf.new_notebook()
cells = []
for var in my_list:
    # Assume make_image() saves an image to file and returns the filename
    image_file = make_image(var)
    text = "Variable: %s\n![image](%s)" % (var, image_file)
    cell = nbf.new_text_cell('markdown', text)
    cells.append(cell)
nb['worksheets'].append(nbf.new_worksheet(cells=cells))
with open('my_notebook.ipynb', 'w') as f:
    nbf.write(nb, f, 'ipynb')
I won't judge whether it's a good idea, but if you call get_ipython().set_next_input(s) in the notebook, it will create a new cell with the string s. This is what IPython uses internally for its %load and %recall commands.
Note that the accepted answer by Tal is a little deprecated and getting more deprecated: in ipython v3 you can (/should) import nbformat directly, and after that you need to specify which version of notebook you want to create.
So,
from IPython.nbformat import current as nbf
becomes
from nbformat import current as nbf
becomes
from nbformat import v4 as nbf
However, in this final version, the compatibility breaks because the write method is in the parent module nbformat, where all of the other methods used by Fernando Perez are in the v4 module, although some of them are under different names (e.g. new_text_cell('markdown', source) becomes new_markdown_cell(source)).
Here is an example of the v3 way of doing things: see generate_examples.py for the code and plotstyles.ipynb for the output. IPython 4 is, at time of writing, so new that using the web interface and clicking 'new notebook' still produces a v3 notebook.
Below is the code of the function which will load contents of a file and insert it into the next cell of the notebook:
from IPython.display import display_javascript
def make_cell(s):
    text = s.replace('\n','\\n').replace("\"", "\\\"").replace("'", "\\'")
    text2 = """var t_cell = IPython.notebook.get_selected_cell()
    t_cell.set_text('{}');
    var t_index = IPython.notebook.get_cells().indexOf(t_cell);
    IPython.notebook.to_code(t_index);
    IPython.notebook.get_cell(t_index).render();""".format(text)
    display_javascript(text2, raw=True)
def insert_file(filename):
    with open(filename, 'r') as content_file:
        content = content_file.read()
    make_cell(content)
See details in my blog.
Using the magics can be another solution. e.g.
get_ipython().run_cell_magic(u'HTML', u'', u'<font color=red>heffffo</font>')
Now that you can programmatically generate HTML in a cell, you can format it any way you wish. Images are of course supported. If you want to repeatedly generate output in multiple cells, just make several such calls, treating the string above as a placeholder for your own content.
P.S. I once had this need and reached this thread. I wanted to render a table (not the ASCII output of lists and tuples) at that time. Later I found that pandas.DataFrame is amazingly well suited to the job: it generates HTML-formatted tables automatically.
from IPython.display import display, Javascript
def add_cell(text, type='code', direct='above'):
    text = text.replace('\n','\\n').replace("\"", "\\\"").replace("'", "\\'")
    display(Javascript('''
    var cell = IPython.notebook.insert_cell_{}("{}")
    cell.set_text("{}")
    '''.format(direct, type, text)))
for i in range(3):
    add_cell(f'# heading{i}', 'markdown')
    add_cell(f'code {i}')
The code above will add cells as follows:
@xingpei Pang's solution is perfect, especially if you want to create customized code for each dataset, for instance when a dataset has several groups. However, the main issue with the javascript code is that if you run this code in a trusted notebook, it runs every time the notebook is loaded.
The solution I came up with is to clear the cell output after execution. The javascript code is stored in the output cell, so by clearing the output the code is gone and nothing is left to be executed in the trusted mode again. By using the code from here, the solution is the code below.
from IPython.display import display, Javascript, clear_output
def add_cell(text, type='code', direct='above'):
    text = text.replace('\n','\\n').replace("\"", "\\\"").replace("'", "\\'")
    display(Javascript('''
    var cell = IPython.notebook.insert_cell_{}("{}")
    cell.set_text("{}")
    '''.format(direct, type, text)))
# create cells
for i in range(3):
    add_cell(f'# heading{i}', 'markdown')
    add_cell(f'code {i}')
# clean the javascript code from the current cell output
for i in range(10):
    clear_output(wait=True)
Note that clear_output() needs to be run several times to make sure the output is cleared.
As a slight update incorporating Tal's answer above, updates from Chris Barnes and a little digging in the nbformat docs, the following worked for me:
import nbformat
from nbformat import v4 as nbf
nb = nbf.new_notebook()
cells = [
    nbf.new_code_cell(f"""print("Doing the thing: {i}")""")
    for i in range(10)
]
nb.cells.extend(cells)
with open('generated_notebook.ipynb', 'w') as f:
    nbformat.write(nb, f)
You can then start up the new artificial notebook and cut-n-paste cells wherever you need them.
This is unlikely to be the best way to do anything, but it's useful as a dirty hack. 🐱💻
This worked with the following versions:
Package Version
-------------------- ----------
ipykernel 5.3.0
ipython 7.15.0
jupyter 1.0.0
jupyter-client 6.1.3
jupyter-console 6.1.0
jupyter-core 4.6.3
nbconvert 5.6.1
nbformat 5.0.7
notebook 6.0.3
...
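If nbformat is not at hand, the same kind of file can be produced with only the standard library, since a notebook is plain JSON. The following is a minimal sketch, assuming the version 4 layout (a top-level "cells" list plus "nbformat" and "nbformat_minor" keys); the helper names and the output file name are made up for illustration, and the result has not been validated against the nbformat schema here.

```python
import json


def markdown_cell(text):
    """Return a v4 markdown cell as a plain dict."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(source):
    """Return a v4 code cell with no outputs and no execution count."""
    return {"cell_type": "code", "execution_count": None,
            "metadata": {}, "outputs": [], "source": source}


def build_notebook(cells):
    """Wrap a list of cells in a minimal v4 notebook structure."""
    # nbformat_minor 4 is used on the assumption that per-cell ids are not needed there.
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


cells = []
for i in range(3):
    cells.append(markdown_cell("# heading {}".format(i)))
    cells.append(code_cell('print("Doing the thing: {}")'.format(i)))

nb = build_notebook(cells)
with open("generated_by_json.ipynb", "w") as f:
    json.dump(nb, f, indent=1)
```

Going through nbformat, as in the answer above, is still the safer choice when it is available, because it validates the structure for you.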
Using the command line, go to the directory where the myfile.py file is located
and execute (example):
C:\MyDir> pip install p2j
Then execute:
C:\MyDir> p2j myfile.py -t myfile.ipynb
Run in the Jupyter notebook:
!pip install p2j
Then, using the command line, go to the directory where the file is located and execute:
python p2j <myfile.py> -t <myfile.ipynb>
I need some help with Stata. I'm not sure if this is the right forum, but hopefully somebody can help me.
The problem occurs when I want to use new commands in Stata. I will explain it with an example: the command outreg. I assume the problem is the version.
Stata Details:
Version 10.1
Unlimited-user Stata for Windows (network) perpetual license (decompressed in C:\Program Files (x86)\Stata)
I installed the command with ssc install outreg
I tried the new command with the example given here:
http://www.ats.ucla.edu/stat/stata/faq/outreg.htm
The following error occurs after executing outreg using test.doc, nolabel replace
MakeSmat(): 3499 _CColJoin() not found
CalcStats(): - function returned error
<istmt>: - function returned error
Stata.com also provides a solution for the problem:
http://www.stata.com/statalist/archive/2011-07/msg01018.html but restarting Stata doesn't solve my problem.
The necessary library (l_cfrmt described in the stata.com link) is also available:
. mata : mata query
Mata settings
set matastrict off
set matalnum off
set mataoptimize on
set matafavor space may be space or speed
set matacache 400 kilobytes
set matalibs lmatabase;lmataado;lmataopt;l_cfrmt
set matamofirst off
But when I try to describe the contents of the library l_cfrmt (which is necessary for outreg), the following error message appears:
. mata : mata desc using l_cfrmt
c:\ado\plus\l\l_cfrmt.mlib from a more recent version of Stata
It looks as if the version I installed via ssc is not compatible with Version 10.1 of Stata.
Does anybody have an idea how to solve this problem? I have been searching for a few hours now, but I didn't find any possible solution.
Regards,
Michael
First, the code you found on the UCLA website for -outreg- is no longer correct: John Gallup has since made many changes to -outreg-, one of which affects your example. Ignoring your Mata issue for a moment, the code should be modified as follows in order to run:
**install latest outreg
ssc install outreg, replace
use http://www.ats.ucla.edu/stat/stata/notes/hsb1, clear
regress read write
outreg using test.doc, novarlabel replace
The code above works on my machine with an updated version of Stata 12 MP and updated -outreg- version 4.12.
Regarding the Mata error: it may be that the newest -outreg- simply does not work with Stata 10.1, but I wouldn't give up yet. First, make sure your Stata is fully updated (-update query- and -update all-).
Second, follow the advice of the author of -outreg- in this Statalist thread:
http://www.stata.com/statalist/archive/2011-07/msg01014.html
Finally, if you do have a missing mata component/library, as that thread hints at, and cannot follow this advice to correct it, then consider re-installing Stata and/or contacting Stata tech support.