I was wondering why ipython nbconvert --to markdown does not look for the image in its existing directory first.
If I do this in my tes.ipynb:
from IPython.display import Image
Image('tes_files/1.jpg')
and then execute the command
ipython nbconvert tes.ipynb --to markdown --stdout
the output I get is
from IPython.display import Image
Image('tes_files/1.jpg')
![jpeg](tes_files/tes_0_0.jpeg)
Why doesn't nbconvert look for the image at the specified path first,
and only generate a new one if it doesn't exist?
I understand the idea may be that images can be imported from any directory, and nbconvert gathers them into one folder for the markdown output.
Is there a command-line option for this? Do I have to create a new profile?
UPDATE:
Suppose I have set the url path for the image folders:
IMG_FOLDERS = '../galleries/tes_files'
and set the url path to that directory. At some point, I create a plot.
Then, when I execute nbconvert, it creates a new folder 'name'_files in the same directory as the ipynb and saves a new image of the plot inside it. How can I tell nbconvert not to create a new directory, but to use IMG_FOLDERS instead?
Thanks
The issue here is that the IPython Image class embeds the image data into the notebook when you use it as you did in your example. During conversion, a preprocessor extracts embedded images from the notebook and includes them in the markdown, LaTeX, etc. document.
So what you are looking for is a way to link an image to the notebook, which is still possible with the Image class.
If you check the documentation for the Image class (IPython 2.3) you will find:
Init definition: Image(self, data=None, url=None, filename=None, format=u'png', embed=None, width=None, height=None, retina=False)
...
Parameters
----------
data : unicode, str or bytes
The raw image data or a URL or filename to load the data from.
This always results in embedded image data.
url : unicode
A URL to download the data from. If you specify `url=`,
the image data will not be embedded unless you also specify `embed=True`.
filename : unicode
Path to a local file to load the data from.
Images from a file are always embedded.
Hence, to have the image linked rather than embedded, you have to use the url argument:
Image(url='tes_files/1.jpg')
There is also an embed argument, but it doesn't seem to work with the filename argument.
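To check which of the two cases a saved notebook contains, one option is to inspect its JSON directly. The following is only a minimal sketch, assuming the nbformat 4 layout (cells with an "outputs" list whose "data" entries are keyed by MIME type). The function name classify_image_outputs and the sample notebook dict are made up for illustration, and the base64 string is a truncated placeholder, not real image data.

```python
def classify_image_outputs(notebook):
    """Count (embedded, linked) image outputs in an nbformat-4 style dict.

    Embedded: the output carries image data under an image/* MIME type.
    Linked:   the output only carries HTML containing an <img> tag.
    """
    embedded = linked = 0
    for cell in notebook.get("cells", []):
        for output in cell.get("outputs", []):
            data = output.get("data", {})
            if any(mime.startswith("image/") for mime in data):
                embedded += 1
            # text/html may be stored as a string or as a list of strings
            elif "<img" in "".join(data.get("text/html", "")):
                linked += 1
    return embedded, linked


# Hypothetical sample: first cell as Image('...') would store it,
# second cell as Image(url='...') would store it.
sample = {
    "cells": [
        {"cell_type": "code", "outputs": [
            {"output_type": "execute_result",
             "data": {"image/jpeg": "/9j/4AAQ...(placeholder)...",
                      "text/plain": ["<IPython.core.display.Image object>"]}}]},
        {"cell_type": "code", "outputs": [
            {"output_type": "execute_result",
             "data": {"text/html": ["<img src=\"tes_files/1.jpg\"/>"],
                      "text/plain": ["<IPython.core.display.Image object>"]}}]},
    ]
}

result = classify_image_outputs(sample)
print(result)
```

For a real notebook, the dict could be obtained with json.load on the .ipynb file. Only outputs counted as embedded would be expected to end up as extracted files in the 'name'_files folder.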
Related
I am running a script to extract the EXIF data from a list of images in a folder that are imported from an iPhone using Python's pillow:
from PIL import Image
image = Image.open(path)
But before anything else, some of the pictures need to be converted from iOS's .HEIC format to .jpg. I managed to do so, but when I try to open a converted image I get the following:
PIL.UnidentifiedImageError: cannot identify image file '.../pictures/IMG_4294.jpg'
See this image comparing the info of two files. The one on the left was converted from .HEIC to .jpg and doesn't work. The one on the right is originally a .jpg and works just fine.
Any thoughts on how I can solve this?
I have a lammps_file.data and I need to convert it to Gromacs files (gro and top) to run my simulations.
Does anyone know how to do this?
Another choice is to convert from lammps to charmm files (psf and pdb). Once I get the charmm files I can just use Topotools to get the gromacs files I need.
Thanks
Indeed, I am trying to do the same myself right now.
So far, you can use InterMol; this should work fine to convert LAMMPS data files to Gromacs files. Once you install InterMol and create a path to the InterMol converter, you can use a command like:
python2.7 $conv/convert.py --lmp_in topology.data --gromacs -v
CHECK the format of your data file; I am still having problems converting mine.
If you wish to create the psf file,
you would need VMD (google it); then open the Tcl terminal and write:
topo readlammpsdata topology.data full
animate write psf topology.psf
The 1st line loads your LAMMPS data file, if you are in the folder where
that file is located.
The 2nd converts the data to a CHARMM psf file.
Also, you could try this. In this paper, they provide a tool to convert
CHARMM topologies to Gromacs here. Thus, you convert to psf, then to gro and top.
I am using a Batch script to merge different PDFs in one complete file.
%gsc% -dBATCH -sDEVICE=pdfwrite -sPAPERSIZE=letter -dEPSFitPage -o %dsk%%zus%%ext% %mfd% %pth%tmp\pdfmarks
%dsk%%zus%%ext%: Path and name of final (complete) document
%mfd%: Path and name of docs to be merged (c:\test\1.pdf c:\test\2.pdf ...)
%pth%tmp = path to the pdfmarks file
Additionally, I am creating a pdfmark document inside the script, which gs uses to create the bookmarks. Unfortunately, some of the docs I am merging already have their own bookmarks, and I have not yet found a way to ignore those. GS should only use the bookmarks inside the pdfmarks file.
How can this be done?
Firstly, you are not 'merging' PDF files when you use Ghostscript's pdfwrite device. The process is described in detail here.
The important point is that the way the input file(s) are constructed has no bearing on the way the output file is constructed. If any other software you use relies on the file being constructed in a particular fashion it may not work on the output PDF file.
The -dEPSFitPage switch only has any effect when the input is an EPS file. If you want to 'fit' PostScript or PDF files then you need to use -dPDFFitPage, -dPSFitPage or just -dFitPage. However, all of these rely on you first selecting a media size, and then preventing it being altered by setting -dFIXEDMEDIA. For EPS files you would more normally use -dEPSCrop which sets the media size to the EPS declared BoundingBox.
You can prevent the PDF interpreter from reading the Outlines tree (which you are calling Bookmarks) and creating a pdfmark from it to pass to the pdfwrite device by using the -dNO_PDFMARK_OUTLINES switch, which oddly isn't documented; presumably an oversight.
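As a rough sketch of how the switch fits into the command from the question, here is the argument list assembled in Python rather than Batch. The helper name build_gs_command, the executable name "gs" and the file names are assumptions for illustration (on Windows the console executable is typically gswin64c or gswin32c). The EPS-only -dEPSFitPage switch is left out, following the remarks above.

```python
import subprocess


def build_gs_command(output_pdf, input_pdfs, pdfmarks_file, gs_executable="gs"):
    """Build a Ghostscript pdfwrite command that ignores the inputs' own outlines."""
    return [
        gs_executable,
        "-dBATCH",
        "-sDEVICE=pdfwrite",
        "-dNO_PDFMARK_OUTLINES",   # do not turn existing Outlines into pdfmarks
        "-o", output_pdf,          # final (complete) document
        *input_pdfs,               # documents to combine, in order
        pdfmarks_file,             # processed last, supplies the wanted bookmarks
    ]


def run_gs(command):
    """Run the command; requires Ghostscript to be installed and on PATH."""
    subprocess.run(command, check=True)


# Hypothetical file names, mirroring the layout used in the question.
command = build_gs_command("complete.pdf", ["1.pdf", "2.pdf"], "pdfmarks")
print(" ".join(command))
```

The pdfmarks file is kept as the last input, as in the original script, so that only its bookmarks should reach the output.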
I am trying to convert a scanned page to text using both pytesseract and tesseract command line on Ubuntu. The results are remarkably different (pytesseract performs way better than tesseract command line) and I am unable to understand why. I looked at the default values for the parameters and tried altering some of the parameter values in tesseract command line (like psm ) but I am unable to get the same result as pytesseract. Due to lack of proper documentation in pytesseract I am not able to figure out what default values for parameters are used.
Here is my pytesseract code
print(pytesseract.image_to_string(Image.open('test.tiff')))
Looking at the source code of pytesseract, it seems the image is always converted into a .bmp file.
Working with a .bmp file and a psm of 6 at the command line with Tesseract gives the same result as pytesseract.
Also, tesseract can work with uncompressed bmp files only. Hence, if ImageMagick is used to convert .pdf to .bmp, the following will work:
convert -density 300 -quality 100 mypdf.pdf BMP3:mypdf.bmp
tesseract mypdf.bmp -psm 6 mypdf txt
In Tesseract v5 3.0+
pytesseract does not convert images to BMP. You can verify this by commenting out cleanup(f.name) in the save context manager, which is found in the source file /pytesseract/pytesseract.py. You will also need to retrieve the filename of the temp file (pytesseract was saving files in the user's temp files directory, i.e. "[path-to-user]\AppData\Local[file-name]"). What pytesseract actually does can be found in the prepare function.
Basically, taking the temp file and using that same file with the tesseract command directly will yield the same results.
tl;dr question:
What is the actual algorithm doxygen uses to find images referenced in doxygen comments? And, as a corollary, what is considered best practice that won't break in future doxygen versions?
Details:
We're trying to institute a policy where any images associated with doxygen comments should be localized to the reference, which means we'll have images distributed throughout the source tree. Obviously, we need to make sure that we refer to the images appropriately and that doxygen can find them to produce the correct documentation.
The doxygen documentation states:
doxygen will look for files in the paths (or files) that you specified after the IMAGE_PATH tag
However, in my tinkering I've come to the conclusion that this doesn't seem strictly correct. Here are some experimental results:
================================================
Experiment
File system configuration:
/full/
path/
doxygen.cfg
to/
this/
header.h
images/
image.png
other/
images/
image.png
The doxygen config file is in the "root" of the tree (i.e., /full/path/) and doxygen is executed from this same folder. header.h references images/image.png located in the same tree (/full/path/to/this). There is an identically named image file located elsewhere in the tree. header.h has the line:
#file html [filename]
reference where [filename] is one of the following:
image.png
images/image.png
./images/image.png
/full/path/to/this/images/image.png
Then I play with the IMAGE_PATH variable.
Case 1: IMAGE_PATH = (i.e., no path defined).
"Wrong" image loaded (other/images/image.png)
no image
no image
Correct image loaded
Case 2: IMAGE_PATH = /full/path (path provided to root, but not full path to header file).
Correct image loaded
Correct image loaded
Correct image loaded
Correct image loaded
Case 3: IMAGE_PATH = /full/path/other (path provided to root which does not include the header file).
"Wrong" image loaded (other/images/image.png)
"Wrong" image loaded (other/images/image.png)
"Wrong" image loaded (other/images/image.png)
Correct image loaded
================================================
Inferred Algorithm Properties
Relative paths only work if the relative path is in the tree rooted in a path specified in IMAGE_PATH.
In the case where the image file name can resolve into different images, doxygen appears to pick the one "closest" to the reference.
First things first... thanks for posting this, I was starting to think I was missing something obvious. Now I know there are at least two of us...
I was attempting to include a picture in a markdown file; this may explain the slightly different results I got. Also, I tested with the \image command only. At first I only got a long series of "image not found" warnings, but eventually I reached some consistent positive results suggesting that:
The image is only found when the IMAGE_PATH setting points directly to the folder where the image is (not a parent folder). The manual hints at this by suggesting that IMAGE_PATH may be a collection of paths or files
IMAGE_PATH can be expressed as a full path or relative to the location where DOXYGEN is run
Moreover, in order for the image to be "found", the file name and path should match a part of the actual full name and path of the image
For instance, given a markdown page and an image in the following folders:
/some-path/work/my-page.md
/some-path/work/images/some/more/folders/the-image.png
In order to copy the page while running DOXYGEN in the "work" folder, IMAGE_PATH should be set as one of the following:
/some-path/work/images/some/more/folders/the-image.png
/some-path/work/images/some/more/folders
images/some/more/folders/the-image.png
images/some/more/folders
In all cases, the image can be successfully referenced in the markdown page as either "the-image.png" or "folders/the-image.png", "more/folders/the-image.png" etc. The criterion is that the reference matches part of the actual file path and name (one might expect the image reference to be relative to the markdown file it appears in, but that does not seem to be the case).
I say again, these tests were conducted using a markdown file, and it's possible that in this case the mechanics differ from those that apply to images referenced in source files.
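The behaviour observed in these tests can be written down as a small model. This is only a sketch of the inferred rules, not doxygen's actual implementation: the function find_image is hypothetical, and it assumes that only files directly inside each IMAGE_PATH entry are considered and that a reference matches when it equals the trailing components of the file's full path.

```python
import os
import tempfile


def find_image(reference, image_paths):
    """Return files whose full path ends with the components of `reference`.

    Each IMAGE_PATH entry may be a file or a folder; folders are not
    searched recursively (this mirrors the observation, not doxygen's code).
    """
    wanted = os.path.normpath(reference).split(os.sep)
    matches = []
    for entry in image_paths:
        entry = os.path.abspath(entry)
        if os.path.isfile(entry):
            candidates = [entry]
        elif os.path.isdir(entry):
            candidates = [os.path.join(entry, name) for name in sorted(os.listdir(entry))]
            candidates = [path for path in candidates if os.path.isfile(path)]
        else:
            candidates = []
        for path in candidates:
            if path.split(os.sep)[-len(wanted):] == wanted:
                matches.append(path)
    return matches


# Recreate the folder layout from the example above in a temporary directory.
with tempfile.TemporaryDirectory() as work:
    folder = os.path.join(work, "images", "some", "more", "folders")
    os.makedirs(folder)
    open(os.path.join(folder, "the-image.png"), "wb").close()

    # IMAGE_PATH points directly at the image's folder: found.
    found_direct = find_image("folders/the-image.png", [folder])
    # IMAGE_PATH points at a parent folder: not found.
    found_parent = find_image("the-image.png", [os.path.join(work, "images")])
    print(len(found_direct), len(found_parent))
```

Under these assumptions the model reproduces both findings: a partial path such as "folders/the-image.png" resolves, while a parent folder in IMAGE_PATH does not.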
In the HTML documentation, images were replaced with the broken image icon.
After two days of trial and error, I found that the cause was the text direction setting (Project->OUTPUT_TEXT_DIRECTION): it should be None, not LTR. The images were referenced in two ways:
![picture](my_picture.png)
#image html my_picture.png