Command-line tool for converting TTF/OTF fonts to SVG - command-line

Does anyone know of a command-line tool that will convert both TTF and OTF fonts to SVG fonts?

You can use fontforge or batik to do this from the commandline.
With fontforge (see scripting documentation):
#!/usr/bin/fontforge
Open($1)
Generate($1:r + ".svg")
Save the above to convert2svgfont.pe file, then invoke as:
convert2svgfont.pe myfont.ttf
For batik see this documentation, install and then invoke as:
java -jar batik-ttf2svg.jar myfont.ttf -o myfont.svg

The fontforge recipe given previously by #Erik no longer works - fontforge has switched to Python scripting.
Here's how I converted a font from PFA to SVG on the command line - this will also work fine for TTF, etc.:
fontforge -c 'import fontforge;fontforge.open("/usr/share/fonts/X11/Type1/NachlieliCLM-Bold.pfa").generate("NachlieliCLM-Bold.svg")'

The batik part of this answer is also out of date because batik gives you an svg output using the deprecated glyph element.
If you run the latest version of batik on the nasa.ttf for example
java -jar batik-ttf2svg-1.10.jar nasa.ttf -o myfont.svg
you get an output that looks something like this
<font horiz-adv-x="1045" ><font-face
font-family="Nasa"
units-per-em="2048"
panose-1="2 11 5 0 0 0 0 0 0 0"
ascent="1507"
descent="-393"
alphabetic="0" />
....followed by much more code representing every glyph in font
the way to deal with this is represented in the answer at Use SVG glyph tag in HTML - turn glyphs into symbols and flip them.
As far as why the fonts are flipped on their X axis refer to the superseded part of spec
https://www.w3.org/TR/SVG11/fonts.html#SVGFontsOverview
Unlike standard graphics in SVG, where the initial coordinate system has the y-axis pointing downward (see The initial coordinate system), the design grid for SVG fonts, along with the initial coordinate system for the glyphs, has the y-axis pointing upward for consistency with accepted industry practice for many popular font formats.

#Jay (I was unable to comment), on Window, this works :
"C:\Program Files (x86)\FontForgeBuilds\fontforge.bat" -lang py -c "import fontforge; fontforge.open('C:/Temp/myFont.ttf').generate('C:/Temp/myFont.svg')"

Related

How can a MATLAB program test whether MATLAB can render a particular font?

I would like to use some special characters in a MATLAB figure. How can my program ensure the fonts are available before using them?
listfonts() is not reliable. It claims "Zapf Dingbats" is available on my machine, but it is not (and text() renders using a default font instead). listfonts() always includes the standard PostScript fonts. I suppose that's because they are always available for PostScript output, but I'm interested in displayed figures. Likewise uisetfont() and MATLAB -> Preferences -> Fonts -> Custom list "Zapf Dingbats", but render the sample using a default font.
Just looking for the font file doesn't work, either. For example, "Webdings" works fine on my main machine. However, on a second machine, "Webdings" is installed (there's a file /Library/Fonts/Webdings.ttf, and Word can use it), but MATLAB substitutes a default font.
I thought of one test: Create a small figure with one marker symbol, use print() to write it to a .png file, read in the file as data, compute a hash, and compare that hash with a stored value. Is there a less clumsy method?
I found Unicode equivalents for most of the symbols I need, that work for both of my test machines. However, they too apparently depend on my having the right fonts installed. For example, there are many Unicode versions of a square. Hex codes 2588, 25a0, and 25fc work here, but 25fe, 2b1b, and 2bc0 are rendered as blank. Is there a way to tell whether these characters are available?
I'm running R2017b under macOS version 10.13.5, and "set | grep LANG" displays "LANG=en_US.UTF-8".

ghostscript not creating exact images

I am running below script to create images from postscript file, the images are coming but on first page watermark is not there.
gs -dUseCIEColor -dNOPAUSE -sDEVICE=jpeg -dFirstPage="1" -dLastPage=2 -sOutputFile=outputImage_%0d_A.gif -dJPEGQ=100 -r300 -q inputFile.ps -c quit;'
I am giving the link of ps file which i am using.
http://speedy.sh/Y7vWj/inputFile.ps
Can anybody please help!!!!
Thanks in advance...
OK you haven't stated what version of Ghostscript you are using, nor have you been very clear about what is missing. By 'watermak' do you mean the dark grey text 'PAULDAVIS' written diagonally across the very dark grey rectangle ?
If so then I can see that using the current version of Ghostscript and your command line, its not missing
A few observations on your command line:
-dUseCIEColor - Don't use this unless you know exactly what you are doing and why you want this, I'm guessing you don't (because you have not set any Color Rendering Dictionary). With this you get very dark grey text which is nearly invisible against the very dark grey rectangle. Not surprising since this relates to colour management.
You've set the device to jpeg, but you've set the output file to have a .gif extension.
You are using -dFirstPage and -dLastPage which have no effect when the input is not PDF (though this is added as a new feature in unreleased code).
You've set FirstPage=1 and LastPage=2 on a 2 page file.....
You have set -dFirstPage="1", which isn't going to work for any code which parses and uses it. The quotes won't work.
I'd recommend you do not set -q or -dQUIET when trying to diagnose problems, telling Ghostscript to be quiet will potentially mean you miss useful information.
-c quit; -c means 'process the next part of the command line as PostScript'. But quit; isn't valid PostScript (the semicolon should not be present) and will throw a PostScript error. If you want GS to exit after processing, consider simply using -dBATCH.

How to make fonts available to the LaTeX interpreter in Matlab R2013a?

It is possible to embed LaTeX-formatted text and equations into Matlab plots by setting the text property 'Interpreter' to the value 'latex', e.g.
text(0.1, 0.5, 'Einstein: $E = m c^2$', ...
'Interpreter', 'latex', 'FontSize', 32)
These equations appear on screen as well as in illustrations exported to eps files.
Through the appropriate LaTeX commands, it is also possible to change the font from the default Computer Modern Serif to e.g. Computer Modern Typewriter
text(0.1, 0.5, '\fontfamily{cmtt}\selectfont Einstein: $E = m c^2$', ...
'Interpreter', 'latex', 'FontSize', 32)
My question is: Is it possible to insert additional fonts into the Matlab installation, such that these fonts become available for use with 'Interpreter' 'latex', for rendering on screen as well as producing eps files? And if yes, how?
Background
(All paths relative to the Matlab installation, /opt/MATLAB/R2013a on my Linux system.)
Matlab includes a customized version of the (La)TeX interpreter. It is called via a frontend m-file called tex.m in toolbox/matlab/graphics which takes LaTeX code as an argument and returns dvi data within its output argument. The customized LaTeX installation is found in sys/tex and includes TeX font metric files under sys/tex/tfm.
I do not have any information on the parts of Matlab that render this dvi. However, font data for rendering are found under sys/fonts/ttf and sys/fonts/type1.
Making additional fonts usable therefore consists of two parts: Making it available for the LaTeX interpreter, and making it available for the rendering function. The first part can be tackled by manipulating tex.m, such that it generates the dvi through an independent regular installation of LaTeX, and installing the font to this LaTeX in the usual way (e.g. font packages). See undocumentedmatlab.
The second part of the question is therefore the crucial one: How to insert additional fonts into sys/fonts/ttf and sys/fonts/type1 such that they become usable by the dvi renderer component of Matlab.
Concrete case
I tried to concretely solve the second problem for a special case: The Computer Modern Sans font is included in the Matlab-LaTeX installation through tex/tfm/cmss10.tfm, but the corresponding ttf and pfb-files are missing from sys/fonts such that it does not get rendered.
Matlab's collection of ttf-files does not appear to have some kind of inventory. I therefore simply copied the file cmss10.ttf from an installation of matplotlib to sys/fonts/ttf/cm/mwa_cmss10.ttf, following the file and folder naming conventions of the other files present. This procedure was reported to be working on Alec's Web Log for Matlab 2011b on Max OS X, but on my system it has no effect, neither for screen display nor eps export.
Matlab's collection of type1 fonts has a complex inventory, distributed over files fonts.dir, fonts.scale, encodings.dir and a folder encodings full of enc-files. Again I found cmss10.pfb, this time from a TeXlive installation, renamed and copied it, and made entries in the inventory files following the example of the other fonts listed. Again, this procedure has no effect at all.
Does anyone know more about how Matlab uses ttf and pfb-files, and can give me a hint on how to make the cmss10-files accessible to Matlab rendering? Or does anyone have a suggestion how to debug this and find out more about the inner workings of Matlab's LaTeX support?
I invested hours of further research into my question, and came up with some interesting new insights, but no real solution. Still, I'm posting my results here in order for others who might investigate this to start from. I post it as an "answer" not make my already long question even longer.
Comparison between Matlab's old (R2010a) and current (R2013a) tex and fonts infrastructure
For the standard font Computer Modern Roman, the old infrastructure contains
sys/tex/tfm/cmr10.tfm
sys/fonts/ttf/cm/cmr10.ttf
sys/fonts/type1/cm/cmr10.pfb
sys/fonts/type1/cm/cmr10.pfm
and the current
sys/tex/tfm/cmr10.tfm
sys/fonts/ttf/cm/mwa_cmr10.ttf
sys/fonts/ttf/cm/mwb_cmr10.ttf
sys/fonts/type1/cm/mwa_cmr10.pfb
sys/fonts/type1/cm/mwb_cmr10.pfb
The TeX font metric files are identical. The truetype and type1 files appear to contain the same glyph data, but have been split into files containing latin (mwa) and greek characters (mwb). The pfm file has simply disappeared.The old type1 files have a copyright notice 1997 by the AMS, the new ones 2011 by the MW.
This indicates that in order to make Computer Modern Sans from an old Matlab work in current Matlab, it might be sufficient to copy cmss10.ttf and cmss10.pfb to mwa_cmss10.ttf and mwa_cmss10.pfb, since the tfm file is still present (see question).
Which files are used in R2013a?
The additional dir and enc files in sys/fonts/type1 appear not to be used, because deleting them leaves screen rendering and eps generation fully functional.
I suspected that the ttf files are used for screen rendering and the pfb files for inclusion in generated eps files. The former appears not to be the case, because deleting all ttf files leaves screen rendering and eps generation fully functional, too. Matlab does complain, however, if the folder sys/fonts/ttf/cm does not exist!
This indicates that a) it's not necessary to bother with modifying the dir and enc files, and b) it's not necessary to copy the ttf file.
Is inserting new pfb files enough?
After cmss10.pfb from an old Matlab is copied to sys/fonts/type1/cm/mwa_cmss10.pfb, using Computer Modern Sans in an equation still makes Matlab warn that "cmss10 is not supported", and the screen rendering is not correct. Moreover, a generated eps file does not render correctly.
However, the generated eps file does include the contents of mwa_cmss10.pfb and the reason it doesn't work is that the included pfb file defines a font named "CMSS10", while the eps refers to a font named "mwa_cmss10". Instead of #Daniel E. Shub's solution to change the references in the eps, one can edit the file mwa_cmss10.pfb and change its \FontName to "mwa_cmss10". This might be done with a simple text editor applied to the pfb. However, the better way is to disassemble the pfb file to PostScript using t1disasm, change the PostScript, and then reassemble using t1asm. These tools are contained in the t1utils package on CTAN.
The resulting eps does still not work properly though: Characters are not correctly positioned, especially for larger font sizes.
This indicates that the presence of the pfb file alone does not provide Matlab with the correct font metrics, and that the dvi file generated by Matlab's LaTeX does not explicitly position characters but relies on the renderer having those metrics.
See tex.se for a question concerning a workaround for the second point.
Does "hacking" existing fonts work?
Daniel E. Shub proposed in his answer not to add fonts, but to overwrite those existing in the Matlab installation. There are two problems with this:
– The correct font metrics are still not available to Matlab. Overwriting a font therefore only works, and only approximately, if the metrics of the original font and those of the new one are similar.
Example:
– Screen rendering only works in some cases. For me, overwriting mwa_cmr10 with a patched cmss10 and using \rm did lead to Computer Modern Sans being rendered to screen and in the eps file, albeit with slightly wrong positioning. However, overwriting mwa_cmtt10 and using \tt did not lead to Computer Modern Sans being rendered on screen; instead, Computer Modern Typewriter was rendered.
This implies a) that there is another independent source of font metrics for Matlab's renderer. As far as I can tell, they come from none of the files under sys/tex or sys/fonts. b) Font outlines are only in some cases read from the pfb files in sys/fonts/type1/cm.
Conclusion
The inner workings of the dvi renderer in recent Matlab therefore remain mysterious. Possible candidates where the missing information may be hidden are toolbox/matlab/graphics/hardcopy.p and / or com/mathworks/hg/uij/TextRasterizer.class in java/jar/hg.jar.
I'll cease my investigations for the time being (and going to have a look at psfrag ;)
I made the comment on Undocumented Matlab that you refer to. Apparently, I grossly underestimated the difficulty of making the Matlab DVI viewer work with fonts. I have included a non-working solution in the hope that someone can understand the warning it generates. I also have a working solution that is a pretty big hack. I am using Matlab R2013a and TexLive 2013 on Linux. I am not sure what will happen on Mac or Windows.
Non working solution
My first approach was to overload the Matlab tex.m function so I can easily do things in LaTeX and only have to worry about the dvi file
function [dviout,errout,auxout] = tex(varargin)
fid = fopen('matlab.dvi');
dviout = fread(fid, 'uint8');
dviout = uint8(dviout);
fclose(fid);
errout = [];
auxout = [];
end
I then created matlab.dvi by processing
\documentclass{article}
\setlength\topmargin{-0.5in}
\setlength\oddsidemargin{0in}
\DeclareFontFamily{T1}{myfont}{}
\DeclareFontShape{T1}{myfont}{m}{n}{<-> [1.2] AuriocusKalligraphicus}{}
\begin{document}%
\setbox0=\hbox{\usefont{T1}{myfont}{m}{n}Some text with a distinct font $\alpha$}%
\copy0\special{bounds: \the\wd0 \the\ht0 \the\dp0}%
\end{document}%
I then copied the TexLive font to Matlab
# cp $TEXLIVEROOT/texmf-dist/fonts/type1/public/aurical/AuriocusKalligraphicus.pfb $MATLABROOT/sys/fonts/AuriocusKalligraphicus.pfb
I get the "expected" warnings from
>> text(0.0, 0.5, 'DOES NOT MATTER', 'Interpreter', 'LaTeX', 'FontSize', 20)
Warning: Font AuriocusKalligraphicus10 is not supported.
Warning: Font AuriocusKalligraphicus10 is not supported.
If I try and export the figure (with the missing fonts) to a pdf file via alt+f alt+r I get a whole bunch of warnings including the potentially useful
Warning: Missing
/usr/local/matlab/R2013a/sys/fonts/type1/cm/mwa_auriocuskalligraphicus10.pfb
Working hack solutiuon
After becoming feed up with not knowing what to call the pfb files, I decided to overwrite one that already works (cmr10).
At the CLI
# cp $MATLABROOT/sys/fonts/mwa_cmr10.pfb $MATLABROOT/sys/fonts/mwa_cmr10.pfb.bak
# cp $TEXLIVEROOT/texmf-dist/fonts/type1/public/aurical/AuriocusKalligraphicus.pfb $MATLABROOT/sys/fonts/mwa_cmr10.pfb
and at the Matlab prompt
>> text(0.0, 0.5, 'Some text with a distinct font $\alpha$', 'Interpreter', 'LaTeX', 'FontSize', 20)
gives me
.
In order to export the figure to an eps with the fonts you need to replace all the instances of /mwa_cmr10 with /AuriocusKalligraphicus in the eps file. Presumably this is because this solution is a hack. Ideally I should not only replace the pfb file, but also the fd and tfm files. There are probably enough pfb fonts available to allow you to create most figures.
This is a very crude solution, but you may edit the resulting .eps file using a text editor and get the desired fonts. For example you can replace following:
%%IncludeResource: font mwa_cmr10 /mwa_cmr10 /WindowsLatin1Encoding
120 FMSR
with following:
%%IncludeResource: font Helvetica /Helvetica /WindowsLatin1Encoding
120 FMSR
You may even write a simple script which would open the resulting .eps file and replace any font with anyone you desire. I hope this helps!

How can I import .eps in Mathematica without losing the text?

I am trying to import a histogram produced by Stata as an .eps file into Mathematica, but it does not display axes' labels. That is, for some reason, Mathematica does not import .eps as but rather transforms it.
How can I avoid that? As of now, I am using plain
Import["~/hst.eps"]
I had a similar problem with LateX and a pdf graph recently, which also manifested itself with the eps version. I wound up modifying the user-written graphexportpdf and things seemed to work out. Perhaps you will find this solution helpful.
Mathematica currently (v.9) usually cannot import textual elements from .eps files properly. One possible solution would be to export your plot as .pdf rather than .eps from you plotting software and then Import generated .pdf. If you need to import the text as text you can turn off outlining by using the "TextOutlines" -> False option.
If exporting to .pdf from your plotting software is not possible you can convert .eps to .pdf by another program and then Import generated .pdf as above.
For your file orders_weekly.eps using the gsEPS2PDFEmbedFonts function and then Importing generated .pdf, I get (Mathematica 8.0.4):
gsEPS2PDFEmbedFonts["orders_weekly.eps", "orders_weekly.pdf"]
graphics=First#Import["orders_weekly.pdf"(*,"TextOutlines"->False*)];
Show[graphics, Frame -> True, PlotRange -> All, ImageSize -> Automatic]
I have commented the "TextOutlines" -> False option because Mathematica 8.0.4 still imports rotated text incorrectly with it. I haven't tested it with v.9 though.
Another possibility is to convert all the glypth in the .eps file to oultines. Based on Jens Nöckel's code,
gsEPS2outlinedEPS[epsPath_String, outlinedEPSPath_String] :=
Run["gswin64c.exe -sDEVICE=epswrite -dNOCACHE -sOutputFile=\"" <>
outlinedEPSPath <> "\" -q -dbatch -dNOPAUSE \"" <> epsPath <>
"\" -c quit"]
Here gswin64c.exe is the name of GhostScript executable for 64bit Windows systems, for Linux replace it with gs.
Note that you should have GhostScript installed and configured (for Windows you should add gs\bin and gs\lib to the PATH, where gs is the top-level Ghostscript directory)

How to find parameters supported in Tesseract OCR config file

I want to know what parameters the config file used by Tesseract OCR accepts, how to write a config file, etc.
I can't find any documentation about this on their site. How can I determine what parameters are supported, and what they mean?
I found these instructions in the link below. They are about writing the config file and where to place it:
config file is simple text file without BOM and with Unix end-of-line mark (on Windows you can use some advanced text editor e.g. Notepad++ to achieve this).
If you use tesseract executable this is only way how to change tesseract parameters.
config file should be located in your tessdata/configs directory. Have a look there for some examples.
There is a list of all the variables plus descriptions of each one in http://www.sk-spell.sk.cx/tesseract-ocr-parameters-in-302-version. Note it's for Tesseract 3.02, things may be different in other versions.
Edit: Also adding a pastebin link in case the above link becomes dead.
Tesseract v3.04 now offers the command line option --print-parameters, so you can call tesseract --print-parameters to get a list of the 678 (!) configurable parameters, their default values, and a short description:
Tesseract parameters:
editor_image_xpos 590 Editor image X Pos
editor_image_ypos 10 Editor image Y Pos
editor_image_menuheight 50 Add to image height for menu bar
editor_image_word_bb_color 7 Word bounding box colour
editor_image_blob_bb_color 4 Blob bounding box colour
editor_image_text_color 2 Correct text colour
...and many, many more
It's just a plain text file containing space-delimited key/value pairs for Tesseract config variables, each on separate line; for instance:
interactive_display_mode T
tessedit_display_outwords T
There are several standard config files -- such as digits, hocr -- under Tesseract tessdata/configs folder.