How to export graphics in MATLAB with exportdlg.m to SVG with embedded fonts? - matlab

Hello last_hope_community, ;)
I have tried several export options. The standard export to SVG works well:
load patients
figure
tbl = table(LastName,Age,Gender,SelfAssessedHealthStatus,...
Smoker,Weight,Location);
h = heatmap(tbl,'Smoker','SelfAssessedHealthStatus');
saveas(gcf,["test_heatp_save_svg.svg"],'svg') %no export style-> fonts gets embedded
%print(gcf,["test_heatp_print_pdf"],'-dpdf','-bestfit','-r0')
Both saveas and print work, but here is my first problem.
With setprinttemplate one cannot adopt the styles from the export setup (hgexport and so on). print has the very nice -bestfit option, which can be used to print to PDF (with print, fonts are embedded automatically); the PDF can then be imported into Inkscape and saved as SVG. But when export styles are applied and the figure is then saved manually as SVG, either from exportdlg or with saveas, all text is converted to vector paths.
Is there a way to apply export styles automatically and then get an SVG file in which the fonts are embedded, as with the code above?
I have already checked whether this is just a heatmap issue, and it is not. The same happens with the following code: without an export style everything is fine, but with an export style all text is converted to vector paths.
figure
plot(1:10000)
title('Hallo')
saveas(gcf,["test_simple_save_svg.svg"],'svg')
%print(gcf,["test_simple_print_pdf"],'-dpdf','-bestfit','-r0')
Ideally, someone knows a way to apply export styles programmatically and then save as SVG with embedded fonts, or to apply export styles programmatically and then print to PDF with -bestfit. Exporting as PDF embeds only the titles and not the tick labels.
Additionally, print with -bestfit is not always good, because the result sometimes deviates a lot from what the figure window shows.
Thanks for any help. It is annoying that exporting to SVG turns text into paths that cannot be edited in other programs.

Related

Perl PDF::API2 -- trouble incorporating watermark/letterhead as background image

Having trouble diagnosing an issue with PDF::API2, where I have an existing PDF to bring in as a background/watermark/letterhead, and then I programmatically overlay text on top of that.
This is the code that brings in the background, before any text is overlaid on top:
my $xo = $pdf->importPageIntoForm( $bg_pdf, 1 );
my $gfx = $page->gfx();
$gfx->formimage( $xo, 0, 0, 1.0 );
This code runs fine, no errors -- and the $bg_pdf is a simple PDF v1.4 saved straight from Adobe Illustrator. But after some text is added by the script, the output is partially corrupted. That is, the imported letterhead PDF shows up to a point and then the rest is skipped/ignored until the overlaid text displays. For example, the letterhead includes several rectangles, and only a scant few show up in the generated PDF.
So it appears that upon display, the data for the existing background-pdf gets corrupted somehow, but the programmatic text overlaid on top, is just fine.
Early on I learned to turn off $pdf->{forcecompress} = 0; to keep the whole resulting PDF from being weirdly and randomly scrambled. So that shouldn't be pertinent here.
I've tried exporting the Illustrator PDF output in a number of different formats (v1.3, 1.4, 1.5); with or without 'preserve illustrator editing capabilities'; compression turned on or off; with including embedded fonts like garamond, vs using helvetica only, vs converting all text to outlines...
How does one go about troubleshooting this? Using PDF::API2 v2.038, in perl 5.30.0

iText PDFWrite merges the PDF layers, so the generated PDF also shows the hidden layer as a spot

Using the iText PDFWrite class, we render inputFile into outputFile. Because inputFile has multiple layers, the output PDF (outputFile) shows the outlines of its internal layers. PDFWrite merges the PDF layers while rendering, and we want to avoid that: we want to render only the visible (top) layers. We use PDFWrite instead of PDFCopy because we perform matrix operations (move, rotate, scale, etc.) on inputFile.
Files:
Layered Image
Input file
Output file
A bit late but as far as I understand the question, you want to ensure that only visible layers are rendered during processing because your file has some invisible or hidden layers which re-appear. Since iText support will not help you, there is another (free/GPL License) way.
The first step is to 'burn in' the state of the layers and remove the optional content while honoring the visibility state, using another processor. Once that step is done you can continue with whatever version of iText you want.
There are two processing options that can 'burn in' the visibility state of Optional Content Groups (= Layers = OCG):
Option 1
Use Ghostscript (www.ghostscript.com) with -sDEVICE=pdfwrite on the command line. See the GS documentation for the full command line.
Option 2
Use Poppler's pdftocairo.exe with the -pdf option.
Both will create a PDF that has no interactive layers, with OCG removed. All iText operations will work as normal on that file.
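The two options can be scripted as a preprocessing step before the file is handed to iText. This is a minimal sketch, not a tested recipe: the helper names `flatten_commands` and `flatten_layers` are made up for illustration, the executable names `gs` and `pdftocairo` are assumptions (on Windows Ghostscript is usually installed as `gswin64c`), and whether hidden layers are really dropped should be checked on your own file.

```python
import shutil
import subprocess


def flatten_commands(src, dst):
    """Return the Ghostscript and pdftocairo command lines for src -> dst."""
    # -o sets the output file and implies batch mode without pausing.
    gs = ["gs", "-sDEVICE=pdfwrite", "-o", dst, src]
    # pdftocairo -pdf re-renders the input as a new PDF.
    cairo = ["pdftocairo", "-pdf", src, dst]
    return gs, cairo


def flatten_layers(src, dst):
    """Run the first tool found on PATH; return the command used, or None."""
    for cmd in flatten_commands(src, dst):
        if shutil.which(cmd[0]):
            subprocess.run(cmd, check=True)
            return cmd
    return None
```

Calling `flatten_layers("input.pdf", "flat.pdf")` would then produce the file that the iText step reads instead of the original.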

How to make fonts available to the LaTeX interpreter in Matlab R2013a?

It is possible to embed LaTeX-formatted text and equations into Matlab plots by setting the text property 'Interpreter' to the value 'latex', e.g.
text(0.1, 0.5, 'Einstein: $E = m c^2$', ...
'Interpreter', 'latex', 'FontSize', 32)
These equations appear on screen as well as in illustrations exported to eps files.
Through the appropriate LaTeX commands, it is also possible to change the font from the default Computer Modern Serif to e.g. Computer Modern Typewriter
text(0.1, 0.5, '\fontfamily{cmtt}\selectfont Einstein: $E = m c^2$', ...
'Interpreter', 'latex', 'FontSize', 32)
My question is: Is it possible to insert additional fonts into the Matlab installation, such that these fonts become available for use with 'Interpreter' 'latex', for rendering on screen as well as producing eps files? And if yes, how?
Background
(All paths relative to the Matlab installation, /opt/MATLAB/R2013a on my Linux system.)
Matlab includes a customized version of the (La)TeX interpreter. It is called via a frontend m-file called tex.m in toolbox/matlab/graphics which takes LaTeX code as an argument and returns dvi data within its output argument. The customized LaTeX installation is found in sys/tex and includes TeX font metric files under sys/tex/tfm.
I do not have any information on the parts of Matlab that render this dvi. However, font data for rendering are found under sys/fonts/ttf and sys/fonts/type1.
Making additional fonts usable therefore consists of two parts: Making it available for the LaTeX interpreter, and making it available for the rendering function. The first part can be tackled by manipulating tex.m, such that it generates the dvi through an independent regular installation of LaTeX, and installing the font to this LaTeX in the usual way (e.g. font packages). See undocumentedmatlab.
The second part of the question is therefore the crucial one: How to insert additional fonts into sys/fonts/ttf and sys/fonts/type1 such that they become usable by the dvi renderer component of Matlab.
Concrete case
I tried to concretely solve the second problem for a special case: The Computer Modern Sans font is included in the Matlab-LaTeX installation through tex/tfm/cmss10.tfm, but the corresponding ttf and pfb-files are missing from sys/fonts such that it does not get rendered.
Matlab's collection of ttf-files does not appear to have any kind of inventory. I therefore simply copied the file cmss10.ttf from an installation of matplotlib to sys/fonts/ttf/cm/mwa_cmss10.ttf, following the file and folder naming conventions of the other files present. This procedure was reported to be working on Alec's Web Log for Matlab 2011b on Mac OS X, but on my system it has no effect, neither for screen display nor eps export.
Matlab's collection of type1 fonts has a complex inventory, distributed over files fonts.dir, fonts.scale, encodings.dir and a folder encodings full of enc-files. Again I found cmss10.pfb, this time from a TeXlive installation, renamed and copied it, and made entries in the inventory files following the example of the other fonts listed. Again, this procedure has no effect at all.
Does anyone know more about how Matlab uses ttf and pfb-files, and can give me a hint on how to make the cmss10-files accessible to Matlab rendering? Or does anyone have a suggestion how to debug this and find out more about the inner workings of Matlab's LaTeX support?
I invested hours of further research into my question, and came up with some interesting new insights, but no real solution. Still, I'm posting my results here so that others who investigate this have something to start from. I post it as an "answer" so as not to make my already long question even longer.
Comparison between Matlab's old (R2010a) and current (R2013a) tex and fonts infrastructure
For the standard font Computer Modern Roman, the old infrastructure contains
sys/tex/tfm/cmr10.tfm
sys/fonts/ttf/cm/cmr10.ttf
sys/fonts/type1/cm/cmr10.pfb
sys/fonts/type1/cm/cmr10.pfm
and the current
sys/tex/tfm/cmr10.tfm
sys/fonts/ttf/cm/mwa_cmr10.ttf
sys/fonts/ttf/cm/mwb_cmr10.ttf
sys/fonts/type1/cm/mwa_cmr10.pfb
sys/fonts/type1/cm/mwb_cmr10.pfb
The TeX font metric files are identical. The truetype and type1 files appear to contain the same glyph data, but have been split into files containing latin (mwa) and greek characters (mwb). The pfm file has simply disappeared. The old type1 files have a copyright notice 1997 by the AMS, the new ones 2011 by the MW.
This indicates that in order to make Computer Modern Sans from an old Matlab work in current Matlab, it might be sufficient to copy cmss10.ttf and cmss10.pfb to mwa_cmss10.ttf and mwa_cmss10.pfb, since the tfm file is still present (see question).
Which files are used in R2013a?
The additional dir and enc files in sys/fonts/type1 appear not to be used, because deleting them leaves screen rendering and eps generation fully functional.
I suspected that the ttf files are used for screen rendering and the pfb files for inclusion in generated eps files. The former appears not to be the case, because deleting all ttf files leaves screen rendering and eps generation fully functional, too. Matlab does complain, however, if the folder sys/fonts/ttf/cm does not exist!
This indicates that a) it's not necessary to bother with modifying the dir and enc files, and b) it's not necessary to copy the ttf file.
Is inserting new pfb files enough?
After cmss10.pfb from an old Matlab is copied to sys/fonts/type1/cm/mwa_cmss10.pfb, using Computer Modern Sans in an equation still makes Matlab warn that "cmss10 is not supported", and the screen rendering is not correct. Moreover, a generated eps file does not render correctly.
However, the generated eps file does include the contents of mwa_cmss10.pfb, and the reason it doesn't work is that the included pfb file defines a font named "CMSS10", while the eps refers to a font named "mwa_cmss10". Instead of @Daniel E. Shub's solution of changing the references in the eps, one can edit the file mwa_cmss10.pfb and change its /FontName to "mwa_cmss10". This might be done with a simple text editor applied to the pfb. However, the better way is to disassemble the pfb file to PostScript using t1disasm, change the PostScript, and then reassemble using t1asm. These tools are contained in the t1utils package on CTAN.
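Before patching, it can help to confirm which name a pfb file actually declares. The following is a minimal sketch, assuming the cleartext part of the font contains a standard `/FontName /Name def` entry; `pfb_font_name` is a hypothetical helper and not part of t1utils or Matlab.

```python
import re


def pfb_font_name(data: bytes):
    """Return the name declared by /FontName in a Type 1 font, or None."""
    # The font dictionary at the start of a .pfb is plain PostScript text,
    # so the entry can be found without decrypting anything.
    match = re.search(rb"/FontName\s*/([^\s/{}\[\]()<>%]+)", data)
    return match.group(1).decode("ascii") if match else None


def pfb_font_name_from_file(path):
    """Read a font file and report its declared /FontName."""
    with open(path, "rb") as handle:
        return pfb_font_name(handle.read())
```

As far as I know, a pfb also stores the byte length of each segment in a small binary header, so changing the name to one of a different length in a text editor leaves that length stale. That is one more reason to prefer the t1disasm/t1asm round trip.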
The resulting eps does still not work properly though: Characters are not correctly positioned, especially for larger font sizes.
This indicates that the presence of the pfb file alone does not provide Matlab with the correct font metrics, and that the dvi file generated by Matlab's LaTeX does not explicitly position characters but relies on the renderer having those metrics.
See tex.se for a question concerning a workaround for the second point.
Does "hacking" existing fonts work?
Daniel E. Shub proposed in his answer not to add fonts, but to overwrite those existing in the Matlab installation. There are two problems with this:
– The correct font metrics are still not available to Matlab. Overwriting a font therefore only works, and only approximately, if the metrics of the original font and those of the new one are similar.
– Screen rendering only works in some cases. For me, overwriting mwa_cmr10 with a patched cmss10 and using \rm did lead to Computer Modern Sans being rendered to screen and in the eps file, albeit with slightly wrong positioning. However, overwriting mwa_cmtt10 and using \tt did not lead to Computer Modern Sans being rendered on screen; instead, Computer Modern Typewriter was rendered.
This implies a) that there is another independent source of font metrics for Matlab's renderer. As far as I can tell, they come from none of the files under sys/tex or sys/fonts. b) Font outlines are only in some cases read from the pfb files in sys/fonts/type1/cm.
Conclusion
The inner workings of the dvi renderer in recent Matlab therefore remain mysterious. Possible candidates where the missing information may be hidden are toolbox/matlab/graphics/hardcopy.p and / or com/mathworks/hg/uij/TextRasterizer.class in java/jar/hg.jar.
I'll cease my investigations for the time being (and am going to have a look at psfrag ;)
I made the comment on Undocumented Matlab that you refer to. Apparently, I grossly underestimated the difficulty of making the Matlab DVI viewer work with fonts. I have included a non-working solution in the hope that someone can understand the warning it generates. I also have a working solution that is a pretty big hack. I am using Matlab R2013a and TexLive 2013 on Linux. I am not sure what will happen on Mac or Windows.
Non working solution
My first approach was to overload the Matlab tex.m function so I can easily do things in LaTeX and only have to worry about the dvi file
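function [dviout,errout,auxout] = tex(varargin)
% Shadows Matlab's tex.m: ignores its input and returns a precompiled dvi file.
fid = fopen('matlab.dvi');
dviout = fread(fid, 'uint8');   % raw dvi bytes, read as doubles
dviout = uint8(dviout);         % convert to uint8
fclose(fid);
errout = [];
auxout = [];
end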
I then created matlab.dvi by processing
\documentclass{article}
\setlength\topmargin{-0.5in}
\setlength\oddsidemargin{0in}
\DeclareFontFamily{T1}{myfont}{}
\DeclareFontShape{T1}{myfont}{m}{n}{<-> [1.2] AuriocusKalligraphicus}{}
\begin{document}%
\setbox0=\hbox{\usefont{T1}{myfont}{m}{n}Some text with a distinct font $\alpha$}%
\copy0\special{bounds: \the\wd0 \the\ht0 \the\dp0}%
\end{document}%
I then copied the TexLive font to Matlab
# cp $TEXLIVEROOT/texmf-dist/fonts/type1/public/aurical/AuriocusKalligraphicus.pfb $MATLABROOT/sys/fonts/AuriocusKalligraphicus.pfb
I get the "expected" warnings from
>> text(0.0, 0.5, 'DOES NOT MATTER', 'Interpreter', 'LaTeX', 'FontSize', 20)
Warning: Font AuriocusKalligraphicus10 is not supported.
Warning: Font AuriocusKalligraphicus10 is not supported.
If I try and export the figure (with the missing fonts) to a pdf file via alt+f alt+r I get a whole bunch of warnings including the potentially useful
Warning: Missing
/usr/local/matlab/R2013a/sys/fonts/type1/cm/mwa_auriocuskalligraphicus10.pfb
Working hack solution
After becoming fed up with not knowing what to call the pfb files, I decided to overwrite one that already works (cmr10).
At the CLI
# cp $MATLABROOT/sys/fonts/mwa_cmr10.pfb $MATLABROOT/sys/fonts/mwa_cmr10.pfb.bak
# cp $TEXLIVEROOT/texmf-dist/fonts/type1/public/aurical/AuriocusKalligraphicus.pfb $MATLABROOT/sys/fonts/mwa_cmr10.pfb
and at the Matlab prompt
>> text(0.0, 0.5, 'Some text with a distinct font $\alpha$', 'Interpreter', 'LaTeX', 'FontSize', 20)
gives me the text rendered in the replacement font.
In order to export the figure to an eps with the fonts you need to replace all the instances of /mwa_cmr10 with /AuriocusKalligraphicus in the eps file. Presumably this is because this solution is a hack. Ideally I should not only replace the pfb file, but also the fd and tfm files. There are probably enough pfb fonts available to allow you to create most figures.
This is a very crude solution, but you may edit the resulting .eps file using a text editor and get the desired fonts. For example, you can replace the following:
%%IncludeResource: font mwa_cmr10 /mwa_cmr10 /WindowsLatin1Encoding
120 FMSR
with the following:
%%IncludeResource: font Helvetica /Helvetica /WindowsLatin1Encoding
120 FMSR
You may even write a simple script which would open the resulting .eps file and replace any font with anyone you desire. I hope this helps!
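Such a script could look like the sketch below. It is only an illustration under stated assumptions: `replace_eps_font` and `rewrite_eps` are made-up helper names, the font names are the ones from the example above, and the replacement is a plain text substitution that has not been checked against eps files containing an embedded copy of the font.

```python
import re


def replace_eps_font(eps_text, old, new):
    """Replace the font name `old` with `new` wherever it is a whole name."""
    # The guards keep e.g. mwa_cmr10 from matching inside mwa_cmr100.
    pattern = re.compile(
        r"(?<![A-Za-z0-9_\-])" + re.escape(old) + r"(?![A-Za-z0-9_\-])"
    )
    return pattern.sub(new, eps_text)


def rewrite_eps(path_in, path_out, old, new):
    """Read an eps file, swap the font name, and write the result."""
    # latin-1 maps every byte to one character, so the file round-trips
    # unchanged apart from the replaced names; newline="" keeps line endings.
    with open(path_in, encoding="latin-1", newline="") as handle:
        text = handle.read()
    with open(path_out, "w", encoding="latin-1", newline="") as handle:
        handle.write(replace_eps_font(text, old, new))
```

For example, `rewrite_eps("figure.eps", "figure_helvetica.eps", "mwa_cmr10", "Helvetica")` would apply the substitution shown above to every occurrence in the file.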

Is there an option to control output page orientation (using knitr->pander->pandoc->docx)

I am playing with Tal's intro to producing Word tables with as little overhead as possible in real-world situations. (Please see the reproducible examples there. Thanks, Tal!) In real applications, tables are often too wide to print on a portrait-oriented page, but you might not want to split them.
Sorry if I have overlooked this in the pandoc or pander documentation, but how do I control page orientation (portrait/landscape) when writing from R to a Word .docx file?
I should maybe add that I started using knitr+markdown, and I am not yet familiar with LaTeX syntax. But I'm trying to pick up as much as possible while getting my stuff done.
I am pretty sure the docx writer has no section breaks implemented. Also, as far as I understand, --reference-docx allows for customizing styles but not the page layout (but I might be wrong here). This is from pandoc's guide on --reference-docx:
--reference-docx=FILE
Use the specified file as a style reference in producing a docx file.
For best results, the reference docx should be a modified version of a
docx file produced using pandoc. The contents of the reference docx
are ignored, but its stylesheets are used in the new docx. If no
reference docx is specified on the command line, pandoc will look for
a file reference.docx in the user data directory (see --data-dir). If
this is not found either, sensible defaults will be used. The
following styles are used by pandoc: [paragraph] Normal, Title,
Authors, Date, Heading 1, Heading 2, Heading 3, Heading 4, Heading 5,
Block Quote, Definition Term, Definition, Body Text, Table Caption,
Image Caption; [character] Default Paragraph Font, Body Text Char,
Verbatim Char, Footnote Ref, Link.
These styles are saved in the /word/styles.xml component of the docx document.
The page layout on the other hand is saved in the /word/document.xml component in the <w:sectPr> tag, but pandoc's docx writer ignores this part as far as I can tell.
The docx writer builds by default a continuous document, with elements such as headers, paragraphs, simple tables and so on ... much like a html output.
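To make the <w:sectPr> part concrete, here is a small sketch that builds the kind of element Word uses for a landscape section. It only generates the XML fragment and does not touch pandoc or a docx file; the helper name `landscape_sect_pr` is made up, and the default page size (US Letter turned sideways, in twentieths of a point) is an assumption you would adapt.

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by /word/document.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def landscape_sect_pr(width_twips=15840, height_twips=12240):
    """Return a <w:sectPr> fragment describing a landscape page."""
    ET.register_namespace("w", W)
    sect = ET.Element(f"{{{W}}}sectPr")
    ET.SubElement(
        sect,
        f"{{{W}}}pgSz",
        {
            f"{{{W}}}w": str(width_twips),   # page width
            f"{{{W}}}h": str(height_twips),  # page height
            f"{{{W}}}orient": "landscape",
        },
    )
    return ET.tostring(sect, encoding="unicode")


print(landscape_sect_pr())
```

A fragment like this is what a section break has to carry for the pages before it to come out in landscape.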
Option #1 (doesn't solve the page orientation problem):
The only page layout option that you can define through styles is pageBreakBefore, which adds a page break before a certain style.
Option #2 (seems elegant but hasn't been tested):
Recently a custom writer was added that accepts a custom Lua script, in which you should be able to define how certain pandoc blocks are written into the output file. This means you could potentially define section breaks and page layout for a specific block by inserting the sectPr tag into the document. I haven't tried this out, but it would be worth investigating. On pandoc's GitHub you can check out a sample Lua script for custom HTML output.
However, this means you have to have Lua installed and learn the language, and it is up to you whether you think it's worth the time investment.
Option #3 (a couple of clicks in Word might just do):
As you will probably spend quite some time setting up how to insert sections, working out the right size and margins, and figuring out how to fit the table to such a layout, I recommend that you use pandoc to write your document.docx, open it in Word, and do the layout by hand:
select the table you want on the landscape page
go to Layout > Margins
> select Apply to: Selected text
> choose Page Setup > select Landscape
Now a new section with a landscape orientation should surround your table.
You will probably also want to style the table and table caption a little (font size, ...) to achieve the best result (all text styling can already be applied with pandoc, where --reference-docx comes in handy).
Option #4 (in situation when you can just use pdf instead of docx):
As far as I can tell, pandoc does a good job with tables in md -> docx (alignment, style, ...); in tex -> docx it sometimes had trouble. However, if PDF output is an option for you, LaTeX will be your greatest friend. For example, your problem is solved as easily as using
\usepackage{pdflscape}
and adding this around your table
\begin{landscape}
...
\end{landscape}
These are the options that I could think of so far.
I would always recommend using the PDF format for reports, as you can style it to your liking with LaTeX and the layout will stay the way you want it to be.
However, I also know that for various reasons Word documents are still the main way of reviewing manuscripts in many fields, so I would most likely just go with my suggested option 3, mostly because it is a lazy and quick solution and because I usually don't have many documents with tons of giant tables with awkward placement and styling.
Good luck ;-)
Based on Taleb's answer here and some officer package functions, I created a little gist that one can use like this:
---
title: "Example"
author: "Dan Chaltiel"
output:
  word_document:
    pandoc_args:
      '--lua-filter=page-break.lua'
---
I'm in portrait
\endLandscape
I'm in landscape
\endPortrait
I'm in portrait again
With page-breaks.lua being the file hosted here: https://gist.github.com/DanChaltiel/e7505e62341093cfdc489265963b6c8f
This is far from perfect (for instance it won't work without the last portrait section), but it is quite useful sometimes.

PDF output from MATLAB and inclusion in LaTeX

I'm printing some figures in MATLAB in PDF form, and can view them fine with the Evince PDF viewer on Fedora 16.
When I try to include them in LaTeX (TeXLive 2011), however, I get an error
!pdfTeX error: /usr/local/texlive/2011/bin/x86_64-linux/pdflatex (file ./caroti
d_amp_mod_log.pdf): xpdf: reading PDF image failed
However, I can take an example PDF image generated in Mathematica and include it just fine, which tells me that the problem is with the PDF's generated by MATLAB and not with PDF's in general.
Might it have something to do with the set(0,'defaultfigurepaperpositionmode','auto') I put in my startup.m file so that pages would auto-fit the images?
EDIT: I just tried using saveas(figure(1), 'filename.pdf') instead of print(figure(1), 'filename.pdf') and it worked fine, but the PaperPositionMode property is ignored. Any way around this?
Finally found the problem. The correct way to print images is to use the print(handle, '-dformat', 'filename') syntax.
So, for PDF's, we need print(figure(1), '-dpdf', 'myfigure'). See MATLAB documentation on graphics file formats for more information.
Using print(figure(1), 'filename.pdf') still produces a valid PDF for viewing, but it can't be included in LaTeX.
You can try using
pdfpages
or
pgf
to include pdf files. However, you need to use pdflatex only, as you are doing right now.