Can a CSS file be used to generate PDF output? - dita

Does DITA-OT support generation of PDF output with CSS customizations? I think it supports PDF generation using Apache FOP.
I generate both HTML and PDF output and want to use CSS.
Thanks...

The DITA Open Toolkit does not come with default support for using CSS to create PDF. But it can be done. Here is general info on a few ways to do it, to give you an idea:
If you have a recent version of the Oxygen XML editor, you can use the transformation scenario called DITA Map PDF - based on HTML5 & CSS. This is probably the easiest way to go. If you want to have this capability on a server, there is an extra charge. See Oxygen PDF Chemistry for more info: https://www.oxygenxml.com/chemistry-html-to-pdf-converter.html
Use the XML Rocks DITA-OT plugin, which requires a separate PDF processor (most are commercial), one of these: Antenna House Formatter, PDFreactor, Vivliostyle or Prince. https://github.com/xmlrocks/dita-ot-pdf-css-page
Do it yourself. One way I have done this is to create normal XHTML output from the DITA OT, and then use a PDF processor and CSS to transform the XHTML to PDF. I have used Antenna House, but other commercial PDF processors (see above) can work also. You should make the XHTML all in one file (all DITA topics merged into one file) by adding this attribute to the <map> element: <map chunk="to-content">
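As a rough illustration of the do-it-yourself route, here is a minimal Python sketch that only builds the two command lines (DITA-OT to XHTML, then XHTML plus CSS to PDF). It assumes the DITA-OT `dita` command is on the PATH and uses Prince as the PDF processor purely as an example; the file names, the output folder and the guess that the merged file is named after the map with an `.html` extension are all assumptions, so check them against your own setup.

```python
from pathlib import Path


def dita_to_xhtml_cmd(ditamap, out_dir):
    """Command line for the DITA-OT XHTML transformation."""
    return ["dita", f"--input={ditamap}", "--format=xhtml", f"--output={out_dir}"]


def xhtml_to_pdf_cmd(xhtml, css, pdf):
    """Command line for Prince: one XHTML file plus a CSS stylesheet to PDF."""
    return ["prince", str(xhtml), "-s", str(css), "-o", str(pdf)]


def build_pipeline(ditamap, css, out_dir="out"):
    """Return both commands in the order they would have to run.

    Assumption: the map carries chunk="to-content", so DITA-OT writes one
    merged file named after the map (guide.ditamap -> guide.html).
    """
    stem = Path(ditamap).stem
    merged = (Path(out_dir) / f"{stem}.html").as_posix()
    pdf = (Path(out_dir) / f"{stem}.pdf").as_posix()
    return [dita_to_xhtml_cmd(ditamap, out_dir), xhtml_to_pdf_cmd(merged, css, pdf)]


# Hypothetical file names, for illustration only.
commands = build_pipeline("guide.ditamap", "print.css")
for cmd in commands:
    print(" ".join(cmd))
```

To actually run the steps, each list could be passed to `subprocess.run(cmd, check=True)`. Swapping in Antenna House or another processor would only change `xhtml_to_pdf_cmd`.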

Related

What is the difference between IText and Headless chrome

We have a use case for generating PDF from HTML for both RTL and LTR languages. Can anyone share the differences between Headless Chrome and iText so we can evaluate which is better for us?
It depends on your expectations of the PDF. If you just want an ordinary PDF, then you can choose any tool that converts HTML to PDF.
However, if you want an archivable PDF (PDF/A), an accessible PDF (PDF/UA) or a PDF 2.0 document, then iText 7 + the pdfHTML add-on + the pdfCalligraph add-on is the better choice. I don't know of any other HTML to PDF conversion software that is PDF 2.0-ready, nor do I think many HTML to PDF converters support PDF/A or PDF/UA. For instance: with an ordinary HTML to PDF converter you can convert Arabic content to a PDF, but when you try to extract the Arabic content back out of that PDF, you will get a result that is slightly different from the original. With iText 7, you create PDF documents whose text can be extracted correctly.
See How to convert HTML containing Arabic/Hebrew characters to PDF? for an RTL example. This FAQ entry is part of the HTML to PDF tutorial.
NOTE: I'm the original developer of iText; you should get the point of view of the people developing Headless Chrome too.

Displaying Rich Text (.rtf) in JavaFX

I have a .rtf file that I need to display within a JavaFX GUI.
My research indicates that the JavaFX TextFlow supports rich text through a tree of Node objects. However, I am at a loss on how to get my .rtf file represented as this tree of Nodes.
I feel like there should be an intuitive way to parse the .rtf file into the Node tree, but I just can't seem to find a way to do it!
Parsing RTF and Rendering in a TextFlow
You could parse the RTF and generate a TextFlow representation of it (similar to what is done for this markdown editor for markdown markup). I believe this would be a difficult task for you (the RTF 1.9.1 specification is 277 pages long). Describing how to do this would be too long and complicated for a StackOverflow answer (even if I could describe it, which I probably could not).
Converting RTF to a format JavaFX can more easily render
I suggest using a converter (either offline or using an online service) to convert your RTF to another format before trying to render it in JavaFX. If you know the documents in advance you can pre-convert before shipping your application, if you don't then you will have to provide a real-time conversion facility with your application. I won't recommend a particular service, but you can google and do some research on RTF conversion to see if there is one that fits. As a target format you could choose PDF or HTML, or an image (e.g. PNG).
JavaFX will natively display:
Images using an ImageView.
HTML using a WebView.
A 3rd party library can be used to display PDF documents or other formats using JavaFX.

Programmatically convert Doc(x) files to PDF using Microsoft Word

We are developing a Java application that needs to programmatically convert .rtf, .doc and .docx files to PDF files.
Formatting is important to us, so we need the page numbers to be the same between a source file and a target PDF file, and the contents of each page being the same as the original file.
We have tried out open source solutions, such as JODConverter to invoke a LibreOffice or OpenOffice installation, Docx4j and XDocReport. The best formatting was achieved with LibreOffice. However, even in that case, the page count differed (for example, an 87-page .rtf file results in an 80-page PDF file).
So, we think that the ideal way to make the conversion would be to somehow invoke Microsoft Word through our Java application, and make the conversion with it. That would produce PDF files that have the same formatting as the original files.
Is this possible in any of the following ways:
An API that is directly invokeable through Java?
An API that is invokeable through a .Net language and we would use that with something like JACOB?
A 3rd party library that uses a Microsoft Word installation under the hood (something like JODConverter for Word)?
A CLI interface supported by Word (relevant question)?
Something else?

Translation of XHTML page with MathML

We have a few XHTML pages with MathML in them. All are generated using Amaya. We have a requirement to translate them to different languages, but Amaya doesn't seem to support Unicode text encoding. Right now we plan to replace the text in XHTML manually.
I would be happy to know of other possible ways of implementing this translation process. Translation should maintain structure of the MathML.
Use XML to create a translation dictionary with Math entities. The translation can be done programmatically using XSLT and Amaya or E4X and AS3.
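To make the dictionary idea concrete, here is a minimal sketch. It swaps in Python's standard `xml.etree.ElementTree` for the XSLT or E4X approaches mentioned above, and the `<dictionary>`/`<entry>` format, the sample page and the German strings are all made up for illustration. Text is replaced only outside the MathML namespace, so the structure and content of the formulas stay untouched.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "{http://www.w3.org/1998/Math/MathML}"

# Hypothetical dictionary format: one <entry> per phrase.
DICTIONARY_XML = """
<dictionary>
  <entry source="Area of a circle" target="Fläche eines Kreises"/>
  <entry source="A" target="Ein"/>
</dictionary>
"""

# Made-up sample page with one paragraph and one formula.
PAGE_XML = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    "<p>Area of a circle</p>"
    '<math xmlns="http://www.w3.org/1998/Math/MathML">'
    "<mi>A</mi><mo>=</mo><mi>r</mi></math>"
    "</body></html>"
)


def load_dictionary(xml_text):
    """Read the source -> target phrase pairs from the dictionary XML."""
    root = ET.fromstring(xml_text)
    return {e.get("source"): e.get("target") for e in root.iter("entry")}


def lookup(text, dictionary):
    """Return the translation for an exact phrase match, else the text as-is."""
    if text is None:
        return None
    return dictionary.get(text.strip(), text)


def translate(elem, dictionary):
    """Translate text in place, skipping every MathML subtree."""
    if elem.tag.startswith(MATHML_NS):
        return  # leave formulas exactly as they are
    elem.text = lookup(elem.text, dictionary)
    for child in elem:
        translate(child, dictionary)
        # Tail text sits after the child, in the parent's (non-MathML) context.
        child.tail = lookup(child.tail, dictionary)


page = ET.fromstring(PAGE_XML)
translate(page, load_dictionary(DICTIONARY_XML))
```

Note that the identifier `A` inside the formula is not translated even though the dictionary has an entry for "A", because the function returns as soon as it meets a MathML element.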

Programmatically convert DITA to FrameMaker

Is there a toolkit available (paid or otherwise) to help with programmatically converting a DITA document to a FrameMaker one?
I'm attempting to make an application that converts to multiple formats from DITA. I know I can use the DITA Open Toolkit for most of my needs, but I need to be able to create a native FrameMaker document as well.
Programming language doesn't matter, although I prefer Java as my application will be web based.
Arbortext import-export is industrial strength and very flexible. You could also try MIFtoGo. Conversion is tricky because source documents are rarely consistent. Conversion without cleanup before and after is next to impossible.
DITA-FMx is what you need:
http://leximation.com/dita-fmx/
Using DITA-FMx, you generate a FrameMaker book from your ditamap (and then save the FM book as PDF).
There is a movie on YouTube that shows you how the process goes. Just search for "PDF Publishing with DITA-FMx 1.1" (Stack Overflow does not allow me to post a second URL here yet)
If you like to see an example, just send me a small sample of a ditamap and I'll generate a FrameMaker book for you.
The disclaimer is that if you're converting a lot of documents you'll be better off with a supported, well-integrated solution that fully uses FrameMaker's DITA support. If you're looking to do it on the cheap though (and who isn't), you can do this conversion using plain XSLT and FrameMaker templates.
First create the FrameMaker template to handle the appearance of the document, then use XSLT to map your DITA content to the content tags you've used in FrameMaker.
You can use the free Saxon XSLT processor to do the actual conversion.
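The tag-mapping step can be pictured with a small sketch. This is not XSLT: it uses Python's `xml.etree.ElementTree` only to show the idea of renaming DITA elements to the tags defined in a FrameMaker template. The tag names on the right-hand side of `TAG_MAP` are hypothetical, and a real stylesheet would also have to deal with attributes, nesting and DOCTYPE declarations.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from DITA element names to FrameMaker template tags.
TAG_MAP = {
    "topic": "Chapter",
    "title": "Heading1",
    "p": "Body",
    "li": "Bullet",
}


def map_tags(elem, tag_map):
    """Rename elements in place; anything not in the map keeps its name."""
    elem.tag = tag_map.get(elem.tag, elem.tag)
    for child in elem:
        map_tags(child, tag_map)
    return elem


# Made-up DITA fragment, for illustration only.
DITA = (
    '<topic id="t1"><title>Install</title>'
    "<body><p>Run the installer.</p></body></topic>"
)

root = map_tags(ET.fromstring(DITA), TAG_MAP)
result = ET.tostring(root, encoding="unicode")
print(result)
```

In XSLT the same mapping would typically be one template per DITA element, which is easier to extend once elements need different handling depending on their context.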
Here is some of Adobe's reference material on authoring new DITA documents:
http://help.adobe.com/en_US/FrameMaker/8.0/05h/dita.html
Info on FrameMaker's native XML support is here:
http://www.adobe.com/products/framemaker/pdfs/xml_fm7.pdf
The FrameMaker manuals also cover the subject quite extensively. Hope that helps.
FM does indeed support loading DITA files (I tried FM10 and newer), but to automate conversion to .fm format you either use the internal scripting mechanism, which still involves some manual work, or an external utility.
I have found an existing free utility that can do most basic operations like opening a file, 'saving as' another format and closing it.
The tool is called DZBatcher.
Example DZbatcher batch file:
Open "c:\My Dita Files\Doc1.dita"
SaveAs -d "c:\My FM Files\Doc1.fm"
Close "c:\My FM Files\Doc1.fm"
Open "c:\My Dita Files\Doc2.dita"
SaveAs -d "c:\My FM Files\Doc2.fm"
Close "c:\My FM Files\Doc2.fm"
Exit
Adobe has a framework called RoboHelp which is probably the infrastructure for this, but I didn't dig deeper as this utility did the job perfectly.
In my software flow, I created this batch file using a Python script that scans all the docs in an input directory and adds 3 lines per file, as seen above.
I used FM2015 for this task.
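A minimal sketch of such a generator script, assuming the batch syntax is exactly what the example above shows (the commands and the `-d` option are copied from it as-is, not checked against the DZBatcher documentation). The function names and folder paths are hypothetical.

```python
from pathlib import Path, PureWindowsPath


def find_dita_files(input_dir):
    """Return the .dita files directly inside input_dir, sorted by name."""
    return sorted(str(p) for p in Path(input_dir).glob("*.dita"))


def batch_lines(dita_files, fm_dir):
    """Build the batch file content: Open, SaveAs, Close per document, then Exit."""
    lines = []
    for src in dita_files:
        src = PureWindowsPath(src)
        # Same base name, .fm extension, written to the FrameMaker output folder.
        dst = PureWindowsPath(fm_dir) / (src.stem + ".fm")
        lines.append(f'Open "{src}"')
        lines.append(f'SaveAs -d "{dst}"')
        lines.append(f'Close "{dst}"')
    lines.append("Exit")
    return lines


# Hypothetical paths matching the example batch file above.
example = batch_lines(
    [r"c:\My Dita Files\Doc1.dita", r"c:\My Dita Files\Doc2.dita"],
    r"c:\My FM Files",
)
print("\n".join(example))
```

On the machine that holds the documents, `batch_lines(find_dita_files(input_dir), fm_dir)` would produce the lines for a whole folder, which can then be written to a file with `Path(batch_path).write_text("\n".join(lines))`.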
Bryan, after a decade's experience converting Frame, Word, Interleaf, etc. to XML, I'll tell you that Adobe doesn't have it fully covered. The DITA support in FrameMaker works best if you also have the Leximation plug-in or know how to program the Adobe proprietary EDD. You can't do DITA specialization without programming the EDD in FrameMaker.
FrameMaker has excellent support for DITA. You can open DITA topics, and save them (if you wish) as .fm files. You could also open DITA maps, and save them as FrameMaker book files, or as composite (monolithic) .fm files. There is no need to write a parser.
PS: I am talking about FM 12.