Convert RTF files to CHM files? Convert HLP files to CHM files? - command-line

We shipped .hlp files to customers when development was in VC++. The process to create them was as follows:
1. Create an .rtf file.
2. Create a new project in WinHelp and compile it to get the .hlp file.
Development has now moved to .NET, and I also found that .hlp files can no longer be opened on Windows 7 or Vista.
Are there any free command-line tools that can convert these .hlp files to .chm files?
Are there also any free command-line tools that can convert an .rtf file to .chm?

Microsoft has a tool which can convert Win Help projects to HTML Help. It is called HTML Help Workshop. You can open the existing .hpj project file with it and choose the option to convert it to HTML Help project .hhp. You can then compile the .hhp project with the same tool to generate the .chm file.
There are, however, many shortcomings in the tool. It generates an HTML page for each page in the rtf file, but the names of these HTML pages are random, which makes them difficult to reference later.
If you just have the .hlp file and not the original Win Help project files, you can use a decompiler to generate the .hpj and .rtf files first and then convert them using HTML Help Workshop.
I found the following link quite helpful:
http://www.help-info.de/en/Help_Info_WinHelp/hw_converting.htm
EDIT: there are also some 3rd party converters and Help Authoring Tools (HATs) available which may do the job better than HTML Help Workshop, but most of them are not free.
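
Since the question asks for a command-line route: HTML Help Workshop ships a command-line compiler, hhc.exe, that takes the .hhp project file as its argument. The following is a minimal Python sketch of scripting that compile step. The install path and the function names are assumptions for illustration, so adjust them for your machine.

```python
import subprocess

# Assumed default install location of HTML Help Workshop; adjust as needed.
DEFAULT_HHC = r"C:\Program Files (x86)\HTML Help Workshop\hhc.exe"


def build_hhc_command(hhp_path, hhc_exe=DEFAULT_HHC):
    """Return the argument list that compiles an .hhp project into a .chm."""
    return [str(hhc_exe), str(hhp_path)]


def compile_chm(hhp_path, hhc_exe=DEFAULT_HHC):
    """Run the compiler and return (exit_code, console_output)."""
    result = subprocess.run(
        build_hhc_command(hhp_path, hhc_exe),
        capture_output=True,
        text=True,
    )
    # hhc.exe is widely reported to use unusual exit codes (1 on success),
    # so read the console output or check for the .chm file instead of
    # relying on the exit code alone.
    return result.returncode, result.stdout
```

Calling compile_chm(r"C:\help\myproject.hhp") (a hypothetical path) would run the same compile that the Workshop GUI performs.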

Keep in mind that CHM is compiled HTML and has little to do with RTF, so your main problem is converting the RTF to HTML.
I would try to convert the RTF to HTML with one topic per file.
One thing you could try is to open the RTF in Word and save it as HTML, and then use a program or script to split the topics into individual files and fix up the references.
Then compile the result with a CHM compiler (such as MS HTML Help Workshop).
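
The splitting step could look roughly like the Python sketch below. It assumes that every topic starts with an <h1> heading in the exported HTML, which may not hold for Word's output (check which tag or style it actually uses), and the function names are made up for illustration. Fixing up cross-references is not covered.

```python
import re


def split_topics(html):
    """Split HTML body text into (title, fragment) pairs, one per <h1>.

    Anything before the first <h1> is dropped.
    """
    # Offsets where each <h1 ...> tag starts (case-insensitive).
    starts = [m.start() for m in re.finditer(r"<h1\b", html, flags=re.IGNORECASE)]
    topics = []
    for i, start in enumerate(starts):
        end = starts[i + 1] if i + 1 < len(starts) else len(html)
        fragment = html[start:end]
        match = re.search(
            r"<h1\b[^>]*>(.*?)</h1>", fragment, flags=re.IGNORECASE | re.DOTALL
        )
        if match:
            # Remove nested tags such as <b> from the heading text.
            title = re.sub(r"<[^>]+>", "", match.group(1)).strip()
        else:
            title = f"topic{i + 1}"
        topics.append((title, fragment))
    return topics


def topic_filename(index, title):
    """Build a stable, predictable file name for a topic."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{index:03d}-{slug or 'topic'}.htm"
```

Predictable file names derived from the headings also avoid the random page names that make later referencing difficult.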

Related

Get documentation from GitHub project as a single pdf

I'm looking for a single pdf of the ErpNext and Frappe user manuals.
Documentation seems to be provided in html and the source is in markdown. I did find tools to convert markdown to html/pdf, but no reliable solution to generate a SINGLE pdf file keeping the structure as shown here:
Put more abstractly: How to transform GitHub markdown documentation (organized in subdirectories) into a single pdf file?
Could anyone help me out?
Any way of achieving this is welcome, thanks in advance!
You can convert markdown to PDF with Pandoc or similar tools.
You can search the internet for how to concatenate files on your OS.
There are several (online) tools to merge multiple PDFs into one.
To create a single file you can either
concatenate the markdown files into one big file, then convert to PDF, or
convert all markdown files to PDF, then merge all PDF files into one big PDF.
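
The first option (concatenate, then convert) can be sketched in Python as follows. The function name is made up for illustration, and sorting by path is only an assumption about the order you want; a manual with an explicit table of contents would need its own ordering.

```python
from pathlib import Path


def concatenate_markdown(root, output):
    """Join every .md file under root into one file, in sorted path order.

    Returns the list of source files in the order they were written.
    """
    root = Path(root)
    output = Path(output)
    # Skip the output file itself in case it lives inside root.
    files = sorted(
        p for p in root.rglob("*.md") if p.resolve() != output.resolve()
    )
    parts = [p.read_text(encoding="utf-8").strip() for p in files]
    # A blank line between files keeps headings from running together.
    output.write_text("\n\n".join(parts) + "\n", encoding="utf-8")
    return files
```

The combined file can then be converted with Pandoc, for example pandoc combined.md -o manual.pdf (Pandoc needs a PDF engine such as LaTeX installed for this). Relative image links from files in subdirectories may need adjusting after the merge.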

Programmatically convert Doc(x) files to PDF using Microsoft Word

We are developing a Java application that needs to programmatically convert .rtf, .doc and .docx files to PDF files.
Formatting is important to us, so we need the page numbers to be the same between a source file and a target PDF file, and the contents of each page being the same as the original file.
We have tried open source solutions, such as JODConverter (to invoke a LibreOffice or OpenOffice installation), Docx4j and XDocReport. The best formatting was achieved with LibreOffice. However, even in that case the pagination differed (for example, an 87-page .rtf file resulted in an 80-page PDF file).
So we think the ideal way to make the conversion would be to somehow invoke Microsoft Word through our Java application and convert with it. That would produce PDF files that have the same formatting as the original files.
Is this possible in any of the following ways:
An API that can be invoked directly from Java?
An API that can be invoked from a .NET language, which we would then use with something like JACOB?
A 3rd party library that uses a Microsoft Word installation under the hood (something like JODConverter for Word)?
A command-line interface supported by Word (relevant question)?
Something else?

MS Word files manipulation in ColdFusion 10/11

I am working on a small project where I have to read more than 100 MS Word files and loop through each file and update their headers and footers. I want to accomplish this task in ColdFusion 10/11.
Is there any way I can get this done in ColdFusion?
RIAForge has a tool, http://docxextractor.riaforge.org/, which pulls data from docx files. It does not create docx files, however.
Disclaimer: I wrote this.
The <cfdirectory> tag can be used to work with files in a directory, for example getting a list to process using <cfloop>.
ColdFusion does have some support for MS Office files. What you're trying to do can be done for an Excel spreadsheet, reading the file with <cfspreadsheet> and then using functions such as SpreadsheetSetFooter(), before writing the file.
However, there are no comparable functions for Word files!
Adobe ColdFusion documentation

PDF-Express Error: Font symbol is not embedded

I am not sure this is the right place to ask such a question, sorry.
I have LibreOffice and a paper written in an IEEE format.
When I export it to PDF and pass it through PDF-Express, it fails with the error
Font Symbol is not embedded 10x
I do not know where the problem is. There is only one font, Times New Roman, in different sizes of course.
I tried "Export as PDF..." and checked "Embed Fonts", but no luck so far.
A month ago I tried the same paper with OpenOffice and do not remember getting this error. Now I have had to change the paper a bit, and when I export it with LibreOffice I get this error. Is this a LibreOffice problem?
Look at this answer, really simple!
How to repair a PDF file and embed missing fonts
Also, my comment as follows :)
On win32, if you have installed Ghostscript, the command may look like:
gswin32c -sFONTPATH=C:\Windows\Fonts -o output-pdf-with-embedded-fonts.pdf -sDEVICE=pdfwrite -dPDFSETTINGS=/prepress input-pdf-where-some-fonts-are-not-embedded.pdf
(find the exe file on your system, maybe add it to PATH -- the environment variable, if necessary)
Open the PDF file with Adobe Acrobat, then choose File > Print. Use Adobe PDF as the printer and save the result as a PDF file. All fonts will be embedded.
I faced the same problem, and I think the simplest solution is to let PDF Express create the PDF from your source files. If you are using LaTeX, just zip or rar your source files (DVI file, EPS figures, etc.) and build the PDF with PDF Express. This will solve your problem. I found an article, IEEE PDF Express Error Message – Font is not Embedded Solution, which can help you in this regard.
Generate a PostScript file from the PDF using pdftops, which is part of Xpdf.
Use Ghostscript to embed fonts:
gsWin64 -dSAFER -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sPAPERSIZE=a4 -dPDFSETTINGS=/printer -dCompatibilityLevel=1.4 -dMaxSubsetPct=100 -dSubsetFonts=true -dEmbedAllFonts=true -sOutputFile=d:\Output_filename.pdf Input_filename.ps
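
If the paper is re-exported often, the Ghostscript call can be wrapped in a small script. This is only a sketch: it reuses the switches quoted in the answers above, the function names are made up, and the default executable name gswin64c (the 64-bit Windows console build) is an assumption, so use gswin32c or gs if that is what is installed.

```python
import subprocess


def build_embed_fonts_command(src, dst, gs_exe="gswin64c"):
    """Return the Ghostscript argument list that rewrites src with embedded fonts."""
    return [
        gs_exe,
        "-dSAFER",
        "-dNOPAUSE",
        "-dBATCH",
        "-sDEVICE=pdfwrite",
        "-dPDFSETTINGS=/prepress",
        "-dEmbedAllFonts=true",
        "-dSubsetFonts=true",
        f"-sOutputFile={dst}",
        src,  # the input file comes last, after all switches
    ]


def embed_fonts(src, dst, gs_exe="gswin64c"):
    """Run Ghostscript; raises CalledProcessError if it exits with an error."""
    subprocess.run(build_embed_fonts_command(src, dst, gs_exe), check=True)
```

Passing the arguments as a list avoids quoting problems with paths that contain spaces.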

What's the easiest way to generate DOC files?

Right now I'm generating HTML with a Perl script and then manually converting it to DOC in OpenOffice. Actually, I have to copy the content, create a new "Text document", paste and save, because OpenOffice treats HTML and DOC as separate file types, but that's beside the point. It's very inconvenient.
Is there an automated way to convert HTML to a decent DOC, or some other nice text-based format like HTML that I can generate and then convert to DOC automatically?
(I'm on OSX)
I can't help you get to .doc, but have you seen the Open XML Format SDK from Microsoft? This will allow you to generate Office 2007 format documents (.docx, .xlsx etc) from .NET code.
Theoretically you may have some luck with this under Mono on OS X, as it doesn't require an installation of Office 2007 (for Windows) to function.
Not sure if this is what you want, but you can fairly easily generate WordML documents with code. WordML is the Word 2003 XML file format. It's NOT the same thing as the Office 2007 Open XML formats. WordML is just one file that's not too hard to create if you're just doing fairly basic formatting. You could generate it directly rather than creating the HTML first. You can name the files with a .DOC extension, and Word 2003 and later will open them just fine. You can resave them as real .DOC files if you want.
Here's the on-line WordML reference. I can send you some sample code if you'd like.
http://msdn.microsoft.com/en-us/library/aa212812(office.11).aspx
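
To give an idea of how little is needed, here is a minimal sketch of a WordML generator. It is written in Python rather than Perl, and make_wordml is a made-up name. It only emits plain paragraphs, so any real formatting would need the run and paragraph properties described in the reference above.

```python
from xml.sax.saxutils import escape

# Namespace of the Word 2003 XML (WordML) format.
WORDML_NS = "http://schemas.microsoft.com/office/word/2003/wordml"


def make_wordml(paragraphs):
    """Return a WordML document with one plain paragraph per input string."""
    body = "".join(
        # Each paragraph is w:p > w:r (run) > w:t (text).
        f"<w:p><w:r><w:t>{escape(text)}</w:t></w:r></w:p>"
        for text in paragraphs
    )
    return (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n'
        # This processing instruction tells Windows to open the file in Word.
        '<?mso-application progid="Word.Document"?>\n'
        f'<w:wordDocument xmlns:w="{WORDML_NS}">'
        f"<w:body>{body}</w:body>"
        "</w:wordDocument>\n"
    )
```

Writing the returned string to a file with UTF-8 encoding (for example report.doc, a hypothetical name) should give a document that Word 2003 and later can open.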
If you really want to create a general file format that can be converted into other formats, creating an XSL-FO file might be the way to go. There are a number of products that can take XSL-FO and transform it into other formats, such as Word and PDF.
We use the Aspose components, which are available for .NET and Java. With Java you should be able to use them on OS X, too.
You have to purchase the components (i.e. they are not free), but aside from this, they are really great.