Microsoft Word to Org-mode - perl

I am trying to move a Microsoft Word document into Emacs using org-mode. I have copied the Word document and pasted it into Emacs. I would like to get headings like 7.1.2.4 into org-mode format and then link the TOC to the appropriate headings. How can I do that? Any suggestions? Has this been done in a programming language such as Perl?
Thanks.

There is ODT2ORG (https://bitbucket.org/josemaria.alkala/odt2org/wiki/Home), which lets you import odt files into org-mode.
Use OpenOffice/LibreOffice to produce an .odt from your .doc.
Use odt2org to get an .org.
About the headings: I am not entirely sure I understand you.
org-toc.el, included in org-mode, provides a separate buffer with a TOC of your current document (like in RefTeX). All the entries there are already links to the individual headings. Also, an exported document will have a TOC included by default without your intervention.
Org-mode does not support automatically numbered headings (yet). However, if you export your document to HTML, DocBook, LaTeX, or PDF, your headings will appear numbered and nested (you can tweak the settings quite a lot).
I doubt that you will get your intended result purely automatically, but it should work about 70% automatically, especially if you have LaTeX installed and simply want a good-looking PDF in the end. Convert doc to odt, convert odt to org, open the result and type "C-c C-e d".
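If you stay with the pasted text instead, the numbered headings can be rewritten mechanically. Here is a minimal Python sketch, not a tested converter. It assumes every heading sits on its own line and starts with a dotted number such as 7.1.2.4 followed by the title. The function name and the regular expression are my own. The number is dropped because Org numbers headings itself on export. Body lines that happen to start with a number (e.g. "3.5 million ...") would be misread as headings, so check the result by hand.

```python
import re

# Assumption: a heading is a line that starts with a dotted section number
# such as "7", "7.1" or "7.1.2.4", followed by whitespace and a title.
HEADING_RE = re.compile(r"^\s*(\d+(?:\.\d+)*)\.?\s+(\S.*)$")


def to_org_headings(text: str) -> str:
    """Turn '7.1.2.4 Title' lines into '**** Title' Org headings."""
    out = []
    for line in text.splitlines():
        match = HEADING_RE.match(line)
        if match:
            number, title = match.groups()
            depth = number.count(".") + 1  # "7.1.2.4" has 4 levels -> 4 stars
            out.append("*" * depth + " " + title.rstrip())
        else:
            out.append(line)  # anything else is passed through unchanged
    return "\n".join(out)
```

Paste the Word text into a file, run it through to_org_headings, and open the result in Emacs.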

Another option: Save as an HTML file, then use Pandoc to convert the HTML to an .org file.

I've converted loads of Word documents into Org files. It takes minutes to do it by hand.
If you want cross-references, use internal links (section 4.2 in the current manual).
The * and ** style headings are always likely to be there in Org. Think of the use case where exports are compiled from #+INCLUDEd files, or you have done a selective export using tags. Any kind of single sourcing technology isn't going to display the numbering.
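For a hand-made TOC, the internal links can be generated from the headlines. This is a rough sketch under some assumptions: it uses the [[*Headline][description]] link form, treats every line that starts with stars and a space as a headline, and does not strip TODO keywords or tags. Titles containing square brackets would need escaping. The function name is made up for illustration.

```python
import re

# One or more stars, whitespace, then the headline text.
HEADLINE_RE = re.compile(r"^(\*+)\s+(.*\S)\s*$")


def org_toc(org_text: str) -> str:
    """Build a nested plain list of internal links, one per Org headline."""
    entries = []
    for line in org_text.splitlines():
        match = HEADLINE_RE.match(line)
        if match:
            stars, title = match.groups()
            indent = "  " * (len(stars) - 1)  # two spaces per nesting level
            # [[*Title][Title]] is an Org internal link to the headline "Title".
            entries.append(f"{indent}- [[*{title}][{title}]]")
    return "\n".join(entries)
```

Paste the returned list at the top of the Org file; following a link with C-c C-o should jump to the matching headline.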

There is a Ruby gem which converts doc to md. With pandoc you can then convert the Markdown to Org.
https://github.com/benbalter/word-to-markdown

Related

Is there a way to use org-mode TODO settings in another major mode? e.g. Markdown?

Is there a way to use org-mode TODO settings in another major mode? For example, is there a way to get a Markdown file I'm working on to have TODOs listed in it?
I'm not aware of org-mode logic being applied to other markup formats like markdown. It definitely is an intriguing possibility.
If you find markdown necessary, you could convert markdown to org-mode syntax and back. Pandoc can convert from markdown to org-mode.
pandoc -f markdown -t org project-notes.md
You can then export from org-mode to markdown using C-c C-e m. You would probably need to tweak the process heavily to avoid thrashing of artifacts between converters since the conversion back and forth is far from perfect.
Personally, I would choose to use an org-mode file as an authoritative copy and export from that as needed since it is more powerful.
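If you do want to script the round trip, something like the following might work. It is only a sketch: the helper names are mine, it assumes pandoc is on your PATH, and it uses pandoc's org reader for the way back instead of the Org exporter, so the output will differ from what C-c C-e m produces.

```python
import shutil
import subprocess
from typing import List


def pandoc_command(src: str, dst: str, from_fmt: str, to_fmt: str) -> List[str]:
    """Build the pandoc argument list for one conversion step."""
    return ["pandoc", "-f", from_fmt, "-t", to_fmt, "-o", dst, src]


def convert(src: str, dst: str, from_fmt: str, to_fmt: str) -> None:
    """Run one conversion; raises CalledProcessError if pandoc fails."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc not found on PATH")
    subprocess.run(pandoc_command(src, dst, from_fmt, to_fmt), check=True)


# Example round trip (file names follow the command above):
# convert("project-notes.md", "project-notes.org", "markdown", "org")
# convert("project-notes.org", "project-notes.md", "org", "markdown")
```

Diffing the Markdown before and after one round trip is a quick way to see how much the converters change.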

Include *prewritten* documentation in Doxygen

To distinguish this question from "Doxygen: Adding a custom link under the 'Related Pages' section", whose accepted answer does not really answer the question, I have specifically added "prewritten" to the title.
What I want:
Write one document tex file (without preamble, since this file will be \input-ed into a full document)
Import the document into Doxygen's HTML output.
Using Doxygen to produce the tex file will probably not work, since it does too much layout work [as of 2015, this holds for its HTML output too, e.g. empty table rows]. If Doxygen takes some other input that can easily be transformed into LaTeX, that will do.
You can easily add an already existing LaTeX file to your Doxygen documentation using \latexonly\input{yourfile}\endlatexonly.
I would assume you put it under, e.g., a Doxygen \page.

export from Emacs in rtf using current font-lock?

I'd like to export my LaTeX files (and maybe other files) as rtf files, so that the syntax highlighting is kept (that is, keep it 'plain text' with colors).
I tried using Org-mode to convert to HTML (thinking I would later copy it into LibreOffice and hope for good results), but I couldn't make Org export it with the correct colors.
Is there a way to export a buffer to rtf using the current font-lock?
I think you're looking for M-x htmlfontify-buffer.
You can also take a look at http://www.emacswiki.org/emacs/Htmlize

How to use SyncTeX with Org-mode?

Background
With SyncTeX you can get forward and backward search between a source document and the typeset material. More specifically:
Forward search is to jump from a particular place in the source document, e.g. a LaTeX file, to the corresponding place in the typeset material, e.g. a PDF file.
Backward search is to jump from a particular place in the typeset material, e.g. a PDF file, to the corresponding place in the source document, e.g. a LaTeX file.
With Org-mode you can export as LaTeX and process it to PDF.
Question
It would be useful to be able to do forward and backward search between an Org-mode file and the PDF it produces on LaTeX export. Is this possible?
As mentioned, SyncTeX already implements forward and backward search between a LaTeX file and its resulting file. So the missing link seems to be the jump between the Org-mode file and the LaTeX file it is exported as.
I found a similar question on the mailing list: [Orgmode] synctex!! ...syncorg? It got no answer involving a solution.
There is a recent (April 2013) thread on the org-mode mailing list which has some preliminary patches. However, reading the emails, it seems like it's a tricky problem.
There is a more recent (depending on your frame of reference) post from October 2013 which has a solution. However, I have not been successful with that code, and re-raised the issue in this thread.

How to generate Microsoft Word documents using Sphinx

Sphinx supports a few output formats:
Multiple HTML files (with html or dirhtml)
LaTeX, which is useful for creating .pdf or .ps
text
How can I obtain output in a Microsoft Word file instead?
With another doc generator I managed to generate a single html output file and then convert it to Microsoft Word format using the Word application.
Unfortunately I don't know a way to generate either Word or the HTML single-page format.
The solution I use is the singlehtml builder, as andho mentioned in a comment, followed by converting the html to docx using pandoc.
The following sample assumes the generated html would be located at _build/singlehtml/index.html
make singlehtml
cd _build/singlehtml/
pandoc -o index.docx index.html
There is a Sphinx extension for generating docx format (which I haven't tested) and a newer one (which I also haven't tested, but looks like it is more actively maintained)
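The two steps can also be driven from a small script. This is a sketch, not a drop-in replacement for the Makefile: it calls sphinx-build with the singlehtml builder directly, which is roughly what make singlehtml does, and it assumes sphinx-build and pandoc are on the PATH and that the master document is called index. The function names and default directories are my own.

```python
import subprocess
from pathlib import Path
from typing import List, Tuple


def docx_build_steps(source_dir: str, build_dir: str) -> List[Tuple[List[str], str]]:
    """Return (command, working directory) pairs for the two steps."""
    html_dir = str(Path(build_dir) / "singlehtml")
    return [
        # Roughly what `make singlehtml` runs, without relying on the Makefile.
        (["sphinx-build", "-b", "singlehtml", source_dir, html_dir], "."),
        # Run pandoc inside the output directory so relative image paths resolve.
        (["pandoc", "-o", "index.docx", "index.html"], html_dir),
    ]


def build_docx(source_dir: str = ".", build_dir: str = "_build") -> Path:
    """Run both steps in order and return the path of the generated .docx."""
    for command, cwd in docx_build_steps(source_dir, build_dir):
        subprocess.run(command, cwd=cwd, check=True)
    return Path(build_dir) / "singlehtml" / "index.docx"
```

Calling build_docx() from the directory that holds conf.py should leave the result in _build/singlehtml/index.docx.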
To convert reStructuredText files to MS Word .doc format, I use rst2odt and then unoconv. See the following script:
#!/bin/sh
# $1 is the input .rst file; quoting keeps paths with spaces intact
rst2odt "$1" "$1.odt"
unoconv -f doc "$1.odt"
rm "$1.odt"
With rst2odt you can use your own stylesheet. unoconv comes with OpenOffice and also lets you apply an OpenOffice style (template) during the conversion. Simply edit a converted document, change styles, add headers and footers, save that as an ODF Text Document Template (OTT), and use this as part of the conversion, like:
unoconv -f doc -t template.ott "$1.odt"
to use that template for various conversions later on.
I realize this is an old question, but I found that LibreOffice supports the following way of doing conversion (assuming soffice.exe is in your path):
soffice.exe --invisible --convert-to doc myInputFile.odt
Some things I have read say to use the --headless option rather than --invisible. Both seem to work on Windows.
You can start with the rst2odt.py script and then do the above to convert to an MS Word document.
Here is a link with additional start up options for LibreOffice:
http://help.libreoffice.org/Common/Starting_the_Software_With_Parameters
Here is a link with file types supported by OpenOffice which, I believe, LibreOffice should also support:
http://wiki.services.openoffice.org/wiki/Framework/Article/Filter/FilterList_OOo_3_0
This answer is not a command-line solution and it is not necessarily the best, but it simply works for me and saves me time. After generating the html file, you can open it in a browser, copy the entire page (Ctrl+A and Ctrl+C), then run Microsoft Office (or use the live version if you don't have Microsoft Windows, like me) and paste (Ctrl+V) into it.
The best option might be rst -> odt -> doc.
Convert the Sphinx documents into OpenOffice format.
Then open the odt with OpenOffice and save it as a Word document. But I don't know how to do this automatically.
This is a workaround using Calibre (https://calibre-ebook.com), which includes a powerful converter. This worked well and most of the formatting is preserved:
Generate epub output in Sphinx with make epub.
Import the epub output into Calibre and then convert epub to docx using the inbuilt ebook converter.
Answer is too late for the original question, but people looking at the same problem may find this useful.
I don't know what Sphinx is, but you could create an rtf file or html file or something similar.
See the following blogpost for more information/approaches : OFFICE AUTOMATION
and from there : How to use ASP to generate a Rich Text Format (RTF) document to stream to Microsoft Word
This article describes how you can generate Rich Text Format (RTF) files with ASP script and then stream those files to Microsoft Word. This technique provides an alternative to server-side Automation of Microsoft Word for run-time document generation.
You don't use ASP script (who does :-) ), but the idea carries over.