I use VS Code for lots of things, from actual programming to just taking notes or writing summaries. Two features I would really love are:
the ability to insert images in between the text
live, cell-based auto-formatting for text that forms a small table
Both of these can be done in rich text editors like MS Word by simply inserting an image or an Excel table.
Now, I realize that VS Code is rather file-format agnostic, and it wouldn't make a whole lot of sense to have an Excel table in the middle of a .js file, but I think there is a way.
For example, JSDoc is a kind of add-on format that lives entirely within JS comments. The same could be done with tables and/or images. Of course there wouldn't be a universal way to encode this, but just like JSDoc it could be adapted to different language environments, be it a PHP file, a C file, or plain text.
For example, in raw text format, an auto-formatted table of data could look something like this within a JavaScript file:
const my_data = [
/*!!! auto_format_csv(Title, Description, Weight) !!!*/
"Apple", "A nice fruit", 1,
"Car", "A motorized vehicle", 2000
/*!!! auto_format_end !!!*/
];
And while editing the file in VS Code, it could look something like this:
[image: mockup of the table as displayed in the editor]
So, to the question part: are there any extensions that already do this sort of thing? If not, is it possible to create such an extension with the freedoms that extensions are given right now?
I know that VS Code is based on Electron and is open source, so in theory everything is possible, but I want to have these features as easily as possible, so having framework support for this would likely help a lot.
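To make the idea concrete, here is a minimal sketch in Python of the formatting step alone: it finds the marker comments from the example above and pads the cells in between so that the columns line up. It is not a VS Code extension and does not use any editor API; the function names align_tables and _align are made up for illustration, and the cell parser assumes quoted strings without escaped quotes.

```python
import re

# Marker comments as used in the example above.
START = re.compile(r"/\*!!!\s*auto_format_csv\((.*?)\)\s*!!!\*/")
END = re.compile(r"/\*!!!\s*auto_format_end\s*!!!\*/")
# A cell is either a double-quoted string or a run of non-comma characters.
CELL = re.compile(r'"[^"]*"|[^,\s][^,]*')


def _align(lines):
    """Pad every cell so that each column starts at the same offset."""
    rows = [[cell.strip() for cell in CELL.findall(line)] for line in lines]
    columns = max((len(row) for row in rows), default=0)
    widths = [
        max((len(row[i]) for row in rows if i < len(row)), default=0)
        for i in range(columns)
    ]
    aligned = []
    for line, row in zip(lines, rows):
        if not row:  # blank line inside the block: keep it as it is
            aligned.append(line)
            continue
        indent = line[: len(line) - len(line.lstrip())]
        trailing = "," if line.rstrip().endswith(",") else ""
        # Every cell but the last gets its comma, then padding up to the column width.
        head = "".join(
            (cell + ",").ljust(widths[i] + 2) for i, cell in enumerate(row[:-1])
        )
        aligned.append(indent + head + row[-1] + trailing)
    return aligned


def align_tables(source: str) -> str:
    """Return the source with every marked block column-aligned."""
    out, block, inside = [], [], False
    for line in source.splitlines():
        if START.search(line):
            inside = True
            out.append(line)
        elif inside and END.search(line):
            out.extend(_align(block))
            out.append(line)
            block, inside = [], False
        elif inside:
            block.append(line)
        else:
            out.append(line)
    out.extend(block)  # unterminated block: leave its lines untouched
    return "\n".join(out)
```

An extension could run something like this on every edit inside a marked block and write the result back, or only use it to decide how to render the block.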
Related
I am trying to convert an MS Word file to a CHM file. I have a well-organized Word document, but I could not figure out how to turn the Word document, saved as an HTML file, into a CHM file. I know I can add the HTML file to a project I have created, but there are some issues I could not solve, such as how to convert the Word table of contents into an index file in the HTML Help Workshop program. I would be very happy if someone could provide an example of converting Word documents. (I am trying to achieve this through the HTML Help Workshop program.)
Best regards,
Converting a Word document to CHM format is difficult without special (often expensive) tools, and it has a learning curve.
You should consider whether the PDF format would be sufficient. That said, the CHM format, which is integrated into the Windows operating system, does of course have some popular features.
I recommend reading through "Search and Index not working after converting from Word 2016 to CHM".
As I mentioned in my answer there, I had never used chmProcessor before (because I use other tools), but it surprisingly seems to be a good one for converting Word documents in a simple way.
Please try chmProcessor for your needs. You may want to ask a new question here on SO later.
Edit:
Maybe you have additional interest in the following CodeProject article:
How to Easily Write a User's Guide for Your Application using Different File Extensions
I am playing with Tal's intro to producing Word tables with as little overhead as possible in real-world situations. (Please see there for reproducible examples. Thanks, Tal!) In real applications, tables are often too wide to print on a portrait-oriented page, but you might not want to split them.
Sorry if I have overlooked this in the pandoc or pander documentation, but how do I control page orientation (portrait/landscape) when writing from R to a Word .docx file?
I should maybe add that I started out using knitr + markdown, and I am not yet familiar with LaTeX syntax. But I'm trying to pick up as much as possible while getting my stuff done.
I am pretty sure the docx writer has no section breaks implemented. Also, as far as I understand, --reference-docx allows for customizing styles but not the page layout (though I might be wrong here). This is from pandoc's guide on --reference-docx:
--reference-docx=FILE
Use the specified file as a style reference in producing a docx file.
For best results, the reference docx should be a modified version of a
docx file produced using pandoc. The contents of the reference docx
are ignored, but its stylesheets are used in the new docx. If no
reference docx is specified on the command line, pandoc will look for
a file reference.docx in the user data directory (see --data-dir). If
this is not found either, sensible defaults will be used. The
following styles are used by pandoc: [paragraph] Normal, Title,
Authors, Date, Heading 1, Heading 2, Heading 3, Heading 4, Heading 5,
Block Quote, Definition Term, Definition, Body Text, Table Caption,
Image Caption; [character] Default Paragraph Font, Body Text Char,
Verbatim Char, Footnote Ref, Link.
These styles are saved in the /word/styles.xml component of the docx document.
The page layout, on the other hand, is saved in the /word/document.xml component in the <w:sectPr> tag, but pandoc's docx writer ignores this part as far as I can tell.
By default, the docx writer builds a continuous document with elements such as headers, paragraphs, simple tables and so on, much like an HTML output.
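To check where the page layout lives in a given file, a small sketch like the following can help. It is written in Python purely for illustration (it is not part of pandoc or pander) and relies only on the facts that a .docx is a zip archive and that each <w:sectPr> may carry a <w:pgSz> element with a w:orient attribute; the function name section_orientations is made up.

```python
import xml.etree.ElementTree as ET
import zipfile

# WordprocessingML main namespace (the "w:" prefix).
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def section_orientations(docx):
    """Return the orientation of every section in a .docx.

    `docx` may be a path or a file-like object. Sections that do not
    state an orientation are reported as "portrait".
    """
    with zipfile.ZipFile(docx) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    orientations = []
    for section in root.iter(f"{{{W}}}sectPr"):
        page_size = section.find(f"{{{W}}}pgSz")
        orient = None if page_size is None else page_size.get(f"{{{W}}}orient")
        orientations.append(orient or "portrait")
    return orientations
```

Running it on a file written by pandoc and again on the same file after a landscape section has been added by hand in Word shows what the writer leaves out.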
Option #1 (doesn't solve the page orientation problem):
The only page layout option that you can define through styles is pageBreakBefore, which adds a page break before every paragraph with a certain style.
Option #2 (seems elegant but hasn't been tested):
Recently, a custom writer feature has been added that runs a custom Lua script, in which you should be able to define how certain pandoc blocks are written into the output file. This means you could potentially define section breaks and page layout for a specific block by inserting the sectPr tag into the document. I haven't tried this out, but it would be worth investigating. On pandoc's GitHub you can check out a sample Lua script for custom HTML output.
However, this means you have to have Lua installed and learn the language, and it is up to you whether you think it's worth the time investment.
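For orientation, this is roughly the markup such a script would have to emit. The sketch is in Python only to show the XML; it is untested against pandoc, the function name section_break is made up, and the default page size of 12240 x 15840 twips assumes US Letter. In WordprocessingML, a paragraph whose properties contain a sectPr ends a section, and that sectPr describes the section that just ended.

```python
def section_break(landscape: bool, width_twips: int = 12240, height_twips: int = 15840) -> str:
    """Return a WordprocessingML paragraph that closes a section.

    The page size is given for portrait orientation in twips (1/1440 inch);
    for landscape the two values are swapped and w:orient is set.
    """
    if landscape:
        width_twips, height_twips = height_twips, width_twips
    orient = ' w:orient="landscape"' if landscape else ""
    return (
        "<w:p><w:pPr><w:sectPr>"
        f'<w:pgSz w:w="{width_twips}" w:h="{height_twips}"{orient}/>'
        "</w:sectPr></w:pPr></w:p>"
    )
```

The string only makes sense inside word/document.xml, where the w: prefix is already declared. To put a table on a landscape page, a portrait break would go before the table and a landscape break after it.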
Option #3 (a couple of clicks in Word might just do):
As you will probably spend quite some time setting up how to insert sections, choosing the right size and margins, and figuring out how to fit the table to such a layout, I recommend that you use pandoc to write your document.docx, then open it in Word and do the layout by hand:
select the table you want on the landscape page
go to Layout > Margins
> select Apply to: Selected text
> choose Page Setup > select Landscape
Now a new section with a landscape orientation should surround your table.
What you would probably also want to do anyway is style the table and table caption a little (font size, ...) to achieve the best result (all text styling can already be applied with pandoc, which is where --reference-docx comes in handy).
Option #4 (for situations where you can just use PDF instead of docx):
As far as I could figure out, pandoc does a good job with tables in md -> docx (alignment, style, ...), while in tex -> docx it sometimes had trouble. However, if PDF output is an option for you, LaTeX will be your greatest friend. For example, your problem is solved as easily as just using
\usepackage{pdflscape}
and adding this around your table
\begin{landscape}
...
\end{landscape}
These are the options that I could think of so far.
I would always recommend using the PDF format for reports, as you can style it to your liking with LaTeX and the layout will stay the way you want it to be.
However, I also know that for various reasons Word documents are still the main way of reviewing manuscripts in many fields, so I would most likely just go with my suggested option 3, mostly because it is a lazy and quick solution and because I usually don't have many documents with tons of giant tables with awkward placement and styling.
Good luck ;-)
Based on Taleb's answer here and some officer package functions, I created a little gist that one can use like this:
---
title: "Example"
author: "Dan Chaltiel"
output:
  word_document:
    pandoc_args:
      '--lua-filter=page-break.lua'
---
I'm in portrait
\endLandscape
I'm in landscape
\endPortrait
I'm in portrait again
With page-breaks.lua being the file hosted here: https://gist.github.com/DanChaltiel/e7505e62341093cfdc489265963b6c8f
This is far from perfect (for instance it won't work without the last portrait section), but it is quite useful sometimes.
I am using a MATLAB script to tune the control system on a machine. When the tuning is complete, I would like a report containing text (especially serial number, date/time and the values determined during tuning) and plots, especially transfer functions.
What do you recommend?
Whatever solution I use should be compatible with the MATLAB compiler so I can distribute my solution to a team of field engineers.
Ideally the report will be a PDF document.
The MATLAB report generator does not seem to be the right product as it appears that I have to break up my script into little pieces and embed them in the report template. My script contains opportunities for the user to intervene and change values or reject the tune if plots don't look right and my hunch is that this will be difficult if the code runs from the report generator. Also, I fear code structure and maintainability will be lost if the code structure is determined by the requirements of the report template.
Please comment if my assumptions are wrong.
UPDATE
I have now switched to the MATLAB Report Generator with release R2016b, and it is working very well for the users of my compiled code. Unfortunately, it means that colleagues who have a MATLAB licence need to buy the Report Generator too in order to run my tools as scripts.
As the MATLAB Report Generator's development manager, I am concerned that this question may leave the wrong impression about the Report Generator's capabilities.
For one thing, the Report Generator does not require you to break a script up into little pieces and run them inside a template. You can do this if you choose and in some circumstances, it makes sense, but it is not a requirement. In fact, many Report Generator applications use a MATLAB script or program to interact with a user, generate data in the MATLAB workspace, and as a final step, generate a report from the workspace data.
Moreover, as of the R2014b version, the MATLAB Report Generator comes with a document generation API, called the DOM API, that allows you to embed document generation statements in a MATLAB program. For example, you can programmatically create a document object, add and format text, paragraphs, tables, images, lists, and subdocuments, and output Microsoft Word, HTML, or PDF output, depending on the output type you select. You can even programmatically fill in the blanks in forms that you create, using Word or an HTML editor.
The API runs on Windows, Linux, and Mac platforms and generates Word and HTML output on all three, without the use of Word. On Windows, it uses Word under the hood to produce PDF output from the Word documents that it generates.
The latest release of the MATLAB Report Generator introduces a PowerPoint API with capabilities similar to the DOM API. If you need to include report generation in your MATLAB application, please don't rule out the MATLAB Report Generator based on past impressions. You may be surprised at just how powerful it has become.
I've done this quite a bit. You're right that MATLAB Report Generator is typically not a great solution. @Max suggests the right approach (automating Word through its COM interface), but I'd add a few extra comments and tips, based on my experience.
Remember that if you go with this solution, you are depending on your end users running Windows and having a copy of Office on their machines. If you ultimately want to produce a PDF report, that will need to be Office 2010 or above.
I would bet that you'll find it easier to automate the report generation in Excel rather than Word. Given that you're producing a report from MATLAB, you'll likely be wanting quite a lot of things in tables of numbers, which are easier to lay out in Excel.
If you are going to do it in Word, the easiest way is to first (without MATLAB) create a template .doc/.docx file, which contains any generic text that will be the same for all reports and blank tables for any information. Turn on track changes, and insert empty comments at each point that you will be filling in information. Then within your report creation routine in MATLAB, connect to Word and iterate through each comment, replacing it with whatever data you wish.
If you are learning to automate Excel from MATLAB, this page from the Excel Interop documentation is really helpful. There's an equivalent one for Word.
Unlike @Max, I've never had good results from saving figures to an .emf file and then inserting them. In theory that does preserve editability, but I've never found that valuable. Instead, get the figure looking right (and the right size) in MATLAB, then copy it to the clipboard with print(figHandle, '-dbitmap') and paste it into Excel with Worksheet.Range('A1').PasteSpecial.
To save as a PDF, use Workbook.ExportAsFixedFormat('xlTypePDF', pathToOutputFile).
Hope that helps!
I think you are right about the report generator.
In my opinion, the fastest and easiest approach would be to generate the report as an HTML document. For that you just need to save the figures and write a text file; conversion should be trivial.
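As a rough illustration of how little is needed, here is the idea sketched in Python; the same string building carries over to MATLAB with fprintf. The function name build_report, the field names and the sample values are all made up, and the figures are assumed to have been saved as image files beforehand.

```python
from html import escape


def build_report(serial, timestamp, values, figure_files):
    """Return a minimal HTML report as a string.

    values: mapping of parameter name to tuned value
    figure_files: paths of image files that were saved earlier
    """
    rows = "\n".join(
        f"<tr><td>{escape(name)}</td><td>{escape(str(value))}</td></tr>"
        for name, value in values.items()
    )
    images = "\n".join(
        f'<p><img src="{escape(path)}" alt="{escape(path)}"></p>'
        for path in figure_files
    )
    return (
        '<!DOCTYPE html>\n<html><head><meta charset="utf-8">'
        f"<title>Tuning report {escape(serial)}</title></head>\n<body>\n"
        "<h1>Tuning report</h1>\n"
        f"<p>Serial number: {escape(serial)}<br>Date/time: {escape(timestamp)}</p>\n"
        f"<table>\n{rows}\n</table>\n{images}\n</body></html>\n"
    )
```

Writing the returned string to a .html file next to the saved figures gives a report that opens in any browser and can be printed to PDF from there.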
A quite similar approach would be to create a LaTeX file and then create a PDF from it, though for this you'd need to install LaTeX on the machines you deploy to.
Lastly, you could use MATLAB's good Java integration. There are several libraries you could use, like this. But I wonder if all the complication would be worth it.
Have you considered driving Microsoft Word through its ActiveX interface? I've done this in compiled MATLAB programs and it works well. Look at the MATLAB help for actxserver(): the object you want to create is of type Word.Application.
Edit to add: To get figures into the document, save them as .emf files using the -dmeta argument to print(), then add them to the document like this:
WordServer.Selection.InlineShapes.AddPicture(fileName);
I am looking for a tool or set of tools to convert between file formats D and M where
D is a format handled by MSWord, in order of preference, docx, doc, rtf
M is a lightweight markup, such as markdown, textile, txt2tags, it can be an esoteric one
there is a way to generate html from M
conversion is two-way, it's done both from D to M, and from M to D
utf-8 encoding is handled properly
the content is simple, paragraphs, some simple formatting like bold and italics, maybe lists
the tools are platform-independent
What I've found so far
TeX, LaTeX -- too heavyweight
docx2txt -- too lightweight, it supports no formatting at all
html -- MSWord produces bloated html
a few one-way conversions, like doc to mediawiki
UPDATE:
The use case is a document workflow between technical and non-technical people
I, the technical guy, edit a document in plain text, put it into version control, etc.
I send it to my manager or other non-technical people
They add comments, make changes to it using their Word, then they send it back to me
I want to simply grok their changes, make my changes, put it into version control, without having to use Word
I think that Pandoc more than meets all of these requirements.
http://pandoc.org
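A minimal sketch of the round trip from the use case above, driving the pandoc command line from Python. It assumes pandoc is installed and on the PATH; the helper names pandoc_command and convert and the file names are made up. Pandoc infers the input format from the file extension, so only the output format is passed explicitly.

```python
import subprocess


def pandoc_command(source, target, to_format):
    """Build the pandoc call that converts `source` into `target`."""
    return ["pandoc", source, "-t", to_format, "-o", target]


def convert(source, target, to_format):
    """Run pandoc and raise CalledProcessError if the conversion fails."""
    subprocess.run(pandoc_command(source, target, to_format), check=True)


if __name__ == "__main__":
    # Hypothetical file names for the workflow described in the question.
    convert("notes.md", "notes.docx", "docx")         # send this to the reviewers
    convert("notes_reviewed.docx", "notes.md", "markdown")  # read their version back
    convert("notes.md", "notes.html", "html")         # HTML from the same source
```

Whether Word comments and tracked changes survive the trip back depends on the pandoc version and its docx reader options, so that part is worth checking against the pandoc manual before relying on it.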
Adam, I've used docx4j to convert docx to HTML, edit the HTML in CKEditor, and then use docx4j to convert the HTML back to docx. My process made some assumptions about the CSS (i.e. it was designed to handle docx4j's clean HTML, and editing in CKEditor).
You don't say whether there is a way to generate M from HTML?
This is probably hard to do two-way, since you will have impedance mismatches between the various formats.
The best world I can think of would be a sort of Wiki / Word hybrid: Maybe you can get Google Wave to do that for you?
Another solution that might work is a CMS like Plone (did they ever add WYSIWYG capability? I stopped caring after version 1). Keep your documents there. Let the system handle changes, annotations, etc. You can automate retrieval of the source (should be reStructuredText) and commit that to your source control if you have to.
This script I wrote might help you in your workflow:
https://github.com/matb33/docx2md
It is a command-line PHP script that will only work with .docx files. It will extract the XML, run some XSL transformations, and provide you the result in Markdown format.
I encourage you to send me .docx files that don't convert accurately. I'd love to make this script as robust and reliable as possible.
I am using the MS Word API to generate a .docx that contains data fetched from a DB, to which I apply the respective styles, fonts, symbols, etc. If the amount of data fetched from the DB is large, there is a problem displaying it in the .docx file. I found that MS Word 2007 internally writes some content through tags that may not be needed to display the data. So I am trying to figure out which MS Word tags are necessary when converting to an .xml file, so that I can avoid unnecessary tags and build only the tags needed to display the data. I am therefore planning to write my own .xml with only the MS Word tags that are needed, rather than generating an .xml from a .docx file.
My questions are:
1) Is it right that MS Word generates some tags that may not be needed when converting a .docx to document.xml, and that this makes it heavy? If so, which tags are they, so that I can avoid them when I write my own .xml file?
2) Please send links that explain the MS Word tags and their advantages, and which tags are needed and which are not.
3) Is my approach of writing a new .xml similar to document.xml (from the .docx conversion) a worthwhile one to pursue, so that I build the .xml with only the tags I need and thereby improve the performance of the data display?
Please shed some light into it and thanks in advance..
Thanks,
Rithu
You'll want to learn WordprocessingML in much more detail to do this. It certainly isn't impossible, but it is quite a learning curve to start with. Probably the best place to start is with this eBook. If you go the manual route, you'll need a zip technology. If you're in Visual Studio, you can make the writing of all of this easier by using the Open XML SDK.
As to your questions on 'unnecessary tags', it's hard to believe that there would be much at all in the file that is unnecessary. But that depends on what you consider not needed. For example, if a word is caught as misspelled, there will be a "dirty=1" attribute on the Run tag. If you're okay with displaying misspelled words, then that could be considered unnecessary. It really depends on what you're displaying it for and in what.
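Before deciding what to leave out, it can help to measure what is actually in the file. The following Python sketch (the function name tag_histogram is made up) counts how often each element occurs in a document.xml that has already been extracted from the .docx; it says nothing about which elements are safe to drop.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def tag_histogram(document_xml: str) -> Counter:
    """Count the elements in a WordprocessingML document by local tag name."""
    root = ET.fromstring(document_xml)
    # ElementTree writes tags as "{namespace}name"; keep only the name part.
    return Counter(element.tag.split("}")[-1] for element in root.iter())
```

Calling tag_histogram(xml).most_common(10) on a large generated file shows whether the bulk of the size comes from the data itself (paragraphs, runs, table cells) or from markup that Word added around it.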