Is it possible to collect \section's in \mainpage? (Doxygen)

I need to present documentation with a substantial \mainpage, but I would like to keep the various \section, \subsection, etc. near the code implementing them.
The best I could do is to mark the various fragments with \page and include them with \subpage.
This is suboptimal because it actually creates another page (unsurprisingly!), whereas I would like to have sections/subsections inline.
What I have now is:
a \mainpage with some text and a bunch of \subpage xxx links
documentation scattered around the sources, including:
per-file \file documentation
class/function/whatever-specific documentation
the occasional \page xxx to be referenced from \mainpage
This works and I can construct meaningful documentation.
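For concreteness, the layout described above looks roughly like this (a minimal sketch; the page names and text are placeholders):

/** \mainpage Overview
 *  Introductory text...
 *  - \subpage impl_details
 */

/* ...elsewhere, next to the implementing code... */

/** \page impl_details Implementation details
 *  Text that I would rather have appear inline,
 *  as a \section of the main page.
 */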
The problem is that the \subpage construct generates a new page (in HTML) or a separate chapter (in LaTeX/PDF); I would like to control this better and, if possible, put all the "subpages" together in the same "chapter", as if I had used \section, \subsection, and similar constructs.
The current structure is kind-of-OK in HTML, but it quickly becomes unmanageable in LaTeX, especially if I start using \subpage links inside subpages to simulate \subsection, \subsubsection, and friends.
Is it possible at all?

How is Perl useful as a metadata tool?

In The Pragmatic Programmer:
Normally, you can simply hide a third-party product behind a
well-defined, abstract interface. In fact, we've always been able to
do so on any project we've worked on. But suppose you couldn't isolate
it that cleanly. What if you had to sprinkle certain statements
liberally throughout the code? Put that requirement in metadata, and
use some automatic mechanism, such as Aspects (see page 39) or Perl,
to insert the necessary statements into the code itself.
Here the author is referring to Aspect-Oriented Programming and Perl as tools that support "automatic mechanisms" for inserting metadata.
In my mind I envision some type of run-time injection of code. How does Perl allow for "automatic mechanisms" for inserting metadata?
Skip ahead to the section on Code Generators. The author provides a number of examples of processing input files to generate code, including this one:
Another example of melding environments using code generators happens when different programming languages are used in the same application. In order to communicate, each code base will need some information in common: data structures, message formats, and field names, for example. Rather than duplicate this information, use a code generator. Sometimes you can parse the information out of the source files of one language and use it to generate code in a second language. Often, though, it is simpler to express it in a simpler, language-neutral representation and generate the code for both languages, as shown in Figure 3.4 on the following page. Also see the answer to Exercise 13 on page 286 for an example of how to separate the parsing of the flat file representation from code generation.
The answer to Exercise 13 is a set of Perl programs used to generate C and Pascal data structures from a common input file.
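The Exercise 13 programs are written in Perl, but the idea is language-agnostic. Here is a minimal sketch of the same technique in Python; the neutral type names, the field list, and the struct name are invented for illustration:

# Read a language-neutral record description and emit a C struct.
# The neutral-type-to-C mapping here is an assumption for the sketch.
TYPE_MAP = {'string': 'char *', 'integer': 'int', 'real': 'double'}

def generate_c_struct(name, fields):
    """fields is a list of (field_name, neutral_type) pairs."""
    lines = ['typedef struct {']
    for field_name, neutral_type in fields:
        lines.append('    %s %s;' % (TYPE_MAP[neutral_type], field_name))
    lines.append('} %s;' % name)
    return '\n'.join(lines)

# An 'Order' record described once, then rendered as C.
print(generate_c_struct('Order',
                        [('id', 'integer'),
                         ('customer', 'string'),
                         ('amount', 'real')]))

A Pascal back end would just be a second small function over the same field list, which is the point: the shared description lives in exactly one place.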

What is the MathWorks way to generate MATLAB HTML documentation?

I am working on shared MATLAB code and we would like to share generated documentation as searchable HTML documents within our local network.
I know of the following methods to generate a documentation:
Write a converter to C++-like files. This is done in Using Doxygen with MATLAB (last updated 2011) and mtoc++ (last updated 2013). The C++-like files are then parsed by Doxygen.
Use Python's sphinxcontrib-matlabdomain to generate HTML documentation.
Use m2html which is also a third-party solution.
Further options are listed in these Q&As: One, Two, and Three.
None of these options is supported by MathWorks, and all of them require me to document things like function parameters myself. None of them analyzes the code itself the way Doxygen does for, e.g., Java:
//! an object representation of the advertisement package sent by the beacon
private AdvertisementPackage advertisementPackage;
I have heard of MATLAB's publish() function, but I have never seen it used in this sense.
Question: What is the MathWorks way to generate MATLAB HTML documentation? Can the code itself be analyzed? Can I reuse the information already provided to the MATLAB inputParser? Please mention your personal preference in the comments.
Example:
%% Input parser
p = inputParser;
addRequired(p, 'x', @isnumeric);
validationFcn = @(x) (isnumeric(x) && isscalar(x));
addRequired(p, 'fftSize', validationFcn);
addRequired(p, 'fftShift', validationFcn);
validationFcn = @(x) (isa(x, 'function_handle'));
addRequired(p, 'analysisWindowHandle', validationFcn);
parse(p, x, fftSize, fftShift, analysisWindowHandle);
I think you've researched this topic (how to generate HTML documentation from MATLAB functions) well; now it's up to you to choose which method works best for you.
The publish function can be used to author documentation. You write regular M-files with specially crafted comments (in fact, the file can be all comments with no code), then you publish the file to obtain rendered HTML (other targets such as PDF, DOC, and LaTeX are also supported). Think of it as a simpler, MATLAB-specific version of the Markdown used here on Stack Exchange sites to format posts.
One aspect you didn't mention is integrating the generated documentation into the built-in Help viewer. This is done by creating info.xml and demos.xml files and organizing the documentation in a specific way. You can also make your custom docs searchable by building Lucene index files with the builddocsearchdb function (which internally powers the search for custom docs in MATLAB). Note that it doesn't matter how you generated the HTML docs; you could have used publish or even hand-written HTML files.
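From memory, a minimal info.xml looks something like the sketch below; treat the element values as placeholders and consult the custom-documentation page linked in another answer for the authoritative schema:

<productinfo>
    <matlabrelease>2013b</matlabrelease>
    <name>My Toolbox</name>
    <type>toolbox</type>
    <icon></icon>
    <help_location>html</help_location>
</productinfo>

With the generated HTML in that html folder, a call such as builddocsearchdb(fullfile(pwd, 'html')) builds the search index.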
In fact, the publish-based workflow is extensible, and you can use it in interesting ways by creating custom XSL template files to transform and render the parsed comments. For example, I've seen it used to render equations with MathJax instead of relying on the built-in solution. Another example is publishing to MediaWiki markup (the format used by Wikipedia). Other people use it to write blog posts (the official blogs on MATLAB Central are created this way), or even to generate text files later processed by static site generators (such as Jekyll and Octopress).
As far as I know, there are no public tools that inspect MATLAB code at a deeper level and analyze function parameters. The best I could come up with is using reflection to obtain some metadata about functions and classes, although that solution is not perfect...
MathWorks seems to be using their own internal system to author HTML documentation. Too bad they don't share it with us users :)
I think this is the officially sanctioned MathWorks way of writing documentation:
http://www.mathworks.co.uk/help/matlab/matlab_prog/display-custom-documentation.html
Basically, write the HTML, then add a bunch of files to make it searchable and displayable in the MATLAB documentation.
There is an easy way to use publish with a function and its corresponding inputs; have a look at:
publish('test', struct('codeToEvaluate', 'test(inputs);', 'showCode', false))

Are there reliable algorithms for generating robust DOM node selectors, given only the target node?

In writing a scraper, we typically use some kind of selector to identify particular nodes of interest. Ideally the selectors should continue to work even as the page changes over time. A lot of the common approaches like grabbing nodes by id are fragile on frequently updated pages and impossible on some nodes. I'm trying to find good algorithms for generating robust selectors, but since there doesn't seem to be a standard terminology for this problem, it's hard to find everything that's out there.
Here are the selector DSLs I already know:
XPath selectors - implemented everywhere from JS to the popular Python and Ruby scraping libraries.
CSS selectors - found in many of the places where you can find XPath selectors.
High-level selectors - here I'll give the example of Chickenfoot, which allows users to write click("begin tutorial") to find a link with the text "Begin Tutorial". These are usually implemented on top of XPath and CSS selectors. I'd love to find out about more members of this language family.
Visual selectors - this would be the approach taken by, for instance, Sikuli, which makes it appear as though the program is calling a function on a screengrab of the relevant node. I don't know of any web-specific instances of this approach, but I imagine there are some.
Here are the selector generation algorithms I already know. By a selector generation algorithm I mean an algorithm that takes a node as input and produces a robust selector as output.
iMacros: finds all elements with the same node type and text as the target element, and finds the target element's index in this list. Uses the node type, text, and index as the selector. Also includes the id for forms and form elements.
CoScripter: uses the element's text if available; if not, uses the preceding text.
Selenium: uses the id where available, and various other attributes otherwise, such as image alt text and the displayed text of links and buttons.
Wargo System: uses the element text.
Many systems: use the XPath from the root to the target node, or some suffix of that XPath.
All of these selector generation algorithms fail on some nodes. Are there better approaches out there? Or other approaches that I could combine with these algorithms to produce a better hybrid algorithm?
When I started investigating this topic for some work I am doing, I was also surprised by how little information is available on this topic.
I did find this 2003 paper, but unfortunately, I only have access to the abstract:
Abe, Mari, and Masahiro Hori. "Robust Pointing by XPath Language: Authoring Support and Empirical Evaluation." In Proceedings of the 2003 Symposium on Applications and the Internet (SAINT '03), 156–. Washington, DC, USA: IEEE Computer Society, 2003.
For my own use, I followed the approach in Tim Vasil's 50-line jQuery plugin. I won't reproduce the code, which is available at that link; instead I'll describe it:
It recursively traverses up the DOM tree from the element, building a selector "backwards". At each level:
If the node has an ID, just use that and skip all the parents; they aren't added to the selector.
If the node has a tag name or a set of classes that is unique among its siblings, use that as this level's selector. Otherwise, use :nth-child.
Since I will be storing element contents between visits to a page, I'm thinking of implementing some "blunder detection" here, maybe using a percentage change from the last visit to detect whether the selector may be grabbing the wrong element.
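The original plugin is jQuery, but the algorithm is easy to sketch outside the browser. Here is a rough Python version over a minimal stand-in node class (the Node type and its fields are invented for illustration; a real implementation would walk actual DOM elements):

class Node:
    # Minimal stand-in for a DOM element.
    def __init__(self, tag, id=None, classes=(), parent=None):
        self.tag = tag
        self.id = id
        self.classes = set(classes)
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

def build_selector(node):
    # Walk up from the target, collecting one selector step per level.
    parts = []
    while node is not None:
        if node.id:
            # IDs are assumed unique: use one and stop climbing.
            parts.append('#' + node.id)
            break
        siblings = node.parent.children if node.parent else [node]
        if sum(s.tag == node.tag for s in siblings) == 1:
            parts.append(node.tag)  # tag name is unique among siblings
        else:
            unique_cls = next((c for c in sorted(node.classes)
                               if sum(c in s.classes for s in siblings) == 1),
                              None)
            if unique_cls:
                parts.append(node.tag + '.' + unique_cls)
            else:
                # Fall back to position; :nth-child is 1-based.
                parts.append('%s:nth-child(%d)'
                             % (node.tag, siblings.index(node) + 1))
        node = node.parent
    return ' > '.join(reversed(parts))

# Example: <html><body><div id="content"><p/><p class="note"/></div></body></html>
html = Node('html')
body = Node('body', parent=html)
div = Node('div', id='content', parent=body)
p1 = Node('p', parent=div)
p2 = Node('p', classes=('note',), parent=div)
print(build_selector(p2))   # #content > p.note

The blunder detection mentioned above would then sit on top of this: store a fingerprint of the selected element's contents and flag the selector when the match drifts too far on a later visit.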

How to diff hierarchical-data?

Are there any tools which diff hierarchies?
That is, consider the following hierarchy:
A has child B.
B has child C.
which is compared to:
A has child B.
A has child C.
I would like a tool that shows that C has moved from a child of B to a child of A. Do any such utilities exist? If there are no specific tools, I'm not opposed to writing my own, so what are some good algorithms which are applicable to this problem?
A great general resource for diffing hierarchies (not just XML, HTML, etc.) is the Hierarchical-Diff GitHub project, based on a bit of Dartmouth research. They have a pretty extensive list of related work, ranging from XML diffing to configuration-file diffing to HTML diffing.
In general, actually performing diffs/patches on tree structures is a fairly well-solved problem, but displaying those diffs in a manner that makes sense to humans is still the wild west. That's doubly true when your data structure already has some semantic meaning, like with HTML.
You might consider our SmartDifferencer tools.
These tools compare computer source code files in a diff-like way. Unlike diff, which is line-oriented, these tools see changes in terms of code structure (variable names, expressions, statements, blocks, functions, classes, etc.) and plausible edits ("move, insert, delete, replace, copy, rename"), producing answers that make sense to programmers.
Computer source code has exactly the "hierarchy" structure you are suggesting; the various constructs nest. Specific to your topic, code blocks can typically nest inside code blocks. The SmartDifferencer tools use target-language-accurate parsers to "deconstruct" the source text into these hierarchical entities. We have a SmartDifferencer for XML, in which you can obviously write nested tags.
The answer isn't reported as "the Nth child of M has moved", although it is actually computed that way by operating on the parse trees produced by the parsers. Rather, it is reported as "code fragment of <type> at line x col y to line a col b has moved/...".
The answer, my good sir, is: depth-first search, also known as depth-first traversal. You might also find some use for the Visitor pattern.
You can't swing a dead cat without hitting some sort of implementation for this when dealing with comparing XML trees. Take a gander at diffxml for an example.
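As a concrete starting point, here is a naive Python sketch of the "moved node" check from the question: walk both trees depth-first, record each label's parent, and report labels whose parent changed. It assumes labels are unique within a tree (true of the A/B/C example, but not of real XML), and that hard part is exactly what real differs have to solve:

# A tree is a (label, [children]) tuple.
def parent_map(tree, parent=None, out=None):
    # Depth-first traversal recording each label's parent label.
    if out is None:
        out = {}
    label, children = tree
    out[label] = parent
    for child in children:
        parent_map(child, label, out)
    return out

def find_moves(old_tree, new_tree):
    old, new = parent_map(old_tree), parent_map(new_tree)
    for label in old.keys() & new.keys():
        if old[label] != new[label]:
            yield label, old[label], new[label]

# The example from the question: C moves from under B to under A.
before = ('A', [('B', [('C', [])])])
after  = ('A', [('B', []), ('C', [])])
print(list(find_moves(before, after)))   # [('C', 'B', 'A')]

Everything beyond this (non-unique labels, renames, minimal edit scripts) is where the tree-edit-distance literature and tools like diffxml come in.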

Are there any tools to visualize template/class methods and their usage?

I have taken over a large code base and would like to get an overview of how and where certain classes and their methods are used.
Is there any good tool that can somehow visualize the dependencies and draw a nice call tree or something similar?
The code is in C++ in Visual Studio if that helps narrow down any selection.
Here are a few options:
CodeDrawer
CC-RIDER
Doxygen
The last one, Doxygen, is more of an automatic documentation tool, but it is capable of generating dependency graphs and inheritance diagrams. It's also licensed under the GPL, unlike the first two, which are not free.
When I have used Doxygen it has produced a full list of callers and callees. I think you have to turn it on.
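If memory serves, the relevant Doxyfile options are along these lines (the graphical call graphs additionally require Graphviz):

EXTRACT_ALL            = YES
REFERENCED_BY_RELATION = YES    # list the callers of each documented function
REFERENCES_RELATION    = YES    # list the functions each one calls
HAVE_DOT               = YES    # use Graphviz dot for the graphical output
CALL_GRAPH             = YES
CALLER_GRAPH           = YES

Check the manual for your Doxygen version; the exact option set has shifted over releases.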
David, thanks for the suggestions. I spent the weekend trialing the programs.
Doxygen seems to be the most comprehensive of the 3, but it still leaves some things to be desired in regard to callers of methods.
All 3 seem to have problems with C++ templates to varying degrees. CC-RIDER simply crashed in the middle of the analysis, and CodeDrawer does not show many of the relationships. Doxygen worked pretty well, but it too did not find and show all relations, and instead overwhelmed me with lots of macro references until I filtered them out.
So maybe I should clarify "large code base" a bit for any further suggestions: >100k lines of code overall, spread out over more than 100 template files, plus several actual class files pulling it all together.
Are there any other tools out there that might be up to the task and could do better (i.e., be more thorough)? Oh, and specifically: anything that understands IDL and COM interfaces?
When I have used Doxygen it has produced a full list of callers and callees. I think you have to turn it on.
I did that, of course, but as I mentioned, Doxygen does not consider interfaces between objects as they are defined in the IDL. It "only" shows direct C++ calls.
Don't get me wrong, it is already amazing what it does, but it is still not complete from my high level view trying to get a good understanding of how everything fits together.
In Java I would start with JDepend. In .NET, with NDepend. Don't know about C++.