I have a nunit3 result file, I need to parse and see if the format is right. I tried with nunit-console, does not have an parser option. How can i do the parse?
Either use a tool that can parse it or write your own.
A web search for a "NUnit Results Viewer" should give you some choices for existing tools.
You can see the format spec here:
https://github.com/nunit/docs/wiki/Test-Result-XML-Format
Related
NUnit console outputs run settings, which include the Test Parameters. I do not want it displayed. Is it possible? I use nunit3-console.
There are no options to change the default console text output, which is aimed at immediate developer feedback rather than reporting. Choices for you would be
Direct the output to a text file and edit it, perhaps automatically.
Create your own report from the XML result file, which contains all the necessary components. You can do this either by writing a resfoundult writer extension to the engine or creating an XSL transform that produces the result you want. I've generally found it best to just pick the technology (C# vs XSLT) that you most enjoy. :-)
Postgres is an open-source project and it has DocBook as default format for its documentation. At a first glance it looks like a tree of *.sgml files in the doc directory inside a repository.
There are several pre-defined convertion output formats, but unfortunately Emacs' native one is ignored.
Does it possible to get Postgres documentation as a postgres.info.gz file?
That's basically nothing more than a text conversion problem. I believe the right solution here would be to write an XSL that converts the XML in your SGML files to TeXinfo source code, but the next best thing:
pandoc is a parser for different textual document file formats. It has a reader for Docbook and a writer for texinfo. That should get you started.
I have a large amount of PDF files in my local filesystem I use as documentation base and I would like to create an index of these files.
I would like to :
Parse the contents of the PDF files to get keywords.
Select the most relevant keywords to make a summary.
Create static HTML pages for some keywords with entries linked to the appropriate files.
My questions are :
Is there an existing tool to perform the whole job ?
What is the most appropriate tool to parse PDF files content, filter (by words size) and counting the words?
I consider using Perl, swish-e, pdfgrep to make a script. Do you know other tools which could be useful?
Given that points 2 and 3 seem custom I'd recommend to have your own script, use a tool out of it to parse pdf, process its output as you please, and write HTML (perhaps using another tool).
Perl is well suited for that, since it excels in processing that you'll need and also provides support for working with all kinds of file formats, via modules.
As for reading pdf, here are some options if your needs aren't too elaborate
Use CAM::PDF (and CAM::PDF::PageText) or PDF-API2 modules
Use pdftotext from the poppler library (probably in poppler-utils package)
Use pdftohtml with -xml option, read the generated simple XML file with XML::libXML or XML::Twig
The last two are external tools which you use via Perl's builtins like system.
The following text processing, to build your summary and design the output, is precisely what languages like Perl are for. The couple of tasks that are mentioned take a few lines of code.
Then write out HTML, either directly if simple or using a suitable module. Given your purpose, you may want to look into HTML::Template. Also see this post, for example.
Full parsing of PDF may be infeasible, but if the files aren't too complex it should work.
If your process for selecting keywords and building statistics is fairly common, there are integrated tools for document management (search for bibliography managers). However, I think that most of them resort to external tools to parse pdf so you may still be better off with your own script.
Actually I know that there is Test::Valgrind::Parser::XML perl module. But I have no idea how to use it: If anyone can provide documentation it would be great.
The valgrind docs show that valgrind accepts a --xml=yes tag to output messages as XML. The format of the XML is specified in the docs/internals/xml-output-protocol4.txt inside the source code repository.
With that, you can use any XML parser and do whatever you want with the data.
I need some help with conversion of DocBook files to Microsoft Word files.
Do I need an XSL file for the transformation?
Yes, you do need an XSL file. You can get XSL files for DocBook from the free DocBook XML distribution. Then, you run a free XSLT transformer such as Saxon. If you run Saxon from a command line, you give it the name of your DocBook file, and the name of one of the stylesheets, and it will transform your file according to the rules in the stylesheet.
What you need to do to transform to Word, is to pick the right stylesheet.
From DocBook XSL: The Complete Guide, here are three possibilities:
Convert to XSL-FO and then use the XMLmind to export to Word. See the XMLmind website for more information.
Use a limited set of tags and then use one of DocBook XML's included stylesheets to output to WordML.
Try to use Jfor to output to RTF, although Jfor no longer appears to be maintained.
And I have one of my own:
As above, use one of DocBook XML's included stylesheets to publish to XSL-FO, then run Apache FOP to convert from XSL-FO to RTF. You will lose the structural information, but you will keep a certain amount of the formatting.
I recently implemented same feature for our users. They use Oxygen XML editor that allows for easy transformations via XSL. I was going to do OOXML but settled on WordML. As a starting point I used roundtrip XSL, but I had to rewrite lots of templates because of existing bugs or just missing functionality. In addition, I did other customization to serve a purpose or for our XML file only.
I would not mind contributing back to the project, but don't really know how to get about it.
I know this is an 11 years old question. But now, in 2022 you can use pandoc to convert DocBook to MS Word (docx).
pandoc --from docbook --to docx --output filename.docx filename.docbook
I am using XQuery to transform DocBook into various formats using XQuery typeswitch library. XQuery uses indexes so I can transform many documents very quickly.