DocBook macros? - macros

Is there any way of defining macros (like TeX macros or LaTeX defines) in DocBook documents?
DocBook is very verbose, and macros would help a lot. I didn't find them
in quickstart tutorials.
If so, could anyone provide a simple example or a link to one?
Thanks

I'm not sure this is exactly what you want or whether it fulfills your requirements, but I'm thinking of ENTITY declarations. You can define them at the top of your XML document (this is general XML, nothing DocBook-specific), as shown below for 'doc.release.number' and 'doc.release.date'. They can also be pulled in from a separate file, as in the third ENTITY line, where SYSTEM means the declarations come from another file, 'entities.ent'.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd" [
<!ENTITY doc.release.number "1.0.0.beta-1" >
<!ENTITY doc.release.date "April 2010" >
<!ENTITY % entities SYSTEM "entities.ent" >
%entities;
]>
<!-- This document is based on http://readyset.tigris.org/nonav/templates/userguide.html -->
<article lang="en">
<articleinfo>
<title>&project.impl.title; - User Manual</title>
<subtitle></subtitle>
<date>&project.impl.release.date;</date>
<copyright>
<year>&doc.release.year;</year>
<holder>Team - &project.impl.title;</holder>
</copyright>
<releaseinfo>&doc.release.number;</releaseinfo>
</articleinfo>
<section>
<title>Introduction</title>
<para>
The &project.impl.title; has been created to clean up (X)HTML and XML documents as part of
</para>
</section>
</article>
In the document you reference an entity by wrapping its name in a leading & and a trailing ;, as in &project.impl.title;
In the file 'entities.ent' you specify the ENTITY elements in a similar way:
<?xml version="1.0" encoding="UTF-8"?>
<!ENTITY project.impl.title 'Maven Tidy Plug-in' >
<!ENTITY project.impl.group-id 'net.sourceforge.docbook-utils.maven-plugin' >
<!ENTITY project.impl.artifact-id 'maven-tidy-plugin' >
<!ENTITY project.impl.release.number '1.0.0.beta-1' >
<!ENTITY project.impl.release.date 'April 2010' >
<!ENTITY project.impl.release.year '2010' >
<!ENTITY project.impl.url '../' >
<!ENTITY project.spec.title '' >
<!ENTITY project.spec.release.number '' >
<!ENTITY project.spec.release.date '' >
<!ENTITY doc.release.year '2010' >
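Entity replacement text is not limited to plain strings: it may contain markup, which gets closer to a (parameterless) macro. The following is a minimal sketch, assuming DocBook 4.4 as in the example above; the entity names 'product' and 'beta.warning' and the warning text are made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
  "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd" [
  <!-- Hypothetical entity names, chosen for illustration only. -->
  <!-- An entity whose replacement text is an inline element. -->
  <!ENTITY product "<productname>Maven Tidy Plug-in</productname>">
  <!-- An entity whose replacement text is a whole block of boilerplate. -->
  <!ENTITY beta.warning "<warning><para>This feature is still in beta.</para></warning>">
]>
<article lang="en">
  <title>&product; - User Manual</title>
  <section>
    <title>Introduction</title>
    <!-- The parser substitutes the full warning block here. -->
    &beta.warning;
    <para>&product; cleans up (X)HTML and XML documents.</para>
  </section>
</article>
```

Entities cannot take arguments, so this only covers fixed snippets; anything that needs parameters has to be handled by a stylesheet or a preprocessor.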

Not exactly what you asked for, but perhaps helpful in some cases: in your wrapper (customization) stylesheet you can define templates that emit FO markup directly. Some examples:
Code:
<xsl:template match="symbolchar">
<fo:inline font-family="Symbol">
<xsl:choose>
<xsl:when test=".='ge'">≥</xsl:when>
<xsl:when test=".='le'">≤</xsl:when>
<xsl:when test=".='sqrt'">√</xsl:when>
<xsl:otherwise>?!?</xsl:otherwise>
</xsl:choose>
</fo:inline>
</xsl:template>
Usage:
<symbolchar>le</symbolchar>
Code:
<xsl:template match="processing-instruction('linebreak')">
<fo:block/>
</xsl:template>
Usage:
<?linebreak?>

Have you considered generating DocBook from another format (like reStructuredText?)
I found it quite nice for documentation.
Also, you could probably write a macro preprocessor (or look into m4) pretty quickly. If you are using the XML version of DocBook, a simple XSLT will do. Just make up some tags and transform them. Have boilerplate stuff added automatically. And get ready to be really angry at XSLT. For not being all it could be. For making your thinking warp.
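The "make up some tags and transform them" idea could look roughly like the XSLT 1.0 sketch below. The macro elements 'key' and 'tip-box' and the file names are hypothetical, and the DocBook 4.4 DOCTYPE is an assumption; everything that is not a macro is copied through unchanged.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Re-emit a DocBook DOCTYPE on the output (assumed version: 4.4). -->
  <xsl:output method="xml" encoding="UTF-8"
              doctype-public="-//OASIS//DTD DocBook XML V4.4//EN"
              doctype-system="http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"/>

  <!-- Identity template: copy every node and attribute that no
       more specific template below claims. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Hypothetical macro: <key>Ctrl</key> becomes <keycap>Ctrl</keycap>. -->
  <xsl:template match="key">
    <keycap><xsl:apply-templates/></keycap>
  </xsl:template>

  <!-- Hypothetical macro: <tip-box title="...">text</tip-box> expands
       to the full DocBook tip structure. -->
  <xsl:template match="tip-box">
    <tip>
      <title><xsl:value-of select="@title"/></title>
      <para><xsl:apply-templates/></para>
    </tip>
  </xsl:template>

</xsl:stylesheet>
```

With xsltproc this would be run as something like `xsltproc macros.xsl manual-src.xml > manual.xml`, and the normal DocBook toolchain then processes manual.xml. Note that the source file is not valid DocBook until the macros have been expanded.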


VSCode: Delete all occurrences of xml tag pair including differing contents

I'm working in a kml (xml) file in VSCode. There are 267 instances of the <description></description> tags with the same contents schema but different contents. I would like a fast way to delete all of the instances of <description> including the contents instead of manually deleting each one. I'm not married to VSCode if Notepad++ or another editor will do what I'm trying to do.
Use one command/macro to delete both of these (plus 265 more)
<description><![CDATA[<center><table><tr><th colspan='2' align='center'>
<em>Attributes</em></th></tr><tr bgcolor="#E3E3F3">
<th>NAME</th>
<td>Anderson</td>
</tr><tr bgcolor="#E3E3F3">
</tr></table></center>]]>
</description>
<description><![CDATA[<center><table><tr><th colspan='2' align='center'>
<em>Attributes</em></th></tr><tr bgcolor="#E3E3F3">
<th>NAME</th>
<td>Billingsly</td>
</tr><tr bgcolor="#F00000">
</tr></table></center>]]>
</description>
Thank you, Paul
You can use this regex in vscode find/replace:
\n?<description>[\S\s\n]*?<\/description>\n?
and replace with nothing. The \n?'s at the beginning and end are there if you want to delete the lines the tags occur on as well - see how it works, you can remove those if you don't care about empty lines where your deleted content used to be.
Obviously, if you have malformed input, like unmatched <description> or </description> tags the regex won't work.
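If you would rather not rely on a regex, a different technique is to run the file through a small XSLT 1.0 stylesheet that copies everything except the description elements. This is only a sketch: it assumes the file is well-formed XML, and the file names in the usage line are hypothetical. It matches on local-name() so that it works whether or not the KML namespace is declared.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy all nodes and attributes unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Empty template: any element named "description", in any
       namespace, is dropped together with its contents. -->
  <xsl:template match="*[local-name()='description']"/>

</xsl:stylesheet>
```

Usage with xsltproc would be something like `xsltproc strip-description.xsl input.kml > output.kml`. Be aware that CDATA sections elsewhere in the file are written back as ordinary escaped text, and the whitespace around the removed elements is left behind.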

SGML Named entity for U+1e7c (Ṽ) & U+1e7d (ṽ)?

I need to use named entities for special characters, but I am unable to find any for the two characters U+1E7C (Ṽ) and U+1E7D (ṽ). I searched but could not find them in any of the lists available online. Kindly help.
I'm not sure if there are named entities for those characters. You can make your own though, or just use either the hex (&#x1E7C; and &#x1E7D;) or dec (&#7804; and &#7805;) references.
Example of creating your own named entities:
<!ENTITY Vtilde "&#x1E7C;">
<!ENTITY vtilde "&#x1E7D;">
Example usage:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd" [
<!ENTITY Vtilde "&#x1E7C;">
<!ENTITY vtilde "&#x1E7D;">
]>
<html>
<head>
<title></title>
</head>
<body>
<p>Here is the uppercase V with a tilde char: "&Vtilde;".</p>
<p>Here is the lowercase v with a tilde char: "&vtilde;".</p>
</body>
</html>

XSLT Stylesheet works with a vbs script, but not Perl

I have a program that is essentially a search application, and it exists in both VBScript and Perl form (I'm trying to make an executable with a GUI).
Currently the search application outputs an HTML file, and if a section of text in the HTML is longer than twelve lines then it hides anything after that and includes a clickable More... tag.
This is done in XSLT and works with VBScript.
I literally copied and pasted the stylesheet into the Perl program that I'm using and it does everything right except for the More... tag.
Is there any reason why it would be working with the VBScript but not Perl?
I'm using XML::LibXSLT in the Perl script, and here is the template that is supposed to be creating the More... tag
<xsl:template name="more">
<xsl:param name="text"/>
<xsl:param name="row-id"/>
<xsl:param name="cycle" select="1"/>
<xsl:choose>
<xsl:when test="($cycle > 12) and contains($text,'&#13;')">
<span class="show" onclick="showID('SHOW{$row-id}');style.display = 'none';">More...</span>
<span class="hidden" id="SHOW{$row-id}">
<xsl:call-template name="highlight">
<xsl:with-param name="text" select="$text"/>
</xsl:call-template>
</span>
</xsl:when>
<xsl:when test="contains($text,'&#13;')">
<xsl:call-template name="highlight">
<xsl:with-param name="text" select="substring-before($text,'&#13;')"/>
</xsl:call-template>
<xsl:text>
</xsl:text>
<xsl:call-template name="more">
<xsl:with-param name="text" select="substring-after($text,'&#13;')"/>
<xsl:with-param name="row-id" select="$row-id"/>
<xsl:with-param name="cycle" select="$cycle + 1"/>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:call-template name="highlight">
<xsl:with-param name="text" select="$text"/>
</xsl:call-template>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
I believe the problem is that your XSLT is searching the text for CR characters (&#13;).
Windows files use the CR LF character pair to terminate each line of text, and a version of Perl running on a Windows system will strip the CR to leave just the LF.
This provides an API compatible with Linux-like systems, which use just LF in the first place, but means your XSLT stylesheet doesn't find any CR characters when it looks for them.
I suggest you change your stylesheet to search instead for LF characters, which will be present at the end of every line regardless of the file's origin, and will be seen by both Perl and VBScript.
I think character codes are best expressed in hex, so you would change '&#13;' to '&#x0A;' throughout your XSLT code.
Note
By making this change your strings will be left with a trailing CR after you use substring-before($text, '&#x0A;'). You can either leave this in place (I don't think it will do any harm, as it won't be rendered by a browser) or remove it from the string that you pass to the more template when you call it:
<xsl:call-template name="more">
<xsl:with-param name="text" select="translate($text, '&#x0D;', '')"/>
...
</xsl:call-template>
That would leave the template with a clean string to process, containing no CR characters.

Handle DTD with Eclipse

I have built this DTD :
<!ELEMENT universes (universe+)>
<!ELEMENT universe (index,name,conf)>
<!ELEMENT index (#PCDATA)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT conf (speed,resources-cdr,moons,bots)>
<!ELEMENT speed (game,fleet,resources)>
<!ELEMENT game (#PCDATA)>
<!ELEMENT fleet (#PCDATA)>
<!ELEMENT resources (#PCDATA)>
<!ELEMENT resources-cdr (ships,defs) >
<!ELEMENT ships (#PCDATA)>
<!ELEMENT defs (#PCDATA)>
<!ELEMENT moons (#PCDATA)>
<!ELEMENT bots (#PCDATA)>
and I use it inside a xml file like this :
<!DOCTYPE universes SYSTEM "universes.dtd" >
Now under Eclipse (indigo) when I use CTRL+SPACE to see elements list, I see simple elements only (those #PCDATA) not others. See below :
In this case, I see index and name proposals but not conf proposal.
If I enter conf tag manually, not with wizard, I have similar problem with nested tags :
How can I modify this Eclipse behaviour please ?
Thank you
OK, the problem is solved.
In my case, I had created the XML file before linking it to its DTD.
If I create a new XML file with Eclipse and choose to create the XML file from a DTD, it works.

How to display umlauts in chm Table of Contents?

I'm creating the German version of a CHM help file. My problem is that umlauts are not displayed in the Table of Contents. I assume this is because of the code page. The hhc file is ANSI. Converting it to Unicode doesn't help: it displays different, but still wrong, characters.
The file "Table of Contents.hhc" starts with
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<HTML>
<HEAD>
<meta name="GENERATOR" content="Microsoft® HTML Help Workshop 4.1">
<!-- Sitemap 1.0 -->
</HEAD><BODY>
<OBJECT type="text/site properties">
<param name="ImageType" value="Folder">
</OBJECT>
<UL>
<LI> <OBJECT type="text/sitemap">
<param name="Name" value="ÜÜÜÜÜÜÜÜÜÜÜÜÜÜ Uberblick">
<param name="Local" value="overview.htm">
<param name="URL" value="overview.htm">
</OBJECT>
</UL>
</BODY></HTML>
Make sure the "Language" setting in the "Options" section of the project file supports the character you want. Since you are on a Russian system, the default is probably Russian. Change it to German, for instance.
The engine rendering the CHM is Unicode; only the compiler is ANSI.
Try escaping them? http://www.w3schools.com/tags/ref_entities.asp
Or set the charset encoding: http://www.w3.org/TR/html4/charset.html#h-5.2.2
Actually, you don't need UTF-8 for CHM files, because CHM doesn't support UTF-8 or Unicode. CHM is an ancient format that Microsoft has not really changed since Windows 98, and it has a number of quirks and restrictions like this.
Read for more detail...
https://helpman.it-authoring.com/viewtopic.php?t=9294
https://blogs.msdn.microsoft.com/sandcastle/2007/09/29/chm-localization-and-unicode-issues-dbcsfix-exe/