Write custom LaTeXML rule for LaTeX package ledmac

I want to convert a LaTeX file to XML using LaTeXML (version 0.8.2).
As LaTeXML does not (yet) have conversion rules for the package ledmac, I tried to
write them myself, but so far I have not succeeded.
A typical use case of ledmac would be something like
\edtext{some text}{\lemma{text}\Afootnote{some explanation}}
I would like to get as a result
<edtext>
some text
</edtext>
<lemma>
text
</lemma>
<Afootnote>
some explanation
</Afootnote>
The file test1.tex:
\documentclass{article}
\usepackage{ledmac}
\begin{document}
\section{First Section}
Et et possimus non delectus et id dolor fugiat. Nemo aperiam in vitae atque.
Qui \edtext{pariatur ut fuga}{\lemma{fuga} \Afootnote{quaerat ratione dolorem ex dicta}}.
Sapiente hic est nobis eum. Rerum odit consectetur possimus et quaerat aut veritatis quae.
\end{document}
My attempt so far has been to write a file ledmac.sty.ltxml in the same directory as test1.tex.
It has the following content:
package LaTeXML::Package::Pool;
use strict;
use LaTeXML::Package;
use warnings;
#RegisterNamespace('scr'=>"http://bla.com/scr");
DefConstructor('\edtext{}', "<edtext>#1</edtext>");
DefConstructor('\lemma{}', "<lemma>#1</lemma>");
DefConstructor('\Afootnote{}',"<Afootnote>#1</Afootnote>");
Tag('edtext', autoClose=>1);
Tag('lemma', autoClose=>1);
Tag('Afootnote', autoClose=>1);
1;
However, this results in six errors of the form
Error:malformed:edtext <edtext> isn't allowed here
at /media/sf_Dokumente/Leibniz/LeibnizFiles/LateXML-Leibniz/test1.tex; line 6 col 3
Currently in XMLDocument[54493616]<ltx:document _box="LaTeXML::Core::Whatsit=HASH(0x..." _font="Font[serif,medium,upright,10,b..." xmlns="http://dlmf.nist.gov/LaTeXML">...<ltx:section _box="LaTeXML::Core::Whatsit=HASH(0x..." _font="Font[serif,medium,upright,10,b..." refnum="1" xml:id="S1" _ID_counter_p_="1">...<ltx:para _box="LaTeXML::Core::Box=ARRAY(0x344f300)" _font="Font[serif,medium,upright,10,b..." xml:id="S1.p1">...<ltx:p _box="LaTeXML::Core::Box=ARRAY(0x344f300)" _font="Font[serif,medium,upright,10,b...">...
In Core::Document[#0x34474b8] at /media/sf_Dokumente/Leibniz/LeibnizFiles/LateXML-Leibniz/test1.tex; line 6 col 3
<= Core::Definition::Constructor[\edtext... <= Whatsit[\edtext,Core::List[Box[p],Box... <= Core::Document[#0x34474b8] <= ...
Error:malformed:edtext Attempt to close </edtext>, which isn't open
at /media/sf_Dokumente/Leibniz/LeibnizFiles/LateXML-Leibniz/test1.tex; line 6 col 3
Currently in XMLDocument[54493616]<ltx:document _box="LaTeXML::Core::Whatsit=HASH(0x..." _font="Font[serif,medium,upright,10,b..." xmlns="http://dlmf.nist.gov/LaTeXML">...<ltx:section _box="LaTeXML::Core::Whatsit=HASH(0x..." _font="Font[serif,medium,upright,10,b..." refnum="1" xml:id="S1" _ID_counter_p_="1">...<ltx:para _box="LaTeXML::Core::Box=ARRAY(0x344f300)" _font="Font[serif,medium,upright,10,b..." xml:id="S1.p1">...<ltx:p _box="LaTeXML::Core::Box=ARRAY(0x344f300)" _font="Font[serif,medium,upright,10,b...">...
In Core::Document[#0x34474b8] at /media/sf_Dokumente/Leibniz/LeibnizFiles/LateXML-Leibniz/test1.tex; line 6 col 3
<= Core::Definition::Constructor[\edtext... <= Whatsit[\edtext,Core::List[Box[p],Box... <= Core::Document[#0x34474b8] <= ...
These errors occur for each of the commands I try to convert.
If someone with LaTeXML expertise could give me a hint on how to proceed from here, I would be really grateful!
EDIT: After posting my question to the LaTeXML mailing list, I was told that a RelaxNG schema has to be provided to LaTeXML. The missing schema seems to be the reason for the error messages.

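After I consulted the LaTeXML mailing list, the developer gave me the hint that I need to specify a RelaxNG schema, which defines the permitted structure of the resulting output file. Without a schema that declares edtext, lemma and Afootnote, LaTeXML rejects these elements with the "isn't allowed here" errors shown in the question.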

How do I put the following LaTeX code in org-mode?

I want to customize the title of the article class so that it is aligned left rather than centered, which is the default. I found a solution on TeX StackExchange.
The code is:
\makeatletter
\renewcommand{\maketitle}{\bgroup\setlength{\parindent}{0pt}
\begin{flushleft}
\textbf{\@title}
\@author
\end{flushleft}\egroup
}
\makeatother
\title{Neque porro quisquam est qui dolorem ipsum quia dolor sit amet, consectetur, adipisci velit}
\date{}
\author{%
Author One$^{1}$, Author2$^{2}$\\
$^{1}$Institution 1, I 1 City, I 1 Country\\
$^{2}$Institution 1, I 1 City, I 1 Country\\
\underline{$^{1}$mail@domain.com}\\
\underline{$^{2}$mail2@domain.com }
}
I want to incorporate this into my org file which will be exported to pdf using LaTeX. How can I do that?
The easiest way is to prepend each of the lines from \makeatletter to \makeatother, inclusive, with #+latex_header: and include these in your org file.
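If the preamble snippet is long, the prefixing can be scripted. The following is only a small helper sketch in Python, not part of the original answer; the function name `to_org_headers` is made up for illustration, and it assumes the snippet is available as a plain string.

```python
# Hypothetical helper: prefix each line of a LaTeX snippet with "#+latex_header: "
# so the result can be pasted at the top of an org file.
PREFIX = "#+latex_header: "


def to_org_headers(latex_snippet: str) -> str:
    """Return the snippet with every line turned into an org LaTeX header line."""
    return "\n".join(PREFIX + line for line in latex_snippet.splitlines())


snippet = r"""\makeatletter
\renewcommand{\maketitle}{\bgroup\setlength{\parindent}{0pt}
\begin{flushleft}
\textbf{\@title}
\@author
\end{flushleft}\egroup
}
\makeatother"""

print(to_org_headers(snippet))
```

The remaining \title, \date and \author lines can be treated the same way, or set through the usual org keywords instead.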

powershell breaks at 80 characters

I have a file, lorem.txt, with the following
Lorem ipsum dolor sit amet, latine apeirian molestiae eieum. Reque clita laudem at quo, dicat petentium disputationi eum an. His cu latine adolescens, agam diceret senserit sea at, utinam utamur salutatus eos cu. Solum fuisset suscipit eu ius. Qui tollit ubique conceptam cu, ne his nullam postulant.
https://github.com/joewalnes/websocketd
UPDATED
When I run the following using websocketd, it breaks the string at 80 characters:
websocketd.exe powershell "Get-Content lorem.txt"
The websocket frames are chopped into 80-character chunks. Is there any way I can force the powershell command not to use the default 80-character width?
Thanks
> sends the standard output to the file. This output is probably limited by the size of the console(powershell) window. To store output from powershell, you should use a powershell cmdlet like Out-File. Ex:
powershell "Get-Content lorem.txt | Out-File out.txt"
Out-File also has a -Width parameter where you can specify the character limit for each line.

Dymo LabelManager PnP Postscript file paperSize

Here I am again with a PostScript question.
This time I'm trying to print this simple file:
%!PS-Adobe-3.0
%%DocumentData: Clean7Bit
%%Requirements: color
%%LanguageLevel: 2
% 72 PostScript points = 1 inch
% 1 inch = 2.54 cm
%%PageBoundingBox: 8 8 300 36
<< /PageSize [300 36] >> setpagedevice
<< /Orientation 3>> setpagedevice
/ArialBold findfont 10 scalefont setfont
30 18 moveto (Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet Lorem ipsum dolor sit amet) show
showpage
As you can see, it's only a single line. The label I want to print on is very small too: it's 12 mm wide and as long as you need (normally).
The text prints correctly onto the label, but only up to the 2nd "amet". After that the printer stops and reports that it is done, and it always prints exactly the same length (9 cm), no matter what page size I set.
Does anyone have any idea what I'm doing wrong?
The command I use to print is:
C:\gs\gswin32c.exe -sDEVICE=mswinpr2 -dNoCancel -dNOPAUSE -dSAFER -sOutputFile="%%printer%%DYMO LabelManager PnP" -q "C:\gs\dymo.ps" -c quit
I would always start with the display device. If you remove -sDEVICE=mswinpr2, Ghostscript will open a window and show you what it thinks the result will be; if that is incorrect, then mswinpr2 isn't going to do any better. Try that and see what happens. It will also show you whether you are falling off the media, as the GS window is only white where the media is set.

Simple String template replacement in Scala and Clojure

Below are functions written in Scala and Clojure to do simple replacement of templates in Strings. The input to each function is a String containing templates of the form {key} and a map from Symbol/Keyword to replacement value.
For example:
Scala:
replaceTemplates("This is a {test}", Map('test -> "game"))
Clojure:
(replace-templates "This is a {test}" {:test "game"})
will return "This is a game".
The input map uses Symbols/Keywords so that I don't have to deal with corner cases where the templates in the Strings contain braces.
Unfortunately, the algorithm is not very efficient.
Here is the Scala code:
import scala.annotation.tailrec
def replaceTemplates(text: String,
                     templates: Map[Symbol, String]): String = {
  val builder = new StringBuilder(text)
  // Replace occurrences of key one at a time, searching from the end.
  @tailrec
  def loop(key: String,
           keyLength: Int,
           value: String): StringBuilder = {
    val index = builder.lastIndexOf(key)
    if (index < 0) builder
    else {
      builder.replace(index, index + keyLength, value)
      loop(key, keyLength, value)
    }
  }
  templates.foreach {
    case (key, value) =>
      val template = "{" + key.name + "}"
      loop(template, template.length, value)
  }
  builder.toString
}
and here is the Clojure code:
(defn replace-templates
  "Return a String with each occurrence of a substring of the form {key}
   replaced with the corresponding value from a map parameter.
   @param text the String in which to do the replacements
   @param m a map of keyword->value"
  [text m]
  (let [sb (StringBuilder. text)]
    (letfn [(replace-all [key key-length value]
              (let [index (.lastIndexOf sb key)]
                (if (< index 0)
                  sb
                  (do
                    (.replace sb index (+ index key-length) value)
                    (recur key key-length value)))))]
      (doseq [[key value] m]
        (let [template (str "{" (name key) "}")]
          (replace-all template (count template) value))))
    (.toString sb)))
Here is a test case (Scala code):
replaceTemplates("""
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque
elit nisi, egestas et tincidunt eget, {foo} mattis non erat. Aenean ut
elit in odio vehicula facilisis. Vestibulum quis elit vel nulla
interdum facilisis ut eu sapien. Nullam cursus fermentum
sollicitudin. Donec non congue augue. {bar} Vestibulum et magna quis
arcu ultricies consectetur auctor vitae urna. Fusce hendrerit
facilisis volutpat. Ut lectus augue, mattis {baz} venenatis {foo}
lobortis sed, varius eu massa. Ut sit amet nunc quis velit hendrerit
bibendum in eget nibh. Cras blandit nibh in odio suscipit eget aliquet
tortor placerat. In tempor ullamcorper mi. Quisque egestas, metus eu
venenatis pulvinar, sem urna blandit mi, in lobortis augue sem ut
dolor. Sed in {bar} neque sapien, vitae lacinia arcu. Phasellus mollis
blandit commodo.
""", Map('foo -> "HELLO", 'bar -> "GOODBYE", 'baz -> "FORTY-TWO"))
and the output:
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque
elit nisi, egestas et tincidunt eget, HELLO mattis non erat. Aenean ut
elit in odio vehicula facilisis. Vestibulum quis elit vel nulla
interdum facilisis ut eu sapien. Nullam cursus fermentum
sollicitudin. Donec non congue augue. GOODBYE Vestibulum et magna quis
arcu ultricies consectetur auctor vitae urna. Fusce hendrerit
facilisis volutpat. Ut lectus augue, mattis FORTY-TWO venenatis HELLO
lobortis sed, varius eu massa. Ut sit amet nunc quis velit hendrerit
bibendum in eget nibh. Cras blandit nibh in odio suscipit eget aliquet
tortor placerat. In tempor ullamcorper mi. Quisque egestas, metus eu
venenatis pulvinar, sem urna blandit mi, in lobortis augue sem ut
dolor. Sed in GOODBYE neque sapien, vitae lacinia arcu. Phasellus mollis
blandit commodo.
The algorithm traverses the input map and, for each pair, does a replacement in the input String, temporarily held in a StringBuilder. For each key/value pair, we search for the last occurrence of the key (enclosed in braces) and replace it with the value, until there are no more occurrences.
Does it make any performance difference if we use .lastIndexOf versus .indexOf in the StringBuilder?
How can the algorithm be improved? Is there a more idiomatic way to write the Scala and/or Clojure code?
UPDATE: See my follow-up.
UPDATE 2: Here is a better Scala implementation; O(n) in the length of the String. Note that I modified the Map to be [String, String] instead of [Symbol, String] on the recommendation of several people. (thanks mikera, kotarak):
import scala.annotation.tailrec
/**
 * Replace templates of the form {key} in the input String with values from the Map.
 *
 * @param text the String in which to do the replacements
 * @param templates a Map from key to value
 * @return the String with all occurrences of the templates replaced by their values
 */
def replaceTemplates(text: String,
                     templates: Map[String, String]): String = {
  val builder = new StringBuilder
  @tailrec
  def loop(text: String): String = {
    if (text.length == 0) builder.toString
    else if (text.startsWith("{")) {
      val brace = text.indexOf("}")
      if (brace < 0) builder.append(text).toString
      else {
        val replacement = templates.get(text.substring(1, brace)).orNull
        if (replacement != null) {
          builder.append(replacement)
          loop(text.substring(brace + 1))
        } else {
          // Not a known template: emit the brace and keep scanning after it.
          builder.append("{")
          loop(text.substring(1))
        }
      }
    } else {
      val brace = text.indexOf("{")
      if (brace < 0) builder.append(text).toString
      else {
        builder.append(text.substring(0, brace))
        loop(text.substring(brace))
      }
    }
  }
  loop(text)
}
UPDATE 3: Here is a set of Clojure test cases (Scala versions are left as an exercise :-)):
(use 'clojure.test)
(deftest test-replace-templates
  (is (= ; No templates
        (replace-templates "this is a test" {:foo "FOO"})
        "this is a test"))
  (is (= ; One simple template
        (replace-templates "this is a {foo} test" {:foo "FOO"})
        "this is a FOO test"))
  (is (= ; Two templates, second at end of input string
        (replace-templates "this is a {foo} test {bar}" {:foo "FOO" :bar "BAR"})
        "this is a FOO test BAR"))
  (is (= ; Two templates
        (replace-templates "this is a {foo} test {bar} 42" {:foo "FOO" :bar "BAR"})
        "this is a FOO test BAR 42"))
  (is (= ; Second brace-enclosed item is NOT a template
        (replace-templates "this is a {foo} test {baz} 42" {:foo "FOO" :bar "BAR"})
        "this is a FOO test {baz} 42"))
  (is (= ; Second item is not a template (no closing brace)
        (replace-templates "this is a {foo} test {bar" {:foo "FOO" :bar "BAR"})
        "this is a FOO test {bar"))
  (is (= ; First item is enclosed in a non-template brace-pair
        (replace-templates "this is {a {foo} test} {bar" {:foo "FOO" :bar "BAR"})
        "this is {a FOO test} {bar")))
(run-tests)
I think the best algorithm you can build is O(n) in the length of the input string and would go something like:
1. Initialise an empty StringBuilder.
2. Scan the string to find the first "{" and add any substring prior to this to your StringBuilder. If no "{" is found, you're finished!
3. Scan until the next "}". Use whatever is in between the curly braces to do a map lookup in a String->String hashmap and add the result to your StringBuilder.
4. Go back to 2. and continue scanning from after the "}".
Converting to Scala/Clojure left as an exercise :-)
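For readers who want to see the scan written out before porting it, here is a minimal sketch in Python (used only as a neutral notation, not taken from the answer). The name `replace_templates` mirrors the question. Leaving unknown keys and unclosed braces untouched is an assumption borrowed from the question's test cases, since the steps above do not say what to do with them.

```python
from typing import Mapping


def replace_templates(text: str, templates: Mapping[str, str]) -> str:
    """Single left-to-right scan that only ever appends to the output."""
    parts = []  # plays the role of the StringBuilder
    pos = 0
    while True:
        start = text.find("{", pos)
        if start < 0:
            parts.append(text[pos:])      # no more "{": copy the tail, done
            break
        parts.append(text[pos:start])     # copy everything before the "{"
        end = text.find("}", start + 1)
        if end < 0:
            parts.append(text[start:])    # unclosed brace: copy the rest verbatim
            break
        key = text[start + 1:end]
        if key in templates:
            parts.append(templates[key])  # known template: emit its value
            pos = end + 1
        else:
            parts.append("{")             # assumption: unknown keys are left as-is
            pos = start + 1
    return "".join(parts)


print(replace_templates("this is a {foo} test {bar}", {"foo": "FOO", "bar": "BAR"}))
```

For input where every brace pair is a known template the scan touches each character once. Long runs of unmatched "{" make it rescan for the same "}", so the linear bound holds only for well-formed input in this sketch.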
I wrote a string interpolation library for Clojure that was brought into clojure-contrib as clojure.contrib.strint. I blogged about it; you'll find a description of the approach there. The most recent source for it can be viewed here on github. The big difference between clojure.contrib.strint and the approaches here is that the latter all perform the interpolation at runtime. In my experience, runtime interpolation is largely unnecessary, and using something like clojure.contrib.strint that performs the interpolation at compile-time often yields tangible performance benefits for your application.
Note that clojure.contrib.strint will hopefully be migrating to clojure.core.strint under Clojure's "new-contrib" organization.
Here's a version of the Clojure implementation using regex to do the replacements. It's faster than your version (running your Lorem ipsum test case 100 times, see further down), and there's less code to maintain:
(defn replace-templates2 [text m]
  (clojure.string/replace text
                          #"\{\w+\}"
                          (fn [groups]
                            ((keyword (subs groups
                                            1
                                            (dec (.length groups)))) m))))
The implementation is quick and dirty, but it works. The point is I think you should solve this using regular expressions.
Update:
Experimented a bit with a funky way to do the substringing, and got a surprising performance result. Here's the code:
(defn replace-templates3 [text m]
  (clojure.string/replace text
                          #"\{\w+\}"
                          (fn [groups]
                            ((->> groups
                                  reverse
                                  (drop 1)
                                  reverse
                                  (drop 1)
                                  (apply str)
                                  keyword) m))))
And here are the results on my machine for your version, my first version, and finally this version (100 iterations):
"Elapsed time: 77.475072 msecs"
"Elapsed time: 50.238911 msecs"
"Elapsed time: 38.109875 msecs"
Some people, when faced with a problem, think "I'll use regex!". Now they have two problems. Others, however, decide not to use regex -- and now they have three problems: implementing and maintaining an ad hoc implementation of half regex, plus the other two.
At any rate, consider this:
import scala.util.matching.Regex
def replaceTemplates(text: String,
templates: Map[String, String]): String =
"""\{([^{}]*)\}""".r replaceSomeIn ( text, { case Regex.Groups(name) => templates get name } )
It uses string builder to search and replace. The map is using String instead of Symbol because it is faster that way, and the code doesn't replace matches that do not have a valid mapping. Using replaceAllIn would avoid that, but would require some type annotation because that method is overloaded.
You might want to browse Scala's source code from the scaladoc API for Regex, and see what's going on.
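To make the behaviour of the one-liner easier to follow, here is the same idea as a Python sketch (Python is used only for illustration and is not part of the original answer): a single regex pass with a callback that returns the mapped value, or the original match when the key has no mapping.

```python
import re

# Same pattern as in the Scala version: a brace pair with no braces inside it.
TEMPLATE = re.compile(r"\{([^{}]*)\}")


def replace_templates(text, templates):
    """Replace {key} with templates[key]; leave unknown keys untouched."""
    return TEMPLATE.sub(lambda m: templates.get(m.group(1), m.group(0)), text)


print(replace_templates("Hello {name}, this is {unknown}", {"name": "Aldo"}))
```

Because the character class excludes braces, a nested case such as "{a {foo} test}" only matches the inner "{foo}".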
Torbjørn's answer is very nice and readable. It might be nice to use butlast to get rid of the double reverse, as well as string/join instead of apply'ing str. In addition, use the map as a function.
So the clojure code could be further shortened to:
(defn replace-template [text m]
(clojure.string/replace text #"\{\w+\}"
(comp m keyword clojure.string/join butlast rest)))
I don't know Clojure, so I can only speak for Scala:
The foreach loop is slow because you iterate through the whole String in each loop cycle. This can be improved by first searching for the templates and then replacing them. Furthermore, the data should always be appended to the StringBuilder, because each time something is replaced inside the StringBuilder, the new contents and the tail of the StringBuilder are copied internally to a new Array of Chars.
def replaceTemplates(s: String, templates: Map[String, String]): String = {
  type DataList = List[(Int, String, Int)]
  // Collect (begin, template, end) for every brace pair, scanning backwards.
  def matchedData(from: Int, l: DataList): DataList = {
    val end = s.lastIndexOf("}", from)
    if (end == -1) l
    else {
      val begin = s.lastIndexOf("{", end)
      if (begin == -1) l
      else {
        val template = s.substring(begin, end+1)
        matchedData(begin-1, (begin, template, end+1) :: l)
      }
    }
  }
  val sb = new StringBuilder(s.length)
  var prev = 0
  for ((begin, template, end) <- matchedData(s.length, Nil)) {
    sb.append(s.substring(prev, begin))
    val ident = template.substring(1, template.length-1)
    sb.append(templates.getOrElse(ident, template))
    prev = end
  }
  sb.append(s.substring(prev, s.length))
  sb.toString
}
Or with RegEx (shorter but slower):
def replaceTemplates(s: String, templates: Map[String, String]): String = {
  val sb = new StringBuilder(s.length)
  var prev = 0
  for (m <- """\{.+?\}""".r findAllIn s matchData) {
    sb.append(s.substring(prev, m.start))
    val ms = m.matched
    val ident = ms.substring(1, ms.length-1)
    sb.append(templates.getOrElse(ident, ms))
    prev = m.end
  }
  sb.append(s.substring(prev, s.length))
  sb.toString
}
Regex + replaceAllIn + Fold:
val template = "Hello #{name}!"
val replacements = Map( "name" -> "Aldo" )
replacements.foldLeft(template)((s:String, x:(String,String)) => ( "#\\{" + x._1 + "\\}" ).r.replaceAllIn( s, x._2 ))

Does Perl6 support something equivalent to Perl5's __DATA__ and __END__ sections?

Does perl6/Rakudo have something equivalent to perl5's __DATA__ or __END__ sections?
Quote S26:
Named Perldoc blocks whose typename is
DATA are the Perl 6 equivalent of the
Perl 5 __DATA__ section. The
difference is that =DATA blocks are
just regular Pod blocks and may appear
anywhere within a source file, and as
many times as required. Synopsis 2
describes the new Perl 6 interface for
inline data.
In theory you should be able to do something like this (somebody please fix the syntax if it’s off):
use v6;
=begin DATA
Foo
=end DATA
say @=DATA;
In practice it seems that Rakudo does not support that, yet.
To carefully selectively quote the current S02 design document:
There is no longer any special DATA stream--any Pod block in the
current file can be accessed via a Pod object ...
You have to split [Pod block] contents into lines yourself.
[Speculative] It may also be possible to treat a Pod object as an
IO::Handle, to read the Pod information line-by-line (like the DATA
filehandle in Perl 5, but for any Pod block).
So, instead of the single DATA section per file which you access by reading a filehandle, you define any number of Pod blocks in your script file; they're stored in the $=pod variable at compile time; you read from that variable; and the ones called 'data' are the equivalents of Perl 5's DATA.
This works today. I'll show that in a moment. But first I need to talk about stuff that does not work today.
The quoting above was highly selective. The elided text talked about P6 automatically creating a variable with a name of the form $=foo corresponding to Pod blocks with the name 'foo'. This is a general still unimplemented feature of Pod blocks, not just data blocks.
The "data block" section of the Pod design doc S26 talks about data blocks doing some fancier stuff than plain old Pod blocks. This has not yet been implemented either.
So, now let's move on to what can be done today:
=foo This is a Pod block. A single line one. This Pod block's name is 'foo'.
=begin qux
This is another syntax for defining a Pod block.
It allows for multi line content.
This block's name is 'qux'.
=end qux
=data A data block -- a Pod block with the name 'data'.
# Data blocks are P6's version of P5's __DATA__.
# But you can have multiple data blocks:
=begin data
Another data block.
This time a multi line one.
=end data
$=pod.grep(*.name eq 'data').map(*.contents[0].contents.say);
This prints:
A data block -- a Pod block with the name 'data'.
Another data block. This time a multi line one.
So, it sorta works. But it clearly needs a lot more sugar.
By the way, if the last FP style line didn't make sense, here's an imperative equivalent:
for @$=pod {
    if .name eq 'data' {
        say .contents[0].contents
    }
};
As a work-around until this gets fully implemented, you can use heredocs.
for data().lines -> $line {
put $line;
}
sub data {
return q:to/END/;
Foo, bar, baz
1, 2, 3
END
}
Outputs
Foo, bar, baz
1, 2, 3
To get an array of data, while putting the data at the bottom of the program to help with readability, here is a variation of @Christopher Bottoms' answer:
my @txts = data();
dd @txts;
# this works too
my %stuff = hashdata();
dd %stuff;
# a lot of lines
sub data() {
return ( q:to/LINE1/,
Phasellus dictum, nunc id vestibulum rhoncus, mauris massa tempus nibh,
nec tincidunt nisi tellus et arcu. Phasellus vulputate consectetur
vulputate. Quisque viverra commodo velit ac tincidunt. Nulla et est sem.
Mauris gravida, nulla rutrum pharetra dapibus, eros velit feugiat nibh,
nec iaculis purus urna ut diam. Praesent molestie felis a turpis gravida
placerat. Duis sagittis pulvinar risus non aliquet. Nunc quis purus
tempor, mattis nunc eu, porta ligula. Suspendisse dictum sit amet urna
dapibus suscipit.
LINE1
q:to/LINE2/,
Praesent molestie felis a turpis gravida
placerat. Duis sagittis pulvinar risus non aliquet. Nunc quis purus
tempor, mattis nunc eu, porta ligula. Suspendisse dictum sit amet urna
dapibus suscipit.
LINE2
q:to/LINE3/);
Quisque viverra commodo velit ac tincidunt. Nulla et est sem.
Mauris gravida, nulla rutrum pharetra dapibus, eros velit feugiat nibh,
nec iaculis purus urna ut diam. Praesent molestie felis a turpis gravida
placerat.
LINE3
}
sub hashdata() { # a hash works too.
return ( 'p' => q:to/PDATA/,
Some multiline data
in some lines
PDATA
'q' => q:to/QDATA/,
More data in
multiple lines
QDATA
'r' => q:to/RDATA/
Note that indentation depends on the position of the
ending token.
Also, the punctuation following the regex is the punctuation
following the expression. So a comma after each of the
p and q, but not needed after the r
RDATA
)
}