What does the A in ~A in format stand for? - lisp

With format you can use, among other things, ~S and ~A.
While the S in ~S is for S-expression, what does the A in ~A stand for? Apparently it outputs without escaping, but I was wondering what the letter actually means…?

It stands for Aesthetic. A-formatted output is not escaped. See http://www.lispworks.com/documentation/HyperSpec/Body/22_cda.htm
Incidentally, S stands for Standard rather than S-expression.
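A minimal sketch of the difference, assuming a conforming implementation with default printer settings. The helper names show-aesthetic and show-standard are made up for illustration:

```lisp
;; ~A prints like PRINC (no escape characters);
;; ~S prints like PRIN1 (output that READ can read back).
(defun show-aesthetic (object)
  (format nil "~A" object))

(defun show-standard (object)
  (format nil "~S" object))

(show-aesthetic "hi")   ; => "hi"      (just the characters h and i)
(show-standard "hi")    ; => "\"hi\""  (double quotes included)
(show-aesthetic #\a)    ; => "a"
(show-standard #\a)     ; => "#\\a"
```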

Related

What is the syntax, if any, of unicode characters in Common Lisp?

Is there any syntax for unicode characters in Common Lisp? Like \u03B1 in Java?
Maybe something like #\U+03B1, or something similar?
The uber-portable way is #.(code-char X), which will produce the Unicode char with the given numeric code X (provided that the implementation actually uses Unicode, which the ANSI standard does not require; in practice, all implementations that go beyond ASCII, which is not mandated either, do use Unicode).
If you know the Unicode name of the character, you can also use the #\ syntax:
(char= (code-char 12345) #\HANGZHOU_NUMERAL_TWENTY)
T
Implementations often define additional Unicode character syntax, e.g.:
#\Code<decimal> in CLISP.
#\U+0<hex> in SBCL.
See:
code-char
*read-eval*
Sharpsign Dot
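As a small sketch of the portable approach, assuming an implementation whose character codes are Unicode code points (the variable name *alpha* is only illustrative):

```lisp
;; Assumption: character codes are Unicode code points
;; (the ANSI standard does not guarantee this).
(defparameter *alpha* (code-char #x03B1)) ; GREEK SMALL LETTER ALPHA

(char-code *alpha*)                  ; => 945

;; #. evaluates the form at read time, so this behaves like a character
;; literal as long as *READ-EVAL* is true (its default value).
(char= *alpha* #.(code-char #x03B1)) ; => T
```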

What's the semantic difference between the backtick and quote symbols in Common Lisp?

I understand that both suppress evaluation of a symbol or expression. But the backtick is used for macro definitions while the apostrophe is used for symbols (among other things). What is the difference, semantically speaking, between these two notations?
Backticks allow for ,foo and ,@foo to interpolate dynamic parts into the quoted expression.
' straight up quotes everything literally.
If there are no comma parts in the expression, ` and ' can be used interchangeably.
A standard quote produces a true constant literal, so similar lists, and lists that end with the same structure, can share structure:
'(a b c d) ; ==> (a b c d)
A backquoted structure might not be a literal. It has to be evaluated, since every unquoted part needs to be evaluated and inserted into place. This means that something like `(a ,@b ,c d) actually gets expanded to something similar to (cons 'a (append b (cons c '(d)))).
The standard is very flexible about how implementations solve this, so if you try to macroexpand the expression you get many different solutions, sometimes involving internal functions. The result, though, is well explained in the standard.
NB: Even though two separate evaluations produce different values, the implementation is still free to share structure. Thus, in my example, '(d) has the potential to be shared, and if one used mutating concatenation on the result, one might end up with an infinite structure.
A parallel to this is that some Algol-family languages have two types of strings: one that interpolates variables and one that doesn't. E.g. in PHP:
"Hello $var"; // ==> 'Hello Shoblade'
'Hello $var'; // ==> 'Hello $var'
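The backquote expansion described above can be checked with a short sketch. The function names build-with-backquote and build-by-hand are hypothetical, and the hand-written version is only one possible expansion, not what any particular implementation produces:

```lisp
;; Backquote with an unquote (,c) and a splicing unquote (,@b).
(defun build-with-backquote (b c)
  `(a ,@b ,c d))

;; One possible hand-written equivalent of the expansion.
(defun build-by-hand (b c)
  (cons 'a (append b (cons c '(d)))))

(build-with-backquote '(1 2) 3)          ; => (A 1 2 3 D)
(equal (build-with-backquote '(1 2) 3)
       (build-by-hand '(1 2) 3))         ; => T

;; With no comma parts, backquote and quote give equal results.
(equal `(a b c) '(a b c))                ; => T
```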

terpri, princ & co. vs format

Chapter 9.10 of Common Lisp: A Gentle Introduction To Symbolic Computation claims:
The primitive i/o functions TERPRI, PRIN1, PRINC and PRINT were defined in Lisp 1.5 (the ancestor of all modern Lisp systems) and are still found in Common Lisp today. They are included in the Advanced Topics section as a historical note; you can get the same effect with FORMAT.
This implies that you do not need princ & co. any more and that, in modern code, you should rely only on format instead.
Are there any disadvantages to doing this? Or, are there things one cannot achieve with format that work with the other functions?
These functions correspond exactly to the following FORMAT operators:
TERPRI = ~%
FRESH-LINE = ~&
PRIN1 = ~S
PRINC = ~A
PRINT = ~%~S<space>
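A rough way to check some of these correspondences, assuming default printer settings. The helper printer-output is made up for this sketch, and FRESH-LINE is left out because its output depends on the state of the stream:

```lisp
(defun printer-output (printer object)
  "Return what PRINTER writes to *STANDARD-OUTPUT* for OBJECT, as a string."
  (with-output-to-string (*standard-output*)
    (funcall printer object)))

(string= (printer-output #'prin1 "a b") (format nil "~S" "a b"))    ; => T
(string= (printer-output #'princ "a b") (format nil "~A" "a b"))    ; => T
;; PRINT writes a newline, then the object as PRIN1 would, then a space.
(string= (printer-output #'print "a b") (format nil "~%~S " "a b")) ; => T
```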
You can also use the more modern write. I'm not a huge fan of format because of its terse sub language, which usually is interpreted. Note that a good implementation might be able to compile format directives to more efficient code. I use FORMAT mostly when it makes complex code shorter, but not to output plain objects or things like single carriage returns...
Common Lisp includes three or more generations of text I/O APIs:
the old s-expression printing routines
the specialized and generalized stream IO functions
the complex formatter, based on earlier Fortran and/or Multics IO formatters
the Generic Function to print objects
the pretty printer
Additionally there are semi-standard CLOS-based IO implementations like Gray Streams.
Each might have its purpose and none is going away soon...
CL-USER 54 > (let ((label "Social security number")
                   (colon ": ")
                   (social-security-number '|7537 DD 459234957324 DE|))
               (terpri)
               (princ label)
               (princ colon)
               (princ social-security-number)
               (write-char #\newline)
               (write-string label)
               (write-string colon)
               (write social-security-number :escape nil)
               (format t "~%~A~A~A" label colon social-security-number))
Social security number: 7537 DD 459234957324 DE
Social security number: 7537 DD 459234957324 DE
Social security number: 7537 DD 459234957324 DE

Using Emoji literals in Clojure source

On Linux with UTF-8 enabled console:
Clojure 1.6.0
user=> (def c \の)
#'user/c
user=> (str c)
"の"
user=> (def c \🍒)
RuntimeException Unsupported character: \🍒 clojure.lang.Util.runtimeException (Util.java:221)
RuntimeException Unmatched delimiter: ) clojure.lang.Util.runtimeException (Util.java:221)
I was hoping to have an emoji-rich Clojure application with little effort, but it appears I will be looking up and typing in emoji codes? Or am I missing something obvious here? 😞
Java represents Unicode characters in UTF-16. The emoji characters are "supplementary characters" and have a codepoint that cannot be represented in 16 bits.
http://www.oracle.com/technetwork/articles/javase/supplementary-142654.html
In essence, supplementary characters are represented not as chars but as ints and there are special apis for dealing with them.
One way is with (Character/toChars 128516) - this returns a char array that you can convert to a string to print: (apply str (Character/toChars 128516)). Or you can create a String from an array of codepoint ints directly with (String. (int-array [128516]) 0 1). Depending on all the various things between Java/Clojure and your eyeballs, that may or may not do what you want.
The format api supports supplementary characters so that may be easiest, however it takes an int so you'll need a cast: (format "Smile! %c" (int 128516)).
Thanks to Clojure’s extensible reader tags, you can create Unicode literals quite easily yourself.
We already know that not all of Unicode can be represented as char literals; that the preferred representation of Unicode characters on the JVM is int; and that a string literal can hold any Unicode character in a way that’s also convenient for humans to read.
So, a tagged literal #u "🍒" that reads as an int would make an excellent Unicode character literal!
Set up a reader function for the new tagged literal in *data-readers*:
(defn read-codepoint
[^String s]
{:pre [(= 1 (.codePointCount s 0 (.length s)))]}
(.codePointAt s 0))
(set! *data-readers* (assoc *data-readers* 'u #'read-codepoint))
With that in place, the reader reads such literals as code point integers:
#u"🍒" ; => 127826
(Character/getName #u"🍒") ; => "CHERRIES"
‘Reader tags without namespace qualifiers are reserved for Clojure’, says the documentation … #u is short but perhaps not the most responsible choice.

How to escape double quote?

In org mode, if I want to format text as monospace verbatim, i.e. ~...~, and it is inside quotes: ~"..."~, it is not formatted (left as is).
Also, are quotes a reserved symbol, if so, what do they mean? (they don't seem to affect the generated HTML / inside Emacs display).
The culprit in this case is the regular expression in org-emph-re / org-verbatim-re, which is responsible for determining whether a sequence of characters in the document is to be set verbatim or not.
org-verbatim-re is a variable defined in `org.el'.
Its value is
"\([ ('\"{]\|^\)\(\([=~]\)\([^
\n,\"']\|[^
\n,\"'].?\(?:\n.?\)\{0,1\}[^
\n,\"']\)\3\)\([- .,:!?;'\")}\]\|$\)"
Quotes and double quotes are explicitly forbidden inside the verbatim characters =~ by
[^
\n,\"']\|[^
\n,\"']
I found discussions dating back 3 years coming to the conclusion that you have to tinker with this regular expression and set the variable org-emph-re/org-verbatim-re to something that matches your wishes in your Emacs setup (maybe a file local variable works as well). You can experiment by removing double quotes from the excluding character classes and outside matches, as in
"\([ ('{]\|^\)\(\([*/_=~+]\)\([^
\n,']\|[^
\n,'].?\(?:\n.?\)\{0,1\}[^
\n,']\)\3\)\([- .,:!?;')}\]\|$\)"
but looking at that regex, heaven knows what happens to complex documents -- you have to try...
Edit: as it happens, if I evaluate the following as a region, quotes inside = are exported correctly, but nothing else is :-). I'll investigate further when I have more time.
(setq org-emph-re "\([ ('{]\|^\)\(\([*/_=~+]\)\([^
\n,']\|[^
\n,'].?\(?:\n.?\)\{0,1\}[^
\n,']\)\3\)\([- .,:!?;')}]\|$\)")
Edit 2: Got it to work by changing org.el directly:
Change the line following (defvar org-emphasis-regexp-components from '(" \t('\"{" "- \t.,:!?;'\")}\\" " \t\r\n,\"'" "." 1) to '(" \t('{" "- \t.,:!?;')}\\" " \t\r\n,'" "." 1), recompile org, then restart Emacs.
This was a defcustom prior to the 8.0 release. It isn't anymore, so you have to live with this manual modification.
regards,
Tom
Finally, I found a solution from http://comments.gmane.org/gmane.emacs.orgmode/82571
According to that thread, the regexp for verbatim is built from the variable org-emphasis-regexp-components, which defines the legal characters before, after, at the border of, or in the body of emphasis; verbatim is one of the emphasis environments in org mode.
A workable setting given by that thread:
(setcar (nthcdr 2 org-emphasis-regexp-components) " \t\n,")
(custom-set-variables `(org-emphasis-alist ',org-emphasis-alist))
For a small number of characters which have some unwanted effect in Emacs org-mode (because they are metacharacters) it may be helpful to have a look at the special symbols in org-mode (org-entities.el).
So for example " can be encoded by \quot{} (where the braces pair at the end is not mandatory, but needed if no whitespace follows).
So instead of ="..."= you would write =\quot{}...\quot{}=.
That is more typing and looks pretty ugly. But for the latter org-mode has a solution: with C-c C-x \ you can toggle a display magic for those symbols. If the magic is active, a " will be displayed directly after typing \quot{}.
Besides, this symbols list can easily be extended, e.g.
(add-to-list 'org-entities
'("backslash" "\\textbackslash" nil "\\" "\\" "\\" "\\"))
Nevertheless, I sorely miss easier escaping in org-mode, besides the above solution and besides escaping a whole line with a : at its beginning.
I'd be happy if =verbatim= in all cases would leave the text between the ='s unchanged. Not =this*bold*text=, but =this *bold* text=. As we know it from any well-designed markup or markdown language.
But, of course, this is better placed at the org-mode development pages. Ideally with a fitting patch... :-)
I've met a similar problem, and thanks to @chaiko for a basic solution. However, @chaiko's solution only works for org-mode's fontification; it doesn't affect org-export. To get a correctly exported document, you need to apply one extra hack to org-mode's parser with (org-element--set-regexps).
So the full code snippet should be something like:
(setcar (nthcdr 2 org-emphasis-regexp-components) " \t\n\r")
(custom-set-variables `(org-emphasis-alist ',org-emphasis-alist))
(org-element--set-regexps)
I've integrated this to my oh-my-emacs project: https://github.com/xiaohanyu/oh-my-emacs/blob/e82fce10d47f7256df6d39e32ca288d0ec97a764/core/ome-org.org#code-block-fontification .