Structure of X-expression in Racket

Racket represents XML data as an X-expression: http://docs.racket-lang.org/xml/index.html?q=#(def._((lib._xml/private/xexpr-core..rkt)._xexpr~3f))
which is defined as follows:
xexpr = string
| (list symbol (list (list symbol string) ...) xexpr ...)
| (cons symbol (list xexpr ...))
| symbol
| valid-char?
| cdata
| misc
In the second alternative, why is it (list symbol string) and not (cons symbol string)? Is there any specific reason to use list instead of cons? If not, would there be any advantage in using cons instead of list?

It's really just a preference on the part of the X-expression designers, probably to mirror let's syntax (which also uses proper lists only).
With (list symbol string), you'd represent <a href="http://stackoverflow.com/">Stack Overflow</a> as:
(a ((href "http://stackoverflow.com/")) "Stack Overflow")
whereas with (cons symbol string), it'd be:
(a ((href . "http://stackoverflow.com/")) "Stack Overflow")
Some would consider the "dot" an ugly thing to see.

Related

What is the difference between #\ , ' and #'?

In Common Lisp, given that "a" is simply a character, what is the difference between #\a, 'a, and #'a?
My question comes from the tutorialspoint.com tutorial on Lisp. At one point the tutorial introduces:
; a character array with all initial elements set to a
; is a string actually
(write(make-array 10 :element-type 'character :initial-element #\a))
(terpri)
; a two dimensional array with initial values a
(setq myarray (make-array '(2 2) :initial-element 'a :adjustable t))
(write myarray)
(terpri)
With the output:
"aaaaaaaaaa"
#2A((A A) (A A))
#' is not included in this example but I'm including it in the question because it can be confusing as well. 🙂
Thank you very much! 😊
To start, a is not "simply a character." The Lisp reader parses #\a as the character literal a, which is an object in Common Lisp. Note that #\a and #\A are different character objects.
When the Lisp reader encounters a single quote, the expression following the single quote is not evaluated. Specifically, 'a is treated as (quote a), where quote returns its argument unevaluated. Now, a is a symbol, so 'a evaluates to that symbol. But the Lisp reader upcases most characters it reads by default, so 'a really evaluates to the symbol A. The good news is that whether you type a or A, the Lisp reader will read A (unless you mess with the readtable), and both 'a and 'A evaluate to the symbol A.
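The case behaviour described above can be checked directly. This is a small sketch that assumes the default readtable case (:upcase); the vertical-bar syntax is standard reader syntax for writing a symbol name without case conversion.

```lisp
;; Characters are case-sensitive objects.
(char= #\a #\A)        ; false: two distinct characters
(char-equal #\a #\A)   ; true: char-equal ignores case

;; With the default readtable case (:upcase), a and A read as the same symbol.
(eq 'a 'A)             ; true
(symbol-name 'a)       ; the string "A"

;; Vertical bars suppress the case conversion, giving a different symbol.
(eq '|a| 'a)           ; false
(symbol-name '|a|)     ; the string "a"
```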
When the Lisp reader encounters #'a, the entire expression is treated as (function a), which when evaluated returns the function associated with the name a. But, note that it is an error to use function, and by extension #', on an identifier that does not denote a function.
To clarify this last part a bit, consider the following REPL interaction:
CL-USER> (defvar a 1)
A
CL-USER> a
1
CL-USER> #'a
The function COMMON-LISP-USER::A is undefined.
[Condition of type UNDEFINED-FUNCTION]
Here the variable a is defined and given the value 1, but when we try to access the function denoted by a we get an error message because there is no such function. Continuing:
; Evaluation aborted on #<UNDEFINED-FUNCTION A {1002DDC303}>.
CL-USER> (defun a (x) x)
A
CL-USER> (a 'b)
B
CL-USER> a
1
CL-USER> #'a
#<FUNCTION A>
Now we have defined a function named a that simply returns its argument. You can see that when we call a with an argument 'b we get the expected result: (a 'b) --> b. But, then when we evaluate a alone we still get 1. Symbols in Common Lisp are objects that have, among other cells, value cells and function cells. After the above interaction, the symbol a now has 1 in its value cell, and it has the function we have defined in its function cell. When the symbol a is evaluated the value cell is accessed, but when (function a) or #'a is evaluated, the function cell is accessed. You can see above that when #'a is evaluated, the function we defined is returned, and the REPL prints #<FUNCTION A> to show this.
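The two cells can also be read and written through the standard accessors symbol-value and symbol-function. The following is a minimal sketch; demo-name is a hypothetical symbol chosen only for illustration, so that it does not clash with the a used in the REPL session above.

```lisp
;; DEMO-NAME is a hypothetical symbol used only for this illustration.
(setf (symbol-value 'demo-name) 1)                  ; fill the value cell
(setf (symbol-function 'demo-name) (lambda (x) x))  ; fill the function cell

(symbol-value 'demo-name)                   ; 1, read from the value cell
(funcall (symbol-function 'demo-name) 'b)   ; B, called through the function cell
(funcall #'demo-name 'b)                    ; B, #' names the same global function

(boundp 'demo-name)                         ; true: the symbol has a value
(fboundp 'demo-name)                        ; true: the symbol names a function
```

The value and the function live side by side under one name, which is why evaluating a and evaluating #'a in the session above give unrelated results.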
As an aside, I wouldn't recommend using Tutorialspoint to learn Common Lisp. Glancing over the site, right away I see this:
LISP expressions are case-insensitive, cos 45 or COS 45 are same.
This is just wrong. And, Lisp is not written in all-caps. None of this inspires faith. Instead, find a good book. There are some recommendations on the common-lisp tag-info page.
#\
This is to introduce a character.
CL-USER> #\a
#\a
CL-USER> (character 'a)
#\A
CL-USER> (character "a")
#\a
'
This is quote, to quote and not evaluate things and construct object literals.
CL-USER> a
=> error: the variable a is unbound.
CL-USER> 'a
A
CL-USER> (inspect 'a)
The object is a SYMBOL.
0. Name: "A"
1. Package: #<PACKAGE "COMMON-LISP-USER">
2. Value: "unbound"
3. Function: "unbound"
4. Plist: NIL
> q
CL-USER> (equal (list 1 2) (quote (1 2))) ;; aka '(1 2)
T ;; but watch out with object literals constructed with quote, prefer constructor functions.
and #'
This is sharpsign-quote to reference a function.
CL-USER> #'a
=> error: The function COMMON-LISP-USER::A is undefined.
CL-USER> (defun a () (print "hello A"))
A
CL-USER> (a)
"hello A"
"hello A"
CL-USER> #'a
#<FUNCTION A>
CL-USER> (function a)
#<FUNCTION A>
One can ask Lisp to describe the data objects you've mentioned.
If we look at the expressions:
CL-USER 13 > (dolist (object (list '#\a ''a '#'a))
(terpri)
(describe object)
(terpri))
#\a is a CHARACTER
Name "Latin-Small-Letter-A"
Code 97
(QUOTE A) is a LIST
0 QUOTE
1 A
(FUNCTION A) is a LIST
0 FUNCTION
1 A
NIL
If we look at the evaluated expressions:
CL-USER 5 > (dolist (object (list #\a 'a #'a))
(terpri)
(describe object)
(terpri))
#\a is a CHARACTER
Name "Latin-Small-Letter-A"
Code 97
A is a SYMBOL
NAME "A"
VALUE #<unbound value>
FUNCTION #<interpreted function A 422005BD54>
PLIST NIL
PACKAGE #<The COMMON-LISP-USER package, 73/256 internal, 0/4 external>
#<interpreted function A 422005BD54> is a TYPE::INTERPRETED-FUNCTION
CODE (LAMBDA (B)
A)

Common Lisp: read each input character as a list element

Lisp newbie here.
I want to read from standard-in a string of characters such as:
aabc
I want to convert that input into a list, where each character becomes a list element:
(a a b c)
And I want the list assigned to a global variable, text.
I created this function:
(defun get-line ()
(setf text (read)))
but that just results in assigning a single symbol to text, not tokenizing the input into a list of symbols.
What's the right way to implement get-line() please?
Here you go: First using coerce to convert the string to a list of characters, then mapcar to convert each character to a string.
(defun get-line ()
(setf text (mapcar 'string (coerce (string (read)) 'list))))
(loop
for x = (string-upcase (string (read-char)))
while (not (equal " " x))
collecting (intern x))
Note the upcase is there because the reader upcases symbol names by default. Symbols in CL are case sensitive, so interning "a" as-is would yield the symbol |a| rather than A.
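For comparison, here is a sketch that reads a whole line as text with read-line and produces symbols rather than strings. The names *text*, line->symbols, and read-line-as-symbols are hypothetical; the upcasing step assumes you want the same symbols the default reader would produce.

```lisp
(defvar *text* nil
  "Hypothetical global holding the most recently read list of symbols.")

(defun line->symbols (line)
  "Return a list with one symbol per character of LINE, upcased like the default reader."
  (map 'list
       (lambda (ch) (intern (string (char-upcase ch))))
       line))

(defun read-line-as-symbols (&optional (stream *standard-input*))
  "Read one line from STREAM, store its characters as symbols in *TEXT*, and return them."
  (setf *text* (line->symbols (read-line stream))))

;; Example with a string stream standing in for standard input:
(with-input-from-string (in "aabc")
  (read-line-as-symbols in))   ; the list (A A B C)
```

Using read-line instead of read avoids running the input through the Lisp reader, so spaces and punctuation in the line become ordinary characters instead of being parsed as Lisp syntax.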

Common Lisp: How to quote parentheses in SBCL

In Common Lisp, the special operator quote leaves whatever follows it unevaluated, like
(quote a) -> a
(quote {}) -> {}
But why does the form (quote ()) give me nil? I'm using SBCL 1.2.6 and this is what I got in the REPL:
CL-USER> (quote ())
NIL
More about this problem: This is some code from PCL Chapter 24
(defun as-keyword (sym)
(intern (string sym) :keyword))
(defun slot->defclass-slot (spec)
(let ((name (first spec)))
`(,name :initarg ,(as-keyword name) :accessor ,name)))
(defmacro define-binary-class (name slots)
`(defclass ,name ()
,(mapcar #'slot->defclass-slot slots)))
The macro expansion of the following code:
(define-binary-class id3-tag
((major-version)))
is
(DEFCLASS ID3-TAG NIL
((MAJOR-VERSION :INITARG :MAJOR-VERSION :ACCESSOR MAJOR-VERSION)))
which has NIL rather than () after the class name ID3-TAG.
nil and () are two ways to express the same concept (the empty list).
Traditionally, nil is used to emphasize the boolean value "false" rather than the empty list, and () is used the other way around.
The Common LISP HyperSpec says:
() ['nil], n. an alternative notation for writing the symbol nil, used
to emphasize the use of nil as an empty list.
Your observation is due to an object having more than one representation. In Common Lisp the reader (which reads both code and data expressions) parses text into structure and data. When it's data, the printer can write it out again, but it won't know exactly how the data was represented when it was initially read in. The printer will print one object exactly one way, following defaults and settings, even though there are several representations for that object.
As you noticed, nil, NIL, nIL, NiL, ..., and () are all read as the very same object, and 'nil, 'NIL, and '() all evaluate to it. I'm not sure the standard dictates exactly what its default printed representation should be, so I guess some implementations choose one of NIL, nil or maybe even ().
With cons the representation is dependent on the cdr being a cons/nil or not:
'(a . nil) ; ==> (a)
'(a . (b . c)) ; ==> (a b . c)
'(a . (b . nil)) ; ==> (a b)
With numbers, the reader can get hints about which base you are using. If no base is given in the text, it will use whatever *read-base* is:
(let ((*read-base* 2)) ; read numbers as binary
  (read-from-string "(10 #x10)")) ; ==> (2 16)
#x tells the reader to interpret the rest as a hexadecimal value. Now if your *print-base* had been 4, the answer to the above would have been printed as (2 100).
To sum it up: a single value in Common Lisp may have several good representations, and all of them yield the very same value. How the value is printed depends on the implementation, on settings, and even on arguments to the functions that produce the output. Neither the forms it accepts as input nor the different ways it can print the value tell you anything about how the value is actually stored internally.
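The points above can be tried at the REPL. This is a small sketch using only standard functions and variables; it deliberately makes no claim about whether a given implementation prints the empty list as NIL or ().

```lisp
;; NIL and () denote the same object.
(eq 'nil '())                        ; true
(eq (read-from-string "()") nil)     ; true

;; Dotted notation and list notation describe the same structure.
(equal '(a . (b . nil)) '(a b))      ; true

;; *read-base* and *print-base* change only the text, not the numbers.
(let ((*read-base* 2))
  (read-from-string "(10 #x10)"))    ; the list (2 16)
(let ((*print-base* 4))
  (prin1-to-string (list 2 16)))     ; the string "(2 100)"
```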

extend racket syntax for backus-naur-forms

I have a set of derivation rules implemented in Racket. We can assume that there aren't any optional alternatives, which means there are no rules containing a pipe (|) on the right-hand side of ::= in BNF.
In racket, I have got something like this:
(define *rules*
'((S . ("b" "a"))
(B . ("a"))
(C . (S B))))
Please note that terminal symbols are implemented in the form of Racket strings, nonterminal symbols in the form of Racket symbols. Now, I'd like to import these rules from another Racket file containing rules in Backus-Naur syntax:
S ::= ba
B ::= a
C ::= SB
(capital letter = nonterminal)
Therefore, I need to extend the racket syntax. I have no idea how to handle that. Can you help me? It shouldn't be that much code ...
I think you're looking to parse a file that's written using the BNF syntax, and produce an s-expression version; is that right?
If so, it shouldn't be hard. In particular, the format implied by your question is that every line is of the form
<NT> ::= [<NT>|<T>]*
... which you could take apart like this:
#lang racket
;; COPYRIGHT 2012 John B. Clements (clements#brinckerhoff.org)
;; Licensed under the Apache License, version 2.
;; (You're free to use it, but your source code has to include
;; my authorship.)
(require rackunit)
(define example
(list "S ::= ba"
"B ::= a"
"C ::= SB"))
;; parse a single line:
;; string -> (list/c symbol? (listof (or/c string? symbol?)))
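(define (parse-line l)
  (match (regexp-match #px"^([A-Z]) ::= ([A-Za-z]*)$" l)
    [(list _ lhs rhses)
     ;; regexp-match returns strings, so convert the left-hand side to a symbol
     (list (string->symbol lhs) (map parse-char (string->list rhses)))]))
;; parse a single char:
;; char -> (or/c symbol? string?)
;; uppercase letters are nonterminals (symbols), anything else is a terminal (string)
(define (parse-char ch)
  (if (char-upper-case? ch)
      (string->symbol (string ch))
      (string ch)))
(check-equal? (map parse-line example)
              '((S ("b" "a"))
                (B ("a"))
                (C (S B))))
Two things to watch for: regexp-match returns strings, so the left-hand side needs string->symbol, and rackunit provides check-equal? rather than check-expect.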

How do I convert a string to a symbol for use as a key in the Lisp "assoc" function?

I have this association-list in Common Lisp:
(defvar base-list (list (cons 'a 0) (cons 2 'c)))
I have to call assoc when my argument is of type string.
For the pair (A . 0) I have to convert "a" to a symbol, and for the pair (2 . C) I have to convert "2" to a symbol. How can I do that?
This should work like this:
CL-USER 28 : 1 > (assoc (convert-string-to-symbol "a") base-list)
(A . 0)
CL-USER 28 : 1 > (assoc (convert-number-to-symbol "2") base-list)
(2 . C)
I tried using intern but got NIL:
CL-USER 29 : 1 > (assoc (intern "a") base-list)
NIL
The function you want is called read-from-string:
CL-USER> (read-from-string "a")
A
1
CL-USER> (read-from-string "2")
2
1
CL-USER>
Note that solutions based on using intern or find-symbol would not work for strings representing numbers (e.g., "2") on most implementations.
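Wrapped up as a helper, the read-from-string approach might look like the sketch below. The names string->key and *base-list* are hypothetical (the question uses base-list). Binding *read-eval* to nil is an added precaution that disables the #. reader macro; reading untrusted strings with the Lisp reader can still intern arbitrary symbols, so treat this as a sketch for trusted input.

```lisp
;; Hypothetical stand-in for the question's association list.
(defparameter *base-list* (list (cons 'a 0) (cons 2 'c)))

(defun string->key (string)
  "Read STRING with the Lisp reader and return the resulting object (symbol, number, ...)."
  (let ((*read-eval* nil))              ; disable #. evaluation while reading
    (values (read-from-string string))))

(assoc (string->key "a") *base-list*)   ; (A . 0)
(assoc (string->key "2") *base-list*)   ; (2 . C)
```

Because the reader is doing the conversion, "a" becomes the symbol A and "2" becomes the integer 2, which is exactly what the keys in the list are.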
You were close with intern; you just had the case wrong. Try this:
> (assoc (intern "A") base-list)
(A . 0)
Note that here the name-as-string is capitalized.
Alternately, you could use find-symbol to look for an existing symbol by name:
> (assoc (find-symbol "A") base-list)
(A . 0)
The key here is that when you wrote your original defvar form, the reader read the token a and, by virtue of the current readtable case, converted the symbol name to uppercase. Symbols whose names differ in case are not the same symbol. It just so happens that at read time the reader maps what you wrote (lowercase) to something else (uppercase).
You can inspect the current case conversion policy for the current reader using the readtable-case function:
> (readtable-case *readtable*)
:UPCASE
To learn more about how the readtable case and the reader interact, see the discussion in section 23.1.2 of the Hyperspec.