If strings are vectors, why are they immutable? - lisp

if strings are vectors of characters, and a vector's elements can be accessed using elt, and elt is setf-able - then why are strings immutable?

Strings are not immutable in Common Lisp, they are mutable:
* is the last result from the Lisp listener
CL-USER 5 > (make-string 10 :initial-element #\-)
"----------"
CL-USER 6 > (setf (aref * 5) #\|)
#\|
CL-USER 7 > **
"-----|----"
CL-USER 8 > (concatenate 'string "aa" '(#\b #\b))
"aabb"
CL-USER 9 > (setf (aref * 2) #\|)
#\|
CL-USER 10 > **
"aa|b"
The only thing you should not do is modifying literal strings in your code. The consequences are undefined. That's the same issue with other literal data.
For example a file contains:
(defparameter *lisp-prompt* "> ")
(defparameter *logo-prompt* "> ")
If we compile the file with compile-file, then the compiler might detect that those strings are equal and allocate only one string. It might put them into read-only memory. There are other issues as well.
Summary
Strings are mutable.
Don't modify literal strings in your code.

If you need mutability, don't use literal strings. Like so:
* (defparameter *s* (format nil "aaa"))
*S*
* (setf (elt *s* 1) #\b)
#\b
* *s*
"aba"

Related

What is the difference between #\ , ' and #'?

In Common Lisp, given that "a" is simply a character, what is the difference between #\a, 'a #'a?
My question comes from the tutorialspoint.com tutorial on Lisp. At one point the tutorial introduces:
; a character array with all initial elements set to a
; is a string actually
(write(make-array 10 :element-type 'character :initial-element #\a))
(terpri)
; a two dimensional array with initial values a
(setq myarray (make-array '(2 2) :initial-element 'a :adjustable t))
(write myarray)
(terpri)
With the output:
"aaaaaaaaaa"
#2A((A A) (A A))
#' is not included in this example but I'm including it in the question because it can be confusing as well. 🙂
Thank you very much! 😊
To start, a is not "simply a character." The Lisp reader parses #\a as the character literal a, which is an object in Common Lisp. Note that #\a and #\A are different character objects.
When the Lisp reader encounters a single quote, the expression following the single quote is not evaluated. Specifically, 'a is treated as (quote a), where quote returns its argument unevaluated. Now, a is a symbol, so 'a evaluates to that symbol. But the Lisp reader upcases most characters it reads by default, so 'a really evaluates to the symbol A. The good news is that whether you type a or A, the Lisp reader will read A (unless you mess with the readtable), and both 'a and 'A evaluate to the symbol A.
When the Lisp reader encounters #'a, the entire expression is treated as (function a), which when evaluated returns the function associated with the name a. But, note that it is an error to use function, and by extension #', on an identifier that does not denote a function.
To clarify this last part a bit, consider the following REPL interaction:
CL-USER> (defvar a 1)
A
CL-USER> a
1
CL-USER> #'a
The function COMMON-LISP-USER::A is undefined.
[Condition of type UNDEFINED-FUNCTION]
Here the variable a is defined and given the value 1, but when we try to access the function denoted by a we get an error message because there is no such function. Continuing:
; Evaluation aborted on #<UNDEFINED-FUNCTION A {1002DDC303}>.
CL-USER> (defun a (x) x)
A
CL-USER> (a 'b)
B
CL-USER> a
1
CL-USER> #'a
#<FUNCTION A>
Now we have defined a function named a that simply returns its argument. You can see that when we call a with an argument 'b we get the expected result: (a 'b) --> b. But, then when we evaluate a alone we still get 1. Symbols in Common Lisp are objects that have, among other cells, value cells and function cells. After the above interaction, the symbol a now has 1 in its value cell, and it has the function we have defined in its function cell. When the symbol a is evaluated the value cell is accessed, but when (function a) or #'a is evaluated, the function cell is accessed. You can see above that when #'a is evaluated, the function we defined is returned, and the REPL prints #<FUNCTION A> to show this.
As an aside, I wouldn't recommend using Tutorialspoint to learn Common Lisp. Glancing over the site, right away I see this:
LISP expressions are case-insensitive, cos 45 or COS 45 are same.
This is just wrong. And, Lisp is not written in all-caps. None of this inspires faith. Instead, find a good book. There are some recommendations on the common-lisp tag-info page.
#\
This is to introduce a character.
CL-USER> #\a
#\a
CL-USER> (character 'a)
#\A
CL-USER> (character "a")
#\a
'
This is quote, to quote and not evaluate things and construct object literals.
CL-USER> a
=> error: the variable a is unbound.
CL-USER> 'a
A
CL-USER> (inspect 'a)
The object is a SYMBOL.
0. Name: "A"
1. Package: #<PACKAGE "COMMON-LISP-USER">
2. Value: "unbound"
3. Function: "unbound"
4. Plist: NIL
> q
CL-USER> (equal (list 1 2) (quote (1 2))) ;; aka '(1 2)
T ;; but watch out with object literals constructed with quote, prefer constructor functions.
and #'
This is sharpsign-quote to reference a function.
CL-USER> #'a
=> error: The function COMMON-LISP-USER::A is undefined.
CL-USER> (defun a () (print "hello A"))
A
CL-USER> (a)
"hello A"
"hello A"
CL-USER> #'a
#<FUNCTION A>
CL-USER> (function a)
#<FUNCTION A>
One can ask Lisp to describe the data objects you've mentioned.
If we look at the expressions:
CL-USER 13 > (dolist (object (list '#\a ''a '#'a))
(terpri)
(describe object)
(terpri))
#\a is a CHARACTER
Name "Latin-Small-Letter-A"
Code 97
(QUOTE A) is a LIST
0 QUOTE
1 A
(FUNCTION A) is a LIST
0 FUNCTION
1 A
NIL
If we look at the evaluated expressions:
CL-USER 5 > (dolist (object (list #\a 'a #'a))
(terpri)
(describe object)
(terpri))
#\a is a CHARACTER
Name "Latin-Small-Letter-A"
Code 97
A is a SYMBOL
NAME "A"
VALUE #<unbound value>
FUNCTION #<interpreted function A 422005BD54>
PLIST NIL
PACKAGE #<The COMMON-LISP-USER package, 73/256 internal, 0/4 external>
#<interpreted function A 422005BD54> is a TYPE::INTERPRETED-FUNCTION
CODE (LAMBDA (B)
A)

symbols handling: cannot compare for identity

I don't get why
(setq a_sym 'abc)
(print (eq a_sym 'abc))
(print (eq 'x 'x))
(print (eq (first '('x 2 3)) 'x))
prints
T
T
NIL
Why the symbol 'x in the third statement is handled differently than second ? And, down to earth, how to compare them for identity ?
If you trace what you are comparing, you will see your mistake right away:
[1]> (eq (first '('x 2 3)) 'x)
NIL
[2]> (trace eq)
** - Continuable Error
TRACE(EQ): #<PACKAGE COMMON-LISP> is locked
If you continue (by typing 'continue'): Ignore the lock and proceed
The following restarts are also available:
ABORT :R1 Abort main loop
Break 1 [3]> :c
WARNING: TRACE: redefining function EQ in top-level, was defined in C
;; Tracing function EQ.
(EQ)
[4]> (eq (first '('x 2 3)) 'x)
1. Trace: (EQ ''X 'X) ; <======= note 1
1. Trace: EQ ==> NIL
NIL
[5]> (eq (first '(x 2 3)) 'x)
1. Trace: (EQ 'X 'X) ; <======= note 2
1. Trace: EQ ==> T
T
IOW, you are "overquoting" your x: when you type 'x, it's the same as (quote x) so, you are checking for equality of symbol x and list (quote x) and, of course, getting nil.
Notes:
(eq ''x 'x): since eq is a function, its arguments are evaluated and we are comparing 'x == (quote x) with x and getting nil.
(eq 'x 'x): for the same reason, we are comparing x with x and getting t.
Related:
Lisp quote work internally
Confused by Lisp Quoting
When to use 'quote in Lisp
Lisp: quoting a list of symbols' values
Syntax and reading
You wrote:
symbol 'x
Note that 'x is not a symbol. It's a quote character in front of a symbol. The quote character has a special meaning in s-expressions: read the next item and enclose it in (quote ...).
Thus 'x is really the list (quote x).
CL-USER 9 > (read-from-string "'x")
(QUOTE X)
2
Evaluation
A quoted object is not evaluated. The quote is a special operator, which means it is built-in syntax/semantics in Common Lisp and not a function and not a macro. The evaluator returns the quoted object as such:
CL-USER 10 > (quote x)
X
Your example
(eq (first (quote ((quote x) 2 3))) (quote x))
Let's evaluate the first part:
CL-USER 13 > (first (quote ((quote x) 2 3)))
(QUOTE X)
The result is the list (quote x).
Let's evaluate the second part:
CL-USER 14 > 'x
X
So the result is the symbol x.
x and (quote x) are not eq.
Evaluation and Quoting
'('x 2 3)
What would be the purpose of the second quote in the list?
The first quote already means that the WHOLE following data structure is not to be evaluated. Thus it is not necessary to quote a symbol inside to prevent its evaluation. If a list is quoted, none of its sublists or subelements are evaluated.
Summary
The quote is not part of the symbol. It is a built-in special operator used to prevent evaluation.

Can a Lisp read procedure read this and how?

I'm writing a grammar which I intend to implement in a Lisp read procedure, i.e. reading one expression at a time from an input source which is i.e. mutable. Most of the grammar is just like Lisp, but the two pertinent changes are:
Whitespace is read and is part of the resulting syntax. Contiguous whitespace is grouped together like contiguous non-whitespace characters are grouped as identifiers, and the result of reading such a string is a "whitespace object", which stores the exact sequence of characters read. The evaluator ignores whitespace objects when they appear in a list (in other words, if foo is a whitespace object then (eval '(+ 3 foo 4)) is equivalent to (eval '(+ 3 4))), and if it is asked to evaluate one directly, it is self-evaluating.
Secondly, if several tokens other than whitespace tokens appear on the same line, those tokens are collected into a list and that list is the result of the read.
e.g.,
+ 3 4 5
(+ 3 4 5)
+ 3 4 (+ 1 4)
(+ 3 4 (+ 1 4))
all produce the value 12.
Is it possible to implement this reader as a Lisp read procedure that follows the typical expectations of a read procedure? If so, how? (I'm at a loss.)
Edit: Clarification on whitespace:
If we say that a "whitespace object" is simply a string and read, then reading the following segment:
(foo bar baz)
produces a syntax object like:
'(foo " " bar " " baz)
In other words, the whitespace between tokens is stored in the resultant syntax object.
Suppose I write a macro named ->, which takes a syntax object (scheme style macro), and whitespace? is a predicate identifying whitespace syntax objects
(define-macro (-> stx)
(let* ((stxl (syntax-object->list stx))
(obj (car stxl))
(let proc ((res empty))
(lst (cdr stxl)))
(let ((method (car lst)))
(if (whitespace? method)
; skip whitespace, recur immediately
(proc res (cdr lst))
; Insert obj as the second element in method
(let ((modified-method (cons (car method)
(cons obj (cdr method)))))
; recur
(proc (cons res modified-method) (cdr lst))))))))
The reading part of this is pretty easy. You just need a whitespace test, and then your reading function will install a custom reader character macro that detects whitespace and reads consecutive sequences of whitespace into a single object. First, the whitespace test and a whitespace object; these are pretty simple:
(defparameter *whitespace*
#(#\space #\tab #\return #\newline)
"A vector of whitespace characters.")
(defun whitespace-p (char)
"Returns true if CHAR is in *WHITESPACE*."
(find char *whitespace* :test 'char=))
(defstruct whitespace-object
characters)
Now the macro character function:
(defun whitespace-macro-char (stream char)
"A macro character function that consumes characters from
stream (including CHAR), until a non-whitespace character (or end of
file) is encountered. Returns a whitespace-object whose characters
slot contains a string of the whitespace characters."
(let ((chars (loop for c = (peek-char nil stream nil #\a)
while (whitespace-p c)
collect (read-char stream))))
(make-whitespace-object
:characters (coerce (list* char chars) 'string))))
Now the read function just has the same signature as the normal read, but copies the readtable, then installs the macro function, and calls read. The result from read is returned, and the readtable is restored:
(defun xread (&optional (stream *standard-input*) (eof-error-p t) eof-value recursive-p)
"Like READ, but called with *READTABLE* bound to a readtable in
which each whitespace characters (that is, each character in
*WHITESPACE*) is a macro characters whose macro function is
WHITESPACE-MACRO-CHAR."
(let ((rt (copy-readtable)))
(map nil (lambda (wchar)
(set-macro-character wchar #'whitespace-macro-char))
*whitespace*)
(unwind-protect (read stream eof-error-p eof-value recursive-p)
(setf *readtable* rt))))
Example:
(with-input-from-string (in "(+ 1 2 (* 3
4))")
(xread in))
(+ #S(WHITESPACE-OBJECT :CHARACTERS " ") 1
#S(WHITESPACE-OBJECT :CHARACTERS " ") 2
#S(WHITESPACE-OBJECT :CHARACTERS " ")
(* #S(WHITESPACE-OBJECT :CHARACTERS " ") 3
#S(WHITESPACE-OBJECT
:CHARACTERS "
")
4))
Now, to implement the eval counterpart that you want, you need to be able to remove whitespace objects from lists. This isn't too hard, and we can write a slightly more general utility function to do it for us:
(defun remove-element-if (predicate tree)
"Returns a new tree like TREE, but which contains no elements in an
element position which ssatisfy PREDICATE. An element is in element
position if it is the car of some cons cell in TREE."
(if (not (consp tree))
tree
(if (funcall predicate (car tree))
(remove-element-if predicate (cdr tree))
(cons (remove-element-if predicate (car tree))
(remove-element-if predicate (cdr tree))))))
CL-USER> (remove-element-if (lambda (x) (and (numberp x) (evenp x))) '(+ 1 2 3 4))
(+ 1 3)
CL-USER> (with-input-from-string (in "(+ 1 2 (* 3
4))")
(remove-element-if 'whitespace-object-p (xread in)))
(+ 1 2 (* 3 4))
So now the evaluation function is a simple wrapper around eval:
(defun xeval (form)
(eval (remove-element-if 'whitespace-object-p form)))
CL-USER> (with-input-from-string (in "(+ 1 2 (* 3
4))")
(xeval (xread in)))
15
Let's make sure that standalone whitespace objects still appear as expected:
CL-USER> (with-input-from-string (in " ")
(let* ((exp (xread in))
(val (xeval exp)))
(values exp val)))
#S(WHITESPACE-OBJECT :CHARACTERS " ")
#S(WHITESPACE-OBJECT :CHARACTERS " ")

Convert char to number

I'm in the process of reading a flat file - to use the characters read I want to convert them into numbers. I wrote a little function that converts a string to a vector:
(defun string-to-vec (strng)
(setf strng (remove #\Space strng))
(let ((vec (make-array (length strng))))
(dotimes (i (length strng) vec)
(setf (svref vec i) (char strng i)))))
However this returns a vector with character entries. Short of using char-code to convert unit number chars to numbers in a function, is there a simple way to read numbers as numbers from a file?
In addition to Rainer's answer, let me mention read-from-string (note that Rainer's code is more efficient than repeated application of read-from-string because it only creates a stream once) and parse-integer (alas, there is no parse-float).
Note that if you are reading a CSV file, you should probably use an off-the-shelf library instead of writing your own.
Above is shorter:
? (map 'vector #'identity (remove #\Space "123"))
#(#\1 #\2 #\3)
You can convert a string:
(defun string-to-vector-of-numbers (string)
(coerce
(with-input-from-string (s string)
(loop with end = '#:end
for n = (read s nil end)
until (eql n end)
unless (numberp n) do (error "Input ~a is not a number." n)
collect n))
'vector))
But it would be easier to read the numbers directly form the file. Use READ, which can read numbers.
Note that read-like functions are affected by reader macros.
Pick an example:
* (defvar *foo* 'bar)
*FOO*
* (read-from-string "#.(setq *foo* 'baz)")
BAZ
19
* *foo*
BAZ
As you can see read-from-string can implicitly set a variable. You can disable the #. reader macro by setting *read-eval* to nil but anyway if you have only integers on the input then consider using parse-integer instead.

How do I convert a string to a symbol for use as a key in the Lisp "assoc" function?

I have this association-list in Common Lisp:
(defvar base-list (list (cons 'a 0) (cons 2 'c)))
I have to call assoc when my argument is of type string.
For the pair (A . 0) I have to convert "a" to a symbol, and for the pair (2 . C) I have to convert "2" to a symbol. How can I do that?
This should work like this:
CL-USER 28 : 1 > (assoc (convert-string-to-symbol "a") base-list)
(A . 0)
CL-USER 28 : 1 > (assoc (convert-number-to-symbol "2") base-list)
(2 . C)
I tried using intern but got NIL:
CL-USER 29 : 1 > (assoc (intern "a") base-list)
NIL
The function you want is called read-from-string:
CL-USER> (read-from-string "a")
A
1
CL-USER> (read-from-string "2")
2
1
CL-USER>
Note that solutions based on using intern or find-symbol would not work for strings representing numbers (e.g., "2") on most implementations.
You were close with intern; you just had the case wrong. Try this:
> (assoc (intern "A") base-list)
(A . 0)
Note that here the name-as-string is capitalized.
Alternately, you could use find-symbol to look for an existing symbol by name:
> (assoc (find-symbol "A") base-list)
(A . 0)
The key here is that when you wrote your original defvar form, the reader read the string "a" and—by virtue of the current readtable case—converted the symbol name to be uppercase. Symbols with names of different case are not equal. It just so happens that at read time the reader is projecting what you wrote (lowercase) to something else (uppercase).
You can inspect the current case conversion policy for the current reader using the readtable-case function:
> (readtable-case *readtable*)
:UPCASE
To learn more about how the readtable case and the reader interact, see the discussion in section 23.1.2 of the Hyperspec.