Circular pointers in lisp - lisp

In working through the CLRS Intro to Algorithms book and attempting to implement a red-black binary search tree in common lisp, I came across the following issue with circular pointers:
(defstruct node (ptr nil))
(defparameter *x* (make-node))
(defparameter *y* (make-node :ptr *x*))
(setf (node-ptr *x*) *y*)
This code leads to a heap-exhausted-error, presumably due to an infinite recursion caused by having a pointer to a pointer which points back to that pointer, etc.
Is there a way to prevent this infinite recursion from happening while maintaining the pointer structure given here?
I know there are other ways to implement red-black trees (without using setf, for example), but I am interested in reproducing the imperative style in CLRS as common lisp is a multi-paradigm language.
PS. The BST in CLRS have parent pointers in addition to the usual left-child and right-child pointers.

There is no problem with circularity in Lisp. Common Lisp even has a special syntax for expressing it at read-time: For instance something like #1=(#1# . #1#) is a cons, both of whose elements are the cons itself: you could construct this explicitly by an expression like
(let ((c (cons nil nil)))
(setf (car c) c
(cdr c) c))
However there is a problem when printing structure which may contain circularity. In order to do this correctly you need something called an occurs check: as you are printing objects (in particular objects which have components) you need to keep track of whether you have seen this object already, and if you have you arrange to print a reference, which CL does by printing #n#, where n is an integer which tells the reader -- both the human reader and the Lisp reader -- which object this corresponds to, so they/it can reconstruct the structure, including its sharing. What's worse is that you have to either annotate each possibly-shared object (with #n=) when you start printing it, which would be terrible, or avoid printing anything until you have run over all the objects to know which ones you need to so annotate.
An occurs check is computationally expensive in space (or it seemed so in the 1980s when CL was being standardised: it still can be for very large structures of course), so it is not the default behaviour of the CL printer, but it is controlled by a special variable *print-circle*: if this is true then circularity (and in fact general shared-structure) detection is done by the printer; if it is false, it is not, and the printer may loop or recurse unboundedly when printing circular structure.
Note that the problem is more general than circularity: how should this object be printed:
(let ((c1 (cons nil nil))
(c2 (cons nil nil)))
(setf (car c1) c2
(cdr c1) c2))
Well, we can construct this explicitly like this:
(#1=(nil) . #1#)
And this is how it will be printed, if *print-circle* is true, because in that case the reader detects shared structure and prints it properly. Here is a demonstration of all this:
(let ((unshared (cons (cons nil nil) (cons nil nil)))
(shared (cons (cons nil nil) nil))
(circular (cons nil nil)))
;; construct the sharing explicitly rather than via syntax
(setf (cdr shared) (car shared)
(car circular) circular
(cdr circular) circular)
(with-standard-io-syntax
(let ((*print-circle* nil)) ;it is anyway
;; don't print cicrular!
(pprint unshared)
(pprint shared))
(let ((*print-circle* t))
(pprint unshared)
(pprint shared)
(pprint circular)))
;; be careful not to return anything possibly toxic to the printer
(values))
This will print
((NIL) NIL)
((NIL) NIL)
((NIL) NIL)
(#1=(NIL) . #1#)
#1=(#1# . #1#)
Note that certain very old Lisps (notably InterLisp) used reference-counting for storage management and while the language had no problems with circularity the reference-counter did. I'm sure no modern language uses reference-counting however.

Related

lisp: concat arbitrary number of lists

In my ongoing quest to recreate lodash in lisp as a way of getting familiar with the language I am trying to write a concat-list function that takes an initial list and an arbitrary number of additional lists and concatenates them.
I'm sure that this is just a measure of getting familiar with lisp convention, but right now my loop is just returning the second list in the argument list, which makes sense since it is the first item of other-lists.
Here's my non-working code (edit: refactored):
(defun concat-list (input-list &rest other-lists)
;; takes an arbitrary number of lists and merges them
(loop
for list in other-lists
append list into input-list
return input-list
)
)
Trying to run (concat-list '(this is list one) '(this is list two) '(this is list three)) and have it return (this is list one this is list two this is list three).
How can I spruce this up to return the final, merged list?
The signature of your function is a bit unfortunate, it becomes easier if you don't treat the first list specially.
The easy way:
(defun concat-lists (&rest lists)
(apply #'concatenate 'list lists))
A bit more lower level, using loop:
(defun concat-lists (&rest lists)
(loop :for list :in lists
:append list))
Going lower, using dolist:
(defun concat-lists (&rest lists)
(let ((result ()))
(dolist (list lists (reverse result))
(setf result (revappend list result)))))
Going even lower would maybe entail implementing revappend yourself.
It's actually good style in Lisp not to use LABELS based iteration, since a) it's basically a go-to like low-level iteration style and it's not everywhere supported. For example the ABCL implementation of Common Lisp on the JVM does not support TCO last I looked. Lisp has wonderful iteration facilities, which make the iteration intention clear:
CL-USER 217 > (defun my-append (&rest lists &aux result)
(dolist (list lists (nreverse result))
(dolist (item list)
(push item result))))
MY-APPEND
CL-USER 218 > (my-append '(1 2 3) '(4 5 6) '(7 8 9))
(1 2 3 4 5 6 7 8 9)
Some pedagogical solutions to this problem
If you just want to do this, then use append, or nconc (destructive), which are the functions which do it.
If you want to learn how do to it, then learning about loop is not how to do that, assuming you want to learn Lisp: (loop for list in ... append list) really teaches you nothing but how to write a crappy version of append using arguably the least-lispy part of CL (note I have nothing against loop & use it a lot, but if you want to learn lisp, learning loop is not how to do that).
Instead why not think about how you would write this if you did not have the tools to do it, in a Lispy way.
Well, here's how you might do that:
(defun append-lists (list &rest more-lists)
(labels ((append-loop (this more results)
(if (null this)
(if (null more)
(nreverse results)
(append-loop (first more) (rest more) results))
(append-loop (rest this) more (cons (first this) results)))))
(append-loop list more-lists '())))
There's a dirty trick here: I know that results is completely fresh so I am using nreverse to reverse it, which does so destructively. Can we write nreverse? Well, it's easy to write reverse, the non-destructive variant:
(defun reverse-nondestructively (list)
(labels ((r-loop (tail reversed)
(if (null tail)
reversed
(r-loop (rest tail) (cons (first tail) reversed)))))
(r-loop list '())))
And it turns out that a destructive reversing function is only a little harder:
(defun reverse-destructively (list)
(labels ((rd-loop (tail reversed)
(if (null tail)
reversed
(let ((rtail (rest tail)))
(setf (rest tail) reversed)
(rd-loop rtail tail)))))
(rd-loop list '())))
And you can check it works:
> (let ((l (make-list 1000 :initial-element 1)))
(time (reverse-destructively l))
(values))
Timing the evaluation of (reverse-destructively l)
User time = 0.000
System time = 0.000
Elapsed time = 0.000
Allocation = 0 bytes
0 Page faults
Why I think this is a good approach to learning Lisp
[This is a response to a couple of comments which I thought was worth adding to the answer: it is, of course, my opinion.]
I think that there are at least three different reasons for wanting to solve a particular problem in a particular language, and the approach you might want to take depends very much on what your reason is.
The first reason is because you want to get something done. In that case you want first of all to find out if it has been done already: if you want to do x and the language a built-in mechanism for doing x then use that. If x is more complicated but there is some standard or optional library which does it then use that. If there's another language you could use easily which does x then use that. Writing a program to solve the problem should be something you do only as a last resort.
The second reason is because you've fallen out of the end of the first reason, and you now find yourself needing to write a program. In that case what you want to do is use all of the tools the language provides in the best way to solve the problem, bearing in mind things like maintainability, performance and so on. In the case of CL, then if you have some problem which naturally involves looping, then, well, use loop if you want to. It doesn't matter whether loop is 'not lispy' or 'impure' or 'hacky': just do what you need to do to get the job done and make the code maintainable. If you want to print some list of objects, then by all means write (format t "~&~{~A~^, ~}~%" things).
The third reason is because you want to learn the language. Well, assuming you can program in some other language there are two approaches to doing this.
the first is to say 'I know how to do this thing (write loops, say) in languages I know – how do I do it in Lisp?', and then iterate this for all the thing you already know how to do in some other language;
the second is to say 'what is it that makes Lisp distinctive?' and try and understand those things.
These approaches result in very approaches to learning. In particular I think the first approach is often terrible: if the language you know is, say, Fortran, then you'll end up writing Fortran dressed up as Lisp. And, well, there are perfectly adequate Fortran compilers out there: why not use them? Even worse, you might completely miss important aspects of the language and end up writing horrors like
(defun sum-list (l)
(loop for i below (length l)
summing (nth i l)))
And you will end up thinking that Lisp is slow and pointless and return to the ranks of the heathen where you will spread such vile calumnies until, come the great day, the golden Lisp horde sweeps it all away. This has happened.
The second approach is to ask, well, what are the things that are interesting about Lisp? If you can program already, I think this is a much better approach to the first, because learning the interesting and distinctive features of a language first will help you understand, as quickly as possible, whether its a language you might actually want to know.
Well, there will inevitably be argument about what the interesting & distinctive features of Lisp are, but here's a possible, partial, set.
The language has a recursively-defined data structure (S expressions or sexprs) at its heart, which is used among other things to represent the source code of the language itself. This representation of the source is extremely low-commitment: there's nothing in the syntax of the language which says 'here's a block' or 'this is a conditiona' or 'this is a loop'. This low-commitment can make the language hard to read, but it has huge advantages.
Recursive processes are therefore inherently important and the language is good at expressing them. Some variants of the language take this to the extreme by noticing that iteration is simply a special case of recursion and have no iterative constructs at all (CL does not do this).
There are symbols, which are used as names for things both in the language itself and in programs written in the language (some variants take this more seriously than others: CL takes it very seriously).
There are macros. This really follows from the source code of the language being represented as sexprs and this structure having a very low commitment to what it means. Macros, in particular, are source-to-source transformations, with the source being represented as sexprs, written in the language itself: the macro language of Lisp is Lisp, without restriction. Macros allow the language itself to be seamlessly extended: solving problems in Lisp is done by designing a language in which the problem can be easily expressed and solved.
The end result of this is, I think two things:
recursion, in addition to and sometimes instead of iteration is an unusually important technique in Lisp;
in Lisp, programming means building a programming language.
So, in the answer above I've tried to give you examples of how you might think about solving problems involving a recursive data structure recursively: by defining a local function (append-loop) which then recursively calls itself to process the lists. As Rainer pointed out that's probably not a good way of solving this problem in Common Lisp as it tends to be hard to read and it also relies on the implementation to turn tail calls into iteration which is not garuanteed in CL. But, if your aim is to learn to think the way Lisp wants you to think, I think it is useful: there's a difference between code you might want to write for production use, and code you might want to read and write for pedagogical purposes: this is pedagogical code.
Indeed, it's worth looking at the other half of how Lisp might want you to think to solve problems like this: by extending the language. Let's say that you were programming in 1960, in a flavour of Lisp which has no iterative constructs other than GO TO. And let's say you wanted to process some list iteratively. Well, you might write this (this is in CL, so it is not very like programming in an ancient Lisp would be: in CL tagbody establishes a lexical environment in the body of which you can have tags – symbols – and then go will go to those tags):
(defun print-list-elements (l)
;; print the elements of a list, in order, using GO
(let* ((tail l)
(current (first tail)))
(tagbody
next
(if (null tail)
(go done)
(progn
(print current)
(setf tail (rest tail)
current (first tail))
(go next)))
done)))
And now:
> (print-list-elements '(1 2 3))
1
2
3
nil
Let's program like it's 1956!
So, well, let's say you don't like writing this sort of horror. Instead you'd like to be able to write something like this:
(defun print-list-elements (l)
;; print the elements of a list, in order, using GO
(do-list (e l)
(print e)))
Now if you were using most other languages you need to spend several weeks mucking around with the compiler to do this. But in Lisp you spend a few minutes writing this:
(defmacro do-list ((v l &optional (result-form nil)) &body forms)
;; Iterate over a list. May be buggy.
(let ((tailn (make-symbol "TAIL"))
(nextn (make-symbol "NEXT"))
(donen (make-symbol "DONE")))
`(let* ((,tailn ,l)
(,v (first ,tailn)))
(tagbody
,nextn
(if (null ,tailn)
(go ,donen)
(progn
,#forms
(setf ,tailn (rest ,tailn)
,v (first ,tailn))
(go ,nextn)))
,donen
,result-form))))
And now your language has an iteration construct which it previously did not have. (In real life this macro is called dolist).
And you can go further: given our do-list macro, let's see how we can collect things into a list:
(defun collect (thing)
;; global version: just signal an error
(declare (ignorable thing))
(error "not collecting"))
(defmacro collecting (&body forms)
;; Within the body of this macro, (collect x) will collect x into a
;; list, which is returned from the macro.
(let ((resultn (make-symbol "RESULT"))
(rtailn (make-symbol "RTAIL")))
`(let ((,resultn '())
(,rtailn nil))
(flet ((collect (thing)
(if ,rtailn
(setf (rest ,rtailn) (list thing)
,rtailn (rest ,rtailn))
(setf ,resultn (list thing)
,rtailn ,resultn))
thing))
,#forms)
,resultn)))
And now we can write the original append-lists function entirely in terms of constructs we've invented:
(defun append-lists (list &rest more-lists)
(collecting
(do-list (e list) (collect e))
(do-list (l more-lists)
(do-list (e l)
(collect e)))))
If that's not cool then nothing is.
In fact we can get even more carried away. My original answer above used labels to do iteration As Rainer has pointed out, this is not safe in CL since CL does not mandate TCO. I don't particularly care about that (I am happy to use only CL implementations which mandate TCO), but I do care about the problem that using labels this way is hard to read. Well, you can, of course, hide this in a macro:
(defmacro looping ((&rest bindings) &body forms)
;; A sort-of special-purpose named-let.
(multiple-value-bind (vars inits)
(loop for b in bindings
for var = (typecase b
(symbol b)
(cons (car b))
(t (error "~A is hopeless" b)))
for init = (etypecase b
(symbol nil)
(cons (unless (null (cddr b))
(error "malformed binding ~A" b))
(second b))
(t
(error "~A is hopeless" b)))
collect var into vars
collect init into inits
finally (return (values vars inits)))
`(labels ((next ,vars
,#forms))
(next ,#inits))))
And now:
(defun append-lists (list &rest more-lists)
(collecting
(looping ((tail list) (more more-lists))
(if (null tail)
(unless (null more)
(next (first more) (rest more)))
(progn
(collect (first tail))
(next (rest tail) more))))))
And, well, I just think it is astonishing that I get to use a programming language where you can do things like this.
Note that both collecting and looping are intentionally 'unhygenic': they introduce a binding (for collect and next respectively) which is visible to code in their bodies and which would shadow any other function definition of that name. That's fine, in fact, since that's their purpose.
This kind of iteration-as-recursion is certainly cool to think about, and as I've said I think it really helps you to think about how the language can work, which is my purpose here. Whether it leads to better code is a completely different question. Indeed there is a famous quote by Guy Steele from one of the 'lambda the ultimate ...' papers:
procedure calls may be usefully thought of as GOTO statements which also pass parameters
And that's a lovely quote, except that it cuts both ways: procedure calls, in a language which optimizes tail calls, are pretty much GOTO, and you can do almost all the horrors with them that you can do with GOTO. But GOTO is a problem, right? Well, it turns out so are procedure calls, for most of the same reasons.
So, pragmatically, even in a language (or implementation) where procedure calls do have all these nice characteristics, you end up wanting constructs which can express iteration and not recursion rather than both. So, for instance, Racket which, being a Scheme-family language, does mandate tail-call elimination, has a whole bunch of macros with names like for which do iteration.
And in Common Lisp, which does not mandate tail-call elimination but which does have GOTO, you also need to build macros to do iteration, in the spirit of my do-list above. And, of course, a bunch of people then get hopelessly carried away and the end point is a macro called loop: loop didn't exist (in its current form) in the first version of CL, and it was common at that time to simply obtain a copy of it from somewhere, and make sure it got loaded into the image. In other words, loop, with all its vast complexity, is just a macro which you can define in a CL which does not have it already.
OK, sorry, this is too long.
(loop for list in (cons '(1 2 3)
'((4 5 6) (7 8 9)))
append list)

Why is (type-of list) equal to CONS?

I am playing around with Common Lisp and just realized that
(type-of (cons 1 2)) is CONS
and
(type-of (list 1 2)) is also CONS
However the two are clearly not the same because all "proper" lists, must be conses with second element being a list.
That said, when there are only two elements, the second element is 2, and first element is 1, neither is a list, but the construct is also still called a cons.
This gets even more confusing since
(print (list (cons 1 2) 3)) ; this is a ((1 . 2) 3), an improper list, but still cons
(print (cons 1 (list 2 3))) ; this is a (1 2 3), a proper list, but still cons
(cons 1 (cons 2 3)) ; is not a proper list, but is a (1 2 . 3), but still cons...
All are cons, but why isn't (list 1 2) a list? It can't be a cons because cons and list must be different types in order to be told apart in the algorithm for determining whether or not it is a proper list (and in turn, (equal (list 1 2) (cons 1 2)) should be true; without this discrimination, there should be no difference between a cons and a list, there would just be a cons.
Can somebody please help me understand why it says that (type-of (list 1 2)) is cons, even though it is clearly a list (otherwise it would be an improper list to my understanding).
Proper and improper lists are not defined at the type level. This would require recursive type definitions which is only possible to do with Lisp with a satisfies type, and in that case type-of would still not return a type-specifier as complex:
b. the type returned does not involve and, eql,
member, not, or, satisfies or values.
The list type could be defined as (or cons null):
The types cons and null form an exhaustive partition of the type list.
That means that nil is a list, and any cons cell is a list. See also the definition of listp.
In other words:
(typep '(a b c) 'cons)
=> T
But also:
(typep '(a b c) 'list)
=> T
And of course this is true for any supertype:
(typep '(a b c) 'sequence)
=> T
(typep '(a b c) 't)
=> T
The type-of function returns the most basic type, i.e. cons, which can be thought of as the type for which no other subtype satisfy typep (but read the specification which gives the actual definition).
Remarks
Just to clarify:
(cons 1 2)
... is a list, but it cannot be passed to functions that expect proper lists like map, etc. This is checked at runtime and generally, there is no confusion because the cases where one use improper lists are actually quite rare (when you treat cons cells as trees, for example). Likewise, circular lists require special treatment.
In order to check if a list is proper or not, you only need to check whether the last cons has a nil or not as its cdr.
Also, I saw that you wrote:
((1 . 2) 3) ; [...] an improper list
What you have here is a proper-list of two elements where the first one is an improper list, a.k.a. a dotted-list.
#coredump's answer is the correct one, but it's perhaps useful to see pragmatic reasons why it's correct.
Firstly, it's quite desirable that typechecks are quick. So if I say (typep x 'list), I'd like it not to have to go away for a long time to do the check.
Well, consider what a proper list checker has to look like. Something like this, perhaps:
(defun proper-list-p (x)
(typecase x
(null t)
(cons (proper-list-p (rest x)))
(t nil)))
For any good CL compiler this is a loop (and it can obviously be rewritten as an explicit loop if you might need to deal with rudimentary compilers). But it's a loop which is as long as the list you are checking, and this fails the 'typechecks should be quick' test.
In fact it fails a more serious test: typechecks should terminate. Consider a call like (proper-list-p #1=(1 . #1#)). Oops. So we need something like this, perhaps:
(defun proper-list-p (x)
(labels ((plp (thing seen)
(typecase thing
(null (values t nil))
(cons
(if (member thing seen)
(values nil t) ;or t t?
(plp (rest thing)
(cons thing seen))))
(t (values nil nil)))))
(plp x '())))
Well, this will terminate (and tell you whether the list is circular):
> (proper-list-p '#1=(1 . #1#))
nil
t
(This version considers circular lists not to be proper: I think the other decision is less useful but perhaps equally justified in some theoretical sense.)
But this is now quadratic in the length of the list. This can be made better by using a hashtable in the obvious way, but then the implementation is ludicrously consy for small lists (hashtables are big).
Another reason is to consider the difference between representational type and intentional type: the representational type of something tells you how it is implemented, while the intentional type tells you what it logically is. And it's easy to see that, in a lisp with mutable data structures, it is absurdly difficult for the representational type of a (non-null) list to be different than that of a cons. Here's an example of why:
(defun make-list/last (length init)
;; return a list of length LENGTH, with each element being INIT,
;; and its last cons.
(labels ((mlt (n list last)
(cond ((zerop n)
(values list last))
((null last)
(let ((c (cons init nil)))
(mlt (- n 1) c c)))
(t (mlt (- n 1) (cons init list) last)))))
(mlt length '() '())))
(multiple-value-bind (list last) (make-list/last 10 3)
(values
(proper-list-p list)
(progn
(setf (cdr last) t)
(proper-list-p list))
(progn
(setf (cdr (cdr list)) '(2 3))
(proper-list-p list))))
So the result of the last form is t nil t: list is initially a proper list, then it isn't because I fiddled with its final cons, then it is again because I fiddled with some intermediate cons (and now, whatever I do to the cons bound to last will make no difference to that bound to list).
It would be insanely difficult to keep track, in terms of representational type, of whether something is a proper list or not, if you want to use anything that is remotely like linked lists. And type-of, for instance, tells you the representational type of something, which can only be cons (or null for empty lists).

Translating this to Common Lisp

I've been reading an article by Olin Shivers titled Stylish Lisp programming techniques and found the second example there (labeled "Technique n-1") a bit puzzling. It describes a self-modifying macro that looks like this:
(defun gen-counter macro (x)
(let ((ans (cadr x)))
(rplaca (cdr x)
(+ 1 ans))
ans))
It's supposed to get its calling form as argument x (i.e. (gen-counter <some-number>)). The purpose of this is to be able to do something like this:
> ;this prints out the numbers from 0 to 9.
(do ((n 0 (gen-counter 1)))
((= n 10) t)
(princ n))
0.1.2.3.4.5.6.7.8.9.T
>
The problem is that this syntax with the macro symbol after the function name is not valid in Common Lisp. I've been unsuccessfully trying to obtain similar behavior in Common Lisp. Can someone please provide a working example of analogous macro in CL?
Why the code works
First, it's useful to consider the first example in the paper:
> (defun element-generator ()
(let ((state '(() . (list of elements to be generated)))) ;() sentinel.
(let ((ans (cadr state))) ;pick off the first element
(rplacd state (cddr state)) ;smash the cons
ans)))
ELEMENT-GENERATOR
> (element-generator)
LIST
> (element-generator)
OF
> (element-generator)
This works because there's one literal list
(() . (list of elements to be generated)
and it's being modified. Note that this is actually undefined behavior in Common Lisp, but you'll get the same behavior in some Common Lisp implementations. See Unexpected persistence of data and some of the other linked questions for a discussion of what's happening here.
Approximating it in Common Lisp
Now, the paper and code you're citing actually has some useful comments about what this code is doing:
(defun gen-counter macro (x) ;X is the entire form (GEN-COUNTER n)
(let ((ans (cadr x))) ;pick the ans out of (gen-counter ans)
(rplaca (cdr x) ;increment the (gen-counter ans) form
(+ 1 ans))
ans)) ;return the answer
The way that this is working is not quite like an &rest argument, as in Rainer Joswig's answer, but actually a &whole argument, where the the entire form can be bound to a variable. This is using the source of the program as the literal value that gets destructively modified! Now, in the paper, this is used in this example:
> ;this prints out the numbers from 0 to 9.
(do ((n 0 (gen-counter 1)))
((= n 10) t)
(princ n))
0.1.2.3.4.5.6.7.8.9.T
However, in Common Lisp, we'd expect the macro to be expanded just once. That is, we expect (gen-counter 1) to be replaced by some piece of code. We can still generate a piece of code like this, though:
(defmacro make-counter (&whole form initial-value)
(declare (ignore initial-value))
(let ((text (gensym (string 'text-))))
`(let ((,text ',form))
(incf (second ,text)))))
CL-USER> (macroexpand '(make-counter 3))
(LET ((#:TEXT-1002 '(MAKE-COUNTER 3)))
(INCF (SECOND #:TEXT-1002)))
Then we can recreate the example with do
CL-USER> (do ((n 0 (make-counter 1)))
((= n 10) t)
(princ n))
023456789
Of course, this is undefined behavior, since it's modifying literal data. It won't work in all Lisps (the run above is from CCL; it didn't work in SBCL).
But don't miss the point
The whole article is sort of interesting, but recognize that it's sort of a joke, too. It's pointing out that you can do some funny things in an evaluator that doesn't compile code. It's mostly satire that's pointing out the inconsistencies of Lisp systems that have different behaviors under evaluation and compilation. Note the last paragraph:
Some short-sighted individuals will point out that these programming
techniques, while certainly laudable for their increased clarity and
efficiency, would fail on compiled code. Sadly, this is true. At least
two of the above techniques will send most compilers into an infinite
loop. But it is already known that most lisp compilers do not
implement full lisp semantics -- dynamic scoping, for instance. This
is but another case of the compiler failing to preserve semantic
correctness. It remains the task of the compiler implementor to
adjust his system to correctly implement the source language, rather
than the user to resort to ugly, dangerous, non-portable, non-robust
``hacks'' in order to program around a buggy compiler.
I hope this provides some insight into the nature of clean, elegant
Lisp programming techniques.
—Olin Shivers
Common Lisp:
(defmacro gen-counter (&rest x)
(let ((ans (car x)))
(rplaca x (+ 1 ans))
ans))
But above only works in the Interpreter, not with a compiler.
With compiled code, the macro call is gone - it is expanded away - and there is nothing to modify.
Note to unsuspecting readers: you might want to read the paper by Olin Shivers very careful and try to find out what he actually means...

SICP: Can or be defined in lisp as a syntactic transformation without gensym?

I am trying to solve the last part of question 4.4 of the Structure and Interpretation of computer programming; the task is to implement or as a syntactic transformation. Only elementary syntactic forms are defined; quote, if, begin, cond, define, apply and lambda.
(or a b ... c) is equal to the first true value or false if no value is true.
The way I want to approach it is to transform for example (or a b c) into
(if a a (if b b (if c c false)))
the problem with this is that a, b, and c would be evaluated twice, which could give incorrect results if any of them had side-effects. So I want something like a let
(let ((syma a))
(if syma syma (let ((symb b))
(if symb symb (let ((symc c))
(if (symc symc false)) )) )) )
and this in turn could be implemented via lambda as in Exercise 4.6. The problem now is determining symbols syma, symb and symc; if for example the expression b contains a reference to the variable syma, then the let will destroy the binding. Thus we must have that syma is a symbol not in b or c.
Now we hit a snag; the only way I can see out of this hole is to have symbols that cannot have been in any expression passed to eval. (This includes symbols that might have been passed in by other syntactic transformations).
However because I don't have direct access to the environment at the expression I'm not sure if there is any reasonable way of producing such symbols; I think Common Lisp has the function gensym for this purpose (which would mean sticking state in the metacircular interpreter, endangering any concurrent use).
Am I missing something? Is there a way to implement or without using gensym? I know that Scheme has it's own hygenic macro system, but I haven't grokked how it works and I'm not sure whether it's got a gensym underneath.
I think what you might want to do here is to transform to a syntactic expansion where the evaluation of the various forms aren't nested. You could do this, e.g., by wrapping each form as a lambda function and then the approach that you're using is fine. E.g., you can do turn something like
(or a b c)
into
(let ((l1 (lambda () a))
(l2 (lambda () b))
(l3 (lambda () c)))
(let ((v1 (l1)))
(if v1 v1
(let ((v2 (l2)))
(if v2 v2
(let ((v3 (l3)))
(if v3 v3
false)))))))
(Actually, the evaluation of the lambda function calls are still nested in the ifs and lets, but the definition of the lambda functions are in a location such that calling them in the nested ifs and lets doesn't cause any difficulty with captured bindings.) This doesn't address the issue of how you get the variables l1–l3 and v1–v3, but that doesn't matter so much, none of them are in scope for the bodies of the lambda functions, so you don't need to worry about whether they appear in the body or not. In fact, you can use the same variable for all the results:
(let ((l1 (lambda () a))
(l2 (lambda () b))
(l3 (lambda () c)))
(let ((v (l1)))
(if v v
(let ((v (l2)))
(if v v
(let ((v (l3)))
(if v v
false)))))))
At this point, you're really just doing loop unrolling of a more general form like:
(define (functional-or . functions)
(if (null? functions)
false
(let ((v ((first functions))))
(if v v
(functional-or (rest functions))))))
and the expansion of (or a b c) is simply
(functional-or (lambda () a) (lambda () b) (lambda () c))
This approach is also used in an answer to Why (apply and '(1 2 3)) doesn't work while (and 1 2 3) works in R5RS?. And none of this required any GENSYMing!
In SICP you are given two ways of implementing or. One that handles them as special forms which is trivial and one as derived expressions. I'm unsure if they actually thought you would see this as a problem, but you can do it by implementing gensym or altering variable? and how you make derived variables like this:
;; a unique tag to identify special variables
(define id (vector 'id))
;; a way to make such variable
(define (make-var x)
(list id x))
;; redefine variable? to handle macro-variables
(define (variable? exp)
(or (symbol? exp)
(tagged-list? exp id)))
;; makes combinations so that you don't evaluate
;; every part twice in case of side effects (set!)
(define (or->combination terms)
(if (null? terms)
'false
(let ((tmp (make-var 'tmp)))
(list (make-lambda (list tmp)
(list (make-if tmp
tmp
(or->combination (cdr terms)))))
(car terms)))))
;; My original version
;; This might not be good since it uses backquotes not introduced
;; until chapter 5 and uses features from exercise 4.6
;; Though, might be easier to read for some so I'll leave it.
(define (or->combination terms)
(if (null? terms)
'false
(let ((tmp (make-var 'tmp)))
`(let ((,tmp ,(car terms)))
(if ,tmp
,tmp
,(or->combination (cdr terms)))))))
How it works is that make-var creates a new list every time it is called, even with the same argument. Since it has id as it's first element variable? will identify it as a variable. Since it's a list it will only match in variable lookup with eq? if it is the same list, so several nested or->combination tmp-vars will all be seen as different by lookup-variable-value since (eq? (list) (list)) => #f and special variables being lists they will never shadow any symbol in code.
This is influenced by eiod, by Al Petrofsky, which implements syntax-rules in a similar manner. Unless you look at others implementations as spoilers you should give it a read.

How could I implement the push macro?

Can someone help me understand how push can be implemented as a macro? The naive version below evaluates the place form twice, and does so before evaluating the element form:
(defmacro my-push (element place)
`(setf ,place (cons ,element ,place)))
But if I try to fix this as below then I'm setf-ing the wrong place:
(defmacro my-push (element place)
(let ((el-sym (gensym))
(place-sym (gensym)))
`(let ((,el-sym ,element)
(,place-sym ,place))
(setf ,place-sym (cons ,el-sym ,place-sym)))))
CL-USER> (defparameter *list* '(0 1 2 3))
*LIST*
CL-USER> (my-push 'hi *list*)
(HI 0 1 2 3)
CL-USER> *list*
(0 1 2 3)
How can I setf the correct place without evaluating twice?
Doing this right seems to be a little more complicated. For instance, the code for push in SBCL 1.0.58 is:
(defmacro-mundanely push (obj place &environment env)
#!+sb-doc
"Takes an object and a location holding a list. Conses the object onto
the list, returning the modified list. OBJ is evaluated before PLACE."
(multiple-value-bind (dummies vals newval setter getter)
(sb!xc:get-setf-expansion place env)
(let ((g (gensym)))
`(let* ((,g ,obj)
,#(mapcar #'list dummies vals)
(,(car newval) (cons ,g ,getter))
,#(cdr newval))
,setter))))
So reading the documentation on get-setf-expansion seems to be useful.
For the record, the generated code looks quite nice:
Pushing into a symbol:
(push 1 symbol)
expands into
(LET* ((#:G906 1) (#:NEW905 (CONS #:G906 SYMBOL)))
(SETQ SYMBOL #:NEW905))
Pushing into a SETF-able function (assuming symbol points to a list of lists):
(push 1 (first symbol))
expands into
(LET* ((#:G909 1)
(#:SYMBOL908 SYMBOL)
(#:NEW907 (CONS #:G909 (FIRST #:SYMBOL908))))
(SB-KERNEL:%RPLACA #:SYMBOL908 #:NEW907))
So unless you take some time to study setf, setf expansions and company, this looks rather arcane (it may still look so even after studying them). The 'Generalized Variables' chapter in OnLisp may be useful too.
Hint: if you compile your own SBCL (not that hard), pass the --fancy argument to make.sh. This way you'll be able to quickly see the definitions of functions/macros inside SBCL (for instance, with M-. inside Emacs+SLIME). Obviously, don't delete those sources (you can run clean.sh after install.sh, to save 90% of the space).
Taking a look at how the existing one (in SBCL, at least) does things, I see:
* (macroexpand-1 '(push 1 *foo*))
(LET* ((#:G823 1) (#:NEW822 (CONS #:G823 *FOO*)))
(SETQ *FOO* #:NEW822))
T
So, I imagine, mixing in a combination of your version and what this generates, one might do:
(defmacro my-push (element place)
(let ((el-sym (gensym))
(new-sym (gensym "NEW")))
`(let* ((,el-sym ,element)
(,new-sym (cons ,el-sym ,place)))
(setq ,place ,new-sym)))))
A few observations:
This seems to work with either setq or setf. Depending on what problem you're actually trying to solve (I presume re-writing push isn't the actual end goal), you may favor one or the other.
Note that place does still get evaluated twice... though it does at least do so only after evaluating element. Is the double evaluation something you actually need to avoid? (Given that the built-in push doesn't, I'm left wondering if/how you'd be able to... though I'm writing this up before spending terribly much time thinking about it.) Given that it's something that needs to evaluate as a "place", perhaps this is normal?
Using let* instead of let allows us to use ,el-sym in the setting of ,new-sym. This moves where the cons happens, such that it's evaluated in the first evaluation of ,place, and after the evaluation of ,element. Perhaps this gets you what you need, with respect to evaluation ordering?
I think the biggest problem with your second version is that your setf really does need to operate on the symbol passed in, not on a gensym symbol.
Hopefully this helps... (I'm still somewhat new to all this myself, so I'm making some guesses here.)