Does any Lisp allow mutually recursive macros? - macros

In Common Lisp, a macro definition must have been seen before the first use. This allows a macro to refer to itself, but does not allow two macros to refer to each other. The restriction is slightly awkward, but understandable; it makes the macro system quite a bit easier to implement, and to understand how the implementation works.
Is there any Lisp family language in which two macros can refer to each other?

What is a macro?
A macro is just a function which is called on code rather than data.
E.g., when you write
(defmacro report (x)
(let ((var (gensym "REPORT-")))
`(let ((,var ,x))
(format t "~&~S=<~S>~%" ',x ,var)
,var)))
you are actually defining a function which looks something like
(defun macro-report (system::<macro-form> system::<env-arg>)
(declare (cons system::<macro-form>))
(declare (ignore system::<env-arg>))
(if (not (system::list-length-in-bounds-p system::<macro-form> 2 2 nil))
(system::macro-call-error system::<macro-form>)
(let* ((x (cadr system::<macro-form>)))
(block report
(let ((var (gensym "REPORT-")))
`(let ((,var ,x)) (format t "~&~s=<~s>~%" ',x ,var) ,var))))))
I.e., when you write, say,
(report (! 12))
lisp actually passes the form (! 12) as the 1st argument to macro-report which transforms it into:
(LET ((#:REPORT-2836 (! 12)))
(FORMAT T "~&~S=<~S>~%" '(! 12) #:REPORT-2836)
#:REPORT-2836)
and only then evaluates it to print (! 12)=<479001600> and return 479001600.
Recursion in macros
There is a difference whether a macro calls itself in implementation or in expansion.
E.g., a possible implementation of the macro and is:
(defmacro my-and (&rest args)
(cond ((null args) T)
((null (cdr args)) (car args))
(t
`(if ,(car args)
(my-and ,#(cdr args))
nil))))
Note that it may expand into itself:
(macroexpand '(my-and x y z))
==> (IF X (MY-AND Y Z) NIL) ; T
As you can see, the macroexpansion contains the macro being defined.
This is not a problem, e.g., (my-and 1 2 3) correctly evaluates to 3.
However, if we try to implement a macro using itself, e.g.,
(defmacro bad-macro (code)
(1+ (bad-macro code)))
you will get an error (a stack overflow or undefined function or ...) when you try to use it, depending on the implementation.

Here's why mutually recursive macros can't work in any useful way.
Consider what a system which wants to evaluate (or compile) Lisp code for a slightly simpler Lisp than CL (so I'm avoiding some of the subtleties that happen in CL), such as the definition of a function, needs to do. It has a very small number of things it knows how to do:
it knows how to call functions;
it knows how to evaluate a few sorts of literal objects;
it has some special rules for a few sorts of forms – what CL calls 'special forms', which (again in CL-speak) are forms whose car is a special operator;
finally it knows how to look to see whether forms correspond to functions which it can call to transform the code it is trying to evaluate or compile – some of these functions are predefined but additional ones can be defined.
So the way the evaluator works is by walking over the thing it needs to evaluate looking for these source-code-transforming things, aka macros (the last case), calling their functions and then recursing on the results until it ends up with code which has none left. What's left should consist only of instances of the first three cases, which it then knows how to deal with.
So now think about what the evaluator has to do if it is evaluating the definition of the function corresponding to a macro, called a. In Cl-speak it is evaluating or compiling a's macro function (which you can get at via (macro-function 'a) in CL). Let's assume that at some point there is a form (b ...) in this code, and that b is known also to correspond to a macro.
So at some point it comes to (b ...), and it knows that in order to do this it needs to call b's macro function. It binds suitable arguments and now it needs to evaluate the definition of the body of that function ...
... and when it does this it comes across an expression like (a ...). What should it do? It needs to call a's macro function, but it can't, because it doesn't yet know what it is, because it's in the middle of working that out: it could start trying to work it out again, but this is just a loop: it's not going to get anywhere where it hasn't already been.
Well, there's a horrible trick you could do to avoid this. The infinite regress above happens because the evaluator is trying to expand all of the macros ahead of time, and so there's no base to the recursion. But let's assume that the definition of a's macro function has code which looks like this:
(if <something>
(b ...)
<something not involving b>)
Rather than doing the expand-all-the-macros-first trick, what you could do is to expand only the macros you need, just before you need their results. And if <something> turned out always to be false, then you never need to expand (b ...), so you never get into this vicious loop: the recursion bottoms out.
But this means you must always expand macros on demand: you can never do it ahead of time, and because macros expand to source code you can never compile. In other words a strategy like this is not compatible with compilation. It also means that if <something> ever turns out to be true then you'll end up in the infinite regress again.
Note that this is completely different to macros which expand to code which involves the same macro, or another macro which expands into code which uses it. Here's a definition of a macro called et which does that (it doesn't need to do this of course, this is just to see it happen):
(defmacro et (&rest forms)
(if (null forms)
't
`(et1 ,(first forms) ,(rest forms))))
(defmacro et1 (form more)
(let ((rn (make-symbol "R")))
`(let ((,rn ,form))
(if ,rn
,rn
(et ,#more)))))
Now (et a b c) expands to (et1 a (b c)) which expands to (let ((#:r a)) (if #:r #:r (et b c))) (where all the uninterned things are the same thing) and so on until you get
(let ((#:r a))
(if #:r
#:r
(let ((#:r b))
(if #:r
#:r
(let ((#:r c))
(if #:r
#:r
t))))))
Where now not all the uninterned symbols are the same
And with a plausible macro for let (let is in fact a special operator in CL) this can get turned even further into
((lambda (#:r)
(if #:r
#:r
((lambda (#:r)
(if #:r
#:r
((lambda (#:r)
(if #:r
#:r
t))
c)))
b)))
a)
And this is an example of 'things the system knows how to deal with': all that's left here is variables, lambda, a primitive conditional and function calls.
One of the nice things about CL is that, although there is a lot of useful sugar, you can still poke around in the guts of things if you like. And in particular, you still see that macros are just functions that transform source code. The following does exactly what the defmacro versions do (not quite: defmacro does the necessary cleverness to make sure the macros are available early enough: I'd need to use eval-when to do that with the below):
(setf (macro-function 'et)
(lambda (expression environment)
(declare (ignore environment))
(let ((forms (rest expression)))
(if (null forms)
't
`(et1 ,(first forms) ,(rest forms))))))
(setf (macro-function 'et1)
(lambda (expression environment)
(declare (ignore environment))
(destructuring-bind (_ form more) expression
(declare (ignore _))
(let ((rn (make-symbol "R")))
`(let ((,rn ,form))
(if ,rn
,rn
(et ,#more)))))))

There have been historic Lisp systems that allow this, at least in interpreted code.
We can allow a macro to use itself for its own definition, or two or more macros to mutually use each other, if we follow an extremely late expansion strategy.
That is to say, our macro system expands a macro call just before it is evaluated (and does that each time that same expression is evaluated).
(Such a macro expansion strategy is good for interactive development with macros. If you fix a buggy macro, then all code depending on it automatically benefits from the change, without having to be re-processed in any way.)
Under such a macro system, suppose we have a conditional like this:
(if (condition)
(macro1 ...)
(macro2 ...))
When (condition) is evaluated, then if it yields true, (macro1 ...) is evaluated, otherwise (macro2 ...). But evaluation also means expansion. Thus only one of these two macros is expanded.
This is the key to why mutual references among macros can work: we are able rely on the conditional logic to give us not only conditional evaluation, but conditional expansion also, which then allows the recursion to have ways of terminating.
For example, suppose macro A's body of code is defined with the help of macro B, and vice versa. And when a particular invocation of A is executed, it happens to hit the particular case that requires B, and so that B call is expanded by invocation of macro B. B also hits the code case that depends on A, and so it recurses into A to obtain the needed expansion. But, this time, A is called in a way that avoids requiring, again, an expansion of B; it avoids evaluating any sub-expression containing the B macro. Thus, it calculates the expansion, and returns it to B, which then calculates its expansion returns to the outermost A. A finally expands and the recursion terminates; all is well.
What blocks macros from using each other is the unconditional expansion strategy: the strategy of fully expanding entire top-level forms after they are read, so that the definitions of functions and macros contain only expanded code. In that situation there is no possibility of conditional expansion that would allow for the recursion to terminate.
Note, by the way, that a macro system which expands late doesn't recursively expand macros in a macro expansion. Suppose (mac1 x y) expands into (if x (mac2 y) (mac3 y)). Well, that's all the expansion that is done for now: the if that pops out is not a macro, so expansion stops, and evaluation proceeds. If x yields true, then mac2 is expanded, and mac3 is not.

Related

How does macroexpansion actually work in Lisp?

I would like a more detailed explanation of how macro expansion works, at least in Emacs Lisp but an overview of other Lisps would be appreciated. The way I usually see it explained is that the arguments of the macro are passed unevaluated to the body, which is then executed and returns a new LISP form. However, if I do
(defun check-one (arg)
(eq arg 1))
(defmacro check-foo (checker foo)
(if (checker 1)
`(,foo "yes")
`(,foo "no")))
I would expect
(check-foo check-one print)
to first expand to
(if (check-one 1)
`(print "yes")
`(print "no))
and then finally to
(print "yes")
but instead I get a "checker" function is void error. On the other hand, if I had defined
(defmacro check-foo (checker foo)
(if (funcall checker 1)
`(,foo "yes")
`(,foo "no")))
then I would have the expected behavior. So the expressions do get replaced in the body unevaluated, but for some reason functions do not work? What is the step-by-step procedure the interpreter follows when macroexpanding? Is there a good text-book that explains this rigorously?
Macros are functions ...
A good way to think about macros is that they are simply functions, like any other function.
... which operate on source code
But they are functions whose arguments are source code, and whose value is also source code.
Looking at macro functions
Macros being functions is not quite explicit in elisp: some of the lower-level functionality is, I think, not exposed. But in Common Lisp this is quite literally how macros are implemented: a macro has an associated function, and this function gets called to expand the macro, with its value being the new source code. For instance, if you are so minded you could write macros in Common Lisp like this.
(defun expand-fn (form environment)
;; not talking about environment
(declare (ignore environment))
(let ((name (second form))
(arglist (third form))
(body (cdddr form)))
`(function (lambda ,arglist
(block ,name
,#body)))))
(setf (macro-function 'fn) #'expand-fn)
And now fn is a macro which will construct a function which 'knows its name', so you could write
(fn foo (x) ... (return-from foo x) ...)
which turns into
(function (lambda (x) (block foo ... (return-from foo x))))
In Common Lisp, defmacro is then itself a macro which arranges for a suitable macro function to be installed and also deals with making the macro available at compile time &c.
In elisp, it looks as if this lower layer is not specified by the language, but I think it's safe to assume that things work the same way.
So then the job of a macro is to take a bunch of source code and compute from it another bunch of source code which is the expansion of the macro. And of course the really neat trick is that, because source code (both arguments and values) is represented as s-expressions, Lisp is a superb language for manipulating s-expressions, you can write macros in Lisp itself.
Macroexpansion
There are a fair number of fiddly corner cases here such as local macros and so on. But here is, pretty much, how this works.
Start with some form <f>:
If <f> is (<a> ...) where <a> is a symbol, check for a macro function for <a>. If it has one, call it on the whole form, and call the value it returns <f'>: now simply recurse on <f'>.
If <f> is (<a> ...) where <a> is a symbol which names a special operator (something like if) then recurse on the subforms of the special operator which its rules say must be macroexpanded. As an example, in a form like (if <x> <y> <z>) all of <x>, <y>, & <z> need to be macroexpanded, while in (setq <a> <b>), only <b> would be subject to macroexpansion, and so on: these rules are hard-wired, which is why special operators are special.
If <f> is (<a> ...) where <a> is a symbol which is neither of the above cases, then it's a function call, and the forms in the body of the form are macroexpanded, and that's it.
If <f> is ((lambda (...) ...) ...) then the forms in the body of the lambda (but not its arguments!) are macroexpanded and then the case is the same as the last one.
Finally <f> might not be a compound form: nothing to do here.
I think that's all the cases. This is not a complete description of the process because there are complications like local macros and so on. But it's enough I think.
Order of macroexpansion
Note that macroexpansion happens 'outside in': a form like (a ...) is expanded until you get something which isn't a macro form, and only then is the body, perhaps, expanded. That's because, until the macro is completely expanded, you have no idea which, if any, of the subforms are even eligible for macroexpansion.
Your code
My guess is that what you want to happen is that (check-foo bog foo) should turn into (if (bog 1) (foo yes) (foo no)). So the way to get this is that this form is what the macro function needs to return. We could write this using the CL low-level facilities:
(defun check-foo-expander (form environment)
;; form is like (check-foo pred-name function-name)
(declare (ignore environment)) ;still not talking about environment
`(if (,(second form) 1)
(,(third form) "yes")
(,(third form) "no")))
And we can check:
> (check-foo-expander '(check-foo bog foo) nil)
(if (bog 1) (foo "yes") (foo "no"))
And then install it as a macro:
> (setf (macro-function 'check-foo) #'check-foo-expander)
And now
> (check-foo evenp print)
"no"
"no"
> (check-foo oddp print)
"yes"
"yes"
But it's easier to write it using defmacro:
(defmacro check-foo (predicate function)
`(if (,predicate 1)
(,function "yes")
(,function "no")))
This is the same thing (more-or-less), but easier to read.

Lisp changes function to lambda expression when stored in function cell

In this post, I ask tangentially why when I declare in SBCL
(defun a (&rest x)
x)
and then check what the function cell holds
(describe 'a)
COMMON-LISP-USER::A
[symbol]
A names a compiled function:
Lambda-list: (&REST X)
Derived type: (FUNCTION * (VALUES LIST &OPTIONAL))
Source form:
(LAMBDA (&REST X) (BLOCK A X))
I see this particular breakdown of the original function. Could someone explain what this output means? I'm especially confused by the last line
Source form:
(LAMBDA (&REST X) (BLOCK A X))
This is mysterious because for some reason not clear to me Lisp has transformed the original function into a lambda expression. It would also be nice to know the details of how a function broken down like this is then called. This example is SBCL. In Elisp
(symbol-function 'a)
gives
(lambda (&rest x) x)
again, bizarre. As I said in the other post, this is easier to understand in Scheme -- but that created confusion in the answers. So once more I ask, Why has Lisp taken a normal function declaration and seemingly stored it as a lambda expression?
I'm still a bit unclear what you are confused about, but here is an attempt to explain it. I will stick to CL (and mostly to ANSI CL), because elisp has a lot of historical oddities which just make things hard to understand (there is an appendix on elisp). Pre-ANSI CL was also a lot less clear on various things.
I'll try to explain things by writing a macro which is a simple version of defun: I'll call this defun/simple, and an example of its use will be
(defun/simple foo (x)
(+ x x))
So what I need to do is to work out what the expansion of this macro should be, so that it does something broadly equivalent (but simpler than) defun.
The function namespace & fdefinition
First of all I assume you are comfortable with the idea that, in CL (and elisp) the namespace of functions is different than the namespace of variable bindings: both languages are lisp-2s. So in a form like (f x), f is looked up in the namespace of function bindings, while x is looked up in the namespace of variable bindings. This means that forms like
(let ((sin 0.0))
(sin sin))
are fine in CL or elisp, while in Scheme they would be an error, as 0.0 is not a function, because Scheme is a lisp-1.
So we need some way of accessing that namespace, and in CL the most general way of doing that is fdefinition: (fdefinition <function name>) gets the function definition of <function name>, where <function name> is something which names a function, which for our purposes will be a symbol.
fdefinition is what CL calls an accessor: this means that the setf macro knows what to do with it, so that we can mutate the function binding of a symbol by (setf (fdefinition ...) ...). (This is not true: what we can access and mutate with fdefinition is the top-level function binding of a symbol, we can't access or mutate lexical function bindings, and CL provides no way to do this, but this does not matter here.)
So this tells us what our macro expansion needs to look like: we want to set the (top-level) definition of the name to some function object. The expansion of the macro should be like this:
(defun/simple foo (x)
x)
should expand to something involving
(setf (fdefinition 'foo) <form which makes a function>)
So we can write this bit of the macro now:
(defmacro defun/simple (name arglist &body forms)
`(progn
(setf (fdefinition ',name)
,(make-function-form name arglist forms))
',name))
This is the complete definition of this macro. It uses progn in its expansion so that the result of expanding it is the name of the function being defined, which is the same as defun: the expansion does all its real work by side-effect.
But defun/simple relies on a helper function, called make-function-form, which I haven't defined yet, so you can't actually use it yet.
Function forms
So now we need to write make-function-form. This function is called at macroexpansion time: it's job is not to make a function: it's to return a bit of source code which will make a function, which I'm calling a 'function form'.
So, what do function forms look like in CL? Well, there's really only one such form in portable CL (this might be wrong, but I think it is true), which is a form constructed using the special operator function. So we're going to need to return some form which looks like (function ...). Well, what can ... be? There are two cases for function.
(function <name>) denotes the function named by <name> in the current lexical environment. So (function car) is the function we call when we say (car x).
(function (lambda ...)) denotes a function specified by (lambda ...): a lambda expression.
The second of these is the only (caveats as above) way we can construct a form which denotes a new function. So make-function-form is going to need to return this second variety of function form.
So we can write an initial version of make-function-form:
(defun make-function-form (name arglist forms)
(declare (ignore name))
`(function (lambda ,arglist ,#forms)))
And this is enough for defun/simple to work:
> (defun/simple plus/2 (a b)
(+ a b))
plus/2
> (plus/2 1 2)
3
But it's not quite right yet: one of the things that functions defined by defun can do is return from themselves: they know their own name and can use return-from to return from it:
> (defun silly (x)
(return-from silly 3)
(explode-the-world x))
silly
> (silly 'yes)
3
defun/simple can't do this, yet. To do this, make-function-form needs to insert a suitable block around the body of the function:
(defun make-function-form (name arglist forms)
`(function (lambda ,arglist
(block ,name
,#forms))))
And now:
> (defun/simple silly (x)
(return-from silly 3)
(explode-the-world x))
silly
> (silly 'yes)
3
And all is well.
This is the final definition of defun/simple and its auxiliary function.
Looking at the expansion of defun/simple
We can do this with macroexpand in the usual way:
> (macroexpand '(defun/simple foo (x) x))
(progn
(setf (fdefinition 'foo)
#'(lambda (x)
(block foo
x)))
'foo)
t
The only thing that's confusing here is that, because (function ...) is common in source code, there's syntactic sugar for it which is #'...: this is the same reason that quote has special syntax.
It's worth looking at the macroexpansion of real defun forms: they usually have a bunch of implementation-specific stuff in them, but you can find the same thing there. Here's an example from LW:
> (macroexpand '(defun foo (x) x))
(compiler-let ((dspec::*location* '(:inside (defun foo) :listener)))
(compiler::top-level-form-name (defun foo)
(dspec:install-defun 'foo
(dspec:location)
#'(lambda (x)
(declare (system::source-level
#<eq Hash Table{0} 42101FCD5B>))
(declare (lambda-name foo))
x))))
t
Well, there's a lot of extra stuff in here, and LW obviously has some trick around this (declare (lambda-name ...)) form which lets return-from work without an explicit block. But you can see that basically the same thing is going on.
Conclusion: how you make functions
In conclusion: a macro like defun, or any other function-defining form, needs to expand to a form which, when evaluated, will construct a function. CL offers exactly one such form: (function (lambda ...)): that's how you make functions in CL. So something like defun necessarily has to expand to something like this. (To be precise: any portable version of defun: implementations are somewhat free to do implementation-magic & may do so. However they are not free to add a new special operator.)
What you are seeing when you call describe is that, after SBCL has compiled your function, it's remembered what the source form was, and the source form was exactly the one you would have got from the defun/simple macro given here.
Notes
lambda as a macro
In ANSI CL, lambda is defined as a macro whose expansion is a suitable (function (lambda ...)) form:
> (macroexpand '(lambda (x) x))
#'(lambda (x) x)
t
> (car (macroexpand '(lambda (x) x)))
function
This means that you don't have to write (function (lambda ...)) yourself: you can rely on the macro definition of lambda doing it for you. Historically, lambda wasn't always a macro in CL: I can't find my copy of CLtL1, but I'm pretty certain it was not defined as one there. I'm reasonably sure that the macro definition of lambda arrived so that it was possible to write ISLisp-compatible programs on top of CL. It has to be in the language because lambda is in the CL package and so users can't portably define macros for it (although quite often they did define such a macro, or at least I did). I have not relied on this macro definition above.
defun/simple does not purport to be a proper clone of defun: its only purpose is to show how such a macro can be written. In particular it doesn't deal with declarations properly, I think: they need to be lifted out of the block & are not.
Elisp
Elisp is much more horrible than CL. In particular, in CL there is a well-defined function type, which is disjoint from lists:
> (typep '(lambda ()) 'function)
nil
> (typep '(lambda ()) 'list)
t
> (typep (function (lambda ())) 'function)
t
> (typep (function (lambda ())) 'list)
nil
(Note in particular that (function (lambda ())) is a function, not a list: function is doing its job of making a function.)
In elisp, however, an interpreted function is just a list whose car is lambda (caveat: if lexical binding is on this is not the case: it's then a list whose car is closure). So in elisp (without lexical binding):
ELISP> (function (lambda (x) x))
(lambda (x)
x)
And
ELISP> (defun foo (x) x)
foo
ELISP> (symbol-function 'foo)
(lambda (x)
x)
The elisp intepreter then just interprets this list, in just the way you could yourself. function in elisp is almost the same thing as quote.
But function isn't quite the same as quote in elisp: the byte-compiler knows that, when it comes across a form like (function (lambda ...)) that this is a function form, and it should byte-compile the body. So, we can look at the expansion of defun in elisp:
ELISP> (macroexpand '(defun foo (x) x))
(defalias 'foo
#'(lambda (x)
x))
(It turns out that defalias is the primitive thing now.)
But if I put this definition in a file, which I byte compile and load, then:
ELISP> (symbol-function 'foo)
#[(x)
"\207"
[x]
1]
And you can explore this a bit further: if you put this in a file:
(fset 'foo '(lambda (x) x))
and then byte compile and load that, then
ELISP> (symbol-function 'foo)
(lambda (x)
x)
So the byte compiler didn't do anything with foo because it didn't get the hint that it should. But foo is still a fine function:
ELISP> (foo 1)
1 (#o1, #x1, ?\C-a)
It just isn't compiled. This is also why, if writing elisp code with anonymous functions in it, you should use function (or equivalently #'). (And finally, of course, (function ...) does the right thing if lexical scoping is on.)
Other ways of making functions in CL
Finally, I've said above that function & specifically (function (lambda ...)) is the only primitive way to make new functions in CL. I'm not completely sure that's true, especially given CLOS (almost any CLOS will have some kind of class instances of which are functions but which can be subclassed). But it does not matter: it is a way and that's sufficient.
DEFUN is a defining macro. Macros transform code.
In Common Lisp:
(defun foo (a)
(+ a 42))
Above is a definition form, but it will be transformed by DEFUN into some other code.
The effect is similar to
(setf (symbol-function 'foo)
(lambda (a)
(block foo
(+ a 42))))
Above sets the function cell of the symbol FOO to a function. The BLOCK construct is added by SBCL, since in Common Lisp named functions defined by DEFUN create a BLOCK with the same name as the function name. This block name can then be used by RETURN-FROM to enable a non-local return from a specific function.
Additionally DEFUN does implementation specific things. Implementations also record development information: the source code, the location of the definition, etc.
Scheme has DEFINE:
(define (foo a)
(+ a 10))
This will set FOO to a function object.

Custom self-quoting forms: Useful?

Lisps often declare, that certain types are self-evaluating. E.g. in emacs-lisp numbers, "strings", :keyword-symbols and some more evaluate to themselves.
Or, more specifically: Evaluating the form and evaluating the result again gives the same result.
It is also possible to create custom self-evaluating forms, e.g.
(defun my-list (&rest args)
(cons 'my-list (mapcar (lambda (e) (list 'quote e)) args)))
(my-list (+ 1 1) 'hello)
=> (my-list '2 'hello)
(eval (my-list (+ 1 1) 'hello))
=> (my-list '2 'hello)
Are there any practical uses for defining such forms or is this more of an esoteric concept?
I thought of creating "custom-types" as self-evaluating forms, where the evaluation may for instance perform type-checks on the arguments. When trying to use such types in my code, I usually found it inconvenient compared to simply working e.g. with plists though.
*edit* I checked again, and it seems I mixed up "self-evaluating" and "self-quoting". In emacs lisp the later term was applied to the lambda form, at least in contexts without lexical binding. Note that the lambda form does never evaluate to itself (eq), even if the result is equal.
(setq form '(lambda () 1)) ;; => (lambda () 1)
(equal form (eval form)) ;; => t
(equal (eval form) (eval (eval form))) ;; => t
(eq form (eval form)) ;; => nil
(eq (eval form) (eval (eval form))) ;; => nil
As Joshua put it in his answer: Fixed-points of the eval function (with respect to equal).
The code you presented doesn't define a type of self-evaluating form. A self evaluating form that eval would return when passed as an argument. Let's take a closer look. First, there's a function that takes some arguments and returns a new list:
(defun my-list (&rest args)
(cons 'my-list (mapcar (lambda (e) (list 'quote e)) args)))
The new list has the symbol my-list as the first elements. The remaining elements are two-element lists containing the symbol quote and the elements passed to the function:
(my-list (+ 1 1) 'hello)
;=> (my-list '2 'hello)
Now, this does give you a fixed point for eval with regard to equal, since
(eval (my-list (+ 1 1) 'hello))
;=> (my-list '2 'hello)
and
(eval (eval (my-list (+ 1 1) 'hello)))
;=> (my-list '2 'hello)
It's also the case that self-evaluating forms are fixed points with respect to equals, but in Common Lisp, a self-evaluating form is one that is a fixed point for eval with respect to eq (or perhaps eql).
The point of the language specifying self-evaluating forms is really to define what the evaluator has to do with forms. Conceptually eval would be defined something like this:
(defun self-evaluating-p (form)
(or (numberp form)
(stringp form)
(and (listp form)
(eql 2 (length form))
(eq 'quote (first form)))
; ...
))
(defun eval (form)
(cond
((self-evaluating-p form) form)
((symbolp form) (symbol-value-in-environment form))
;...
))
The point is not that a self-evaluating form is one that evaluates to an equivalent (for some equivalence relation) value, but rather one for which eval doesn't have to do any work.
Compiler Macros
While there's generally not a whole lot of use for forms that evaluate to themselves (modulo some equivalence) relation, there is one very important place where something very similar is used Common Lisp: compiler macros (emphasis added):
3.2.2.1 Compiler Macros
The function returned by compiler-macro-function is a function of two
arguments, called the expansion function. To expand a compiler macro,
the expansion function is invoked by calling the macroexpand hook with
the expansion function as its first argument, the entire compiler
macro form as its second argument, and the current compilation
environment (or with the current lexical environment, if the form is
being processed by something other than compile-file) as its third
argument. The macroexpand hook, in turn, calls the expansion function
with the form as its first argument and the environment as its second
argument. The return value from the expansion function, which is
passed through by the macroexpand hook, might either be the same form,
or else a form that can, at the discretion of the code doing the
expansion, be used in place of the original form.
Macro DEFINE-COMPILER-MACRO
Unlike an ordinary macro, a compiler macro can decline to provide an expansion merely by returning a form that is the same as the original
(which can be obtained by using &whole).
As an example:
(defun exponent (base power)
"Just like CL:EXPT, but with a longer name."
(expt base power))
(define-compiler-macro exponent (&whole form base power)
"A compiler macro that replaces `(exponent base 2)` forms
with a simple multiplication. Other invocations are left the same."
(if (eql power 2)
(let ((b (gensym (string '#:base-))))
`(let ((,b ,base))
(* ,b ,b)))
form))
Note that this isn't quite the same as a self-evaluating form, because the compiler is still going through the process of checking whether a form is a cons whose car has an associated compiler macro, and then calling that compiler macro function with the form. But it's similar in that the form goes to something and the case where the same form comes back is important.
What you describe and self-evaluating forms (not types!) is unrelated.
? (list (foo (+ 1 2)))
may evaluate to
-> (foo 3)
But that's running the function foo and it is returning some list with the symbol foo and its first argument value. Nothing more. You've written a function. But not a custom self evaluating form.
A form is some data meant to be evaluated. It needs to be valid Lisp code.
About Evaluation of Forms:
Evaluation of forms is a topic when you have source like this:
(defun foo ()
(list #(1 2 3)))
What's with the above vector? Does (foo) return a list with the vector as its first element?
In Common Lisp such vector forms are self-evaluating. In some other Lisps it was different. In some older Lisp dialect one probably had to write the code below to make the compiler happy. It might even be different with an interpreter. (I've seen this loooong ago in some implementation of a variant of Standard Lisp).
(defun foo ()
(list '#(1 2 3))) ; a vector form quoted
Note the quote. Non-self evaluating forms had to be quoted. That's relatively easy to do. You have to look at the source code and make sure that such forms are quoted. But there is another problem which makes it more difficult. Such data objects could have been introduced by macros in the code. Thus one also had to make sure that all code generated by macros has all literal data quoted. Which makes it a real pain.
This was wrong in some other Lisp dialect (not in Common Lisp):
(defmacro foo (a)
(list 'list a #(1 2 3)))
or even (note the added quote)
(defmacro foo (a)
(list 'list a '#(1 2 3)))
Using
(foo 1)
would be the code (list 1 #(1 2 3)). But in these Lisps there would be a quote missing... so it was wrong there.
One had to write:
(defmacro foo (a)
(list 'list a ''#(1 2 3))) ; note the double quote
Thus
(foo 1)
would be the code (list 1 '#(1 2 3)). Which then works.
To get rid of such problems, Lisp dialects like Common Lisp required that all forms other than symbols and conses are self evaluating. See the CL standard: Self-Evaluating Objects. This is also independent of using an interpreter or compiler.
Note that Common Lisp also provides no mechanism to change that.
What could be done with a custom mechanim? One could let data forms evaluate to something different. Or one could implement different evaluation schemes. But there is nothing like that in Common Lisp. Basically we've got symbols as variables, conses as special forms / functions / macros and the rest is self-evaluating. For anything different you would need to write a custom evaluator/compiler.

SICP: Can or be defined in lisp as a syntactic transformation without gensym?

I am trying to solve the last part of question 4.4 of the Structure and Interpretation of computer programming; the task is to implement or as a syntactic transformation. Only elementary syntactic forms are defined; quote, if, begin, cond, define, apply and lambda.
(or a b ... c) is equal to the first true value or false if no value is true.
The way I want to approach it is to transform for example (or a b c) into
(if a a (if b b (if c c false)))
the problem with this is that a, b, and c would be evaluated twice, which could give incorrect results if any of them had side-effects. So I want something like a let
(let ((syma a))
(if syma syma (let ((symb b))
(if symb symb (let ((symc c))
(if (symc symc false)) )) )) )
and this in turn could be implemented via lambda as in Exercise 4.6. The problem now is determining symbols syma, symb and symc; if for example the expression b contains a reference to the variable syma, then the let will destroy the binding. Thus we must have that syma is a symbol not in b or c.
Now we hit a snag; the only way I can see out of this hole is to have symbols that cannot have been in any expression passed to eval. (This includes symbols that might have been passed in by other syntactic transformations).
However because I don't have direct access to the environment at the expression I'm not sure if there is any reasonable way of producing such symbols; I think Common Lisp has the function gensym for this purpose (which would mean sticking state in the metacircular interpreter, endangering any concurrent use).
Am I missing something? Is there a way to implement or without using gensym? I know that Scheme has it's own hygenic macro system, but I haven't grokked how it works and I'm not sure whether it's got a gensym underneath.
I think what you might want to do here is to transform to a syntactic expansion where the evaluation of the various forms aren't nested. You could do this, e.g., by wrapping each form as a lambda function and then the approach that you're using is fine. E.g., you can do turn something like
(or a b c)
into
(let ((l1 (lambda () a))
(l2 (lambda () b))
(l3 (lambda () c)))
(let ((v1 (l1)))
(if v1 v1
(let ((v2 (l2)))
(if v2 v2
(let ((v3 (l3)))
(if v3 v3
false)))))))
(Actually, the evaluation of the lambda function calls are still nested in the ifs and lets, but the definition of the lambda functions are in a location such that calling them in the nested ifs and lets doesn't cause any difficulty with captured bindings.) This doesn't address the issue of how you get the variables l1–l3 and v1–v3, but that doesn't matter so much, none of them are in scope for the bodies of the lambda functions, so you don't need to worry about whether they appear in the body or not. In fact, you can use the same variable for all the results:
(let ((l1 (lambda () a))
(l2 (lambda () b))
(l3 (lambda () c)))
(let ((v (l1)))
(if v v
(let ((v (l2)))
(if v v
(let ((v (l3)))
(if v v
false)))))))
At this point, you're really just doing loop unrolling of a more general form like:
(define (functional-or . functions)
(if (null? functions)
false
(let ((v ((first functions))))
(if v v
(functional-or (rest functions))))))
and the expansion of (or a b c) is simply
(functional-or (lambda () a) (lambda () b) (lambda () c))
This approach is also used in an answer to Why (apply and '(1 2 3)) doesn't work while (and 1 2 3) works in R5RS?. And none of this required any GENSYMing!
In SICP you are given two ways of implementing or. One that handles them as special forms which is trivial and one as derived expressions. I'm unsure if they actually thought you would see this as a problem, but you can do it by implementing gensym or altering variable? and how you make derived variables like this:
;; a unique tag to identify special variables
(define id (vector 'id))
;; a way to make such variable
(define (make-var x)
(list id x))
;; redefine variable? to handle macro-variables
(define (variable? exp)
(or (symbol? exp)
(tagged-list? exp id)))
;; makes combinations so that you don't evaluate
;; every part twice in case of side effects (set!)
(define (or->combination terms)
(if (null? terms)
'false
(let ((tmp (make-var 'tmp)))
(list (make-lambda (list tmp)
(list (make-if tmp
tmp
(or->combination (cdr terms)))))
(car terms)))))
;; My original version
;; This might not be good since it uses backquotes not introduced
;; until chapter 5 and uses features from exercise 4.6
;; Though, might be easier to read for some so I'll leave it.
(define (or->combination terms)
(if (null? terms)
'false
(let ((tmp (make-var 'tmp)))
`(let ((,tmp ,(car terms)))
(if ,tmp
,tmp
,(or->combination (cdr terms)))))))
How it works is that make-var creates a new list every time it is called, even with the same argument. Since it has id as it's first element variable? will identify it as a variable. Since it's a list it will only match in variable lookup with eq? if it is the same list, so several nested or->combination tmp-vars will all be seen as different by lookup-variable-value since (eq? (list) (list)) => #f and special variables being lists they will never shadow any symbol in code.
This is influenced by eiod, by Al Petrofsky, which implements syntax-rules in a similar manner. Unless you look at others implementations as spoilers you should give it a read.

How can I destructure an &rest argument of varying length in my elisp macro?

I have a program that takes as inputs a chunk of data and a list of rules, applying both a set of standard rules and the rules given as input to the chunk of data. The size of both inputs may vary.
I want to be able to write a list of rules like this:
(rule-generating-macro
(rule-1-name rule-1-target
(rule-action-macro (progn actions more-actions)))
(rule-2-name rule-2-target
(rule-action-macro (or (action-2) (default-action))))
;; more rules
)
Right now, rules are more verbose -- they look more like
(defvar rule-list
`((rule-1-name rule-1-target
,#(rule-action-macro (progn actions more-actions)))
(rule-2-name rule-2-target
,#(rule-action-macro (or (action-2) (default-action))))
;; more rules
)
The latter form looks uglier to me, but I can't figure out how to write a macro that can handle a variable-length &rest argument, iterate over it, and return the transformed structure. Using a defun instead of a defmacro isn't really on the table because (as hopefully the example shows) I'm trying to control evaluation of the list of rules instead of evaluating the list when my program first sees it, and once you need to control evaluation, you're in defmacro territory. In this case, the thorny point is the rule-action-macro part - getting the interpreter to read that and use its expanded value has been problematic.
How can I create a macro that handles a variable-length argument so that I can write rule lists in a concise way?
defmacro will happily accept a &rest argument
(see Defining Macros for Emacs Lisp and Macro Lambda Lists for Common Lisp).
Then you can do pretty much anything you want with it in the macro body - e.g., iterate over it. Remember, macro is much more than just backquote!
E.g.:
(defmacro multidefvar (&rest vars)
(let ((forms (mapcar (lambda (var) `(defvar ,var)) vars)))
`(progn ,#forms)))
(macroexpand '(multidefvar a b c d))
==> (PROGN (DEFVAR A) (DEFVAR B) (DEFVAR C) (DEFVAR D))