Clojure Macro Expansion - macros

I am working on a macro, I am trying to figure out how to avoid expansion of certain forms, take the following and macro for example,
(defmacro and
([] true)
([x] x)
([x & next]
`(let [and# ~x]
(if and# (and ~#next) and#))))
When expanded,
(mexpand-all '(and 1 2 3))
becomes,
(let* [and__973__auto__ 1]
(if and__973__auto__
(let* [and__973__auto__ 2]
(if and__973__auto__ 3 and__973__auto__))
and__973__auto__))
In this case what I need to do is stop let from expanding into let*.

Huh? It's not clear what you mean by "stop" let from expanding. let is a macro defined in clojure.core, which the compiler knows nothing about: it only understands let*. If your macro expanded into a let which (somehow) refused to expand further, it would fail to compile.
If you want to inspect just the output of your macro in isolation, without worrying about recursively expanding it, you should use macroexpand or macroexpand-1 instead of this mexpand-all thing. I don't know where mexpand-all comes from, but when I need something like that I use clojure.walk/macroexpand-all.

Recursive macro-expansion works by repeatedly expanding the form until there is no macro to expand. That means that if you want to recursively expand a macro, but ignore certain forms, you'll have to either code your own custom expander, or find someone else's.
Here's a quick example:
(defn my-expander [form]
(cond (not (list? form)) (mexpand-1 form)
(= (first form) 'let) form
:else (map my-expander (mexpand-1 form))))
Please forgive me if I made any mistakes. I'm much stronger with Scheme and CL than Clojure.
--Edit--
Note that the above function will not expand the subforms of a let statement, either.

Use macroexpand-1 to perform a single level of macro expansion.
After loading your and macro, this expression:
user=> (macroexpand-1 '(and 1 2 3))
Yields:
(clojure.core/let [and__1__auto__ 1] (if and__1__auto__ (clojure.core/and 2 3) and__1__auto__))

Related

How does macroexpansion actually work in Lisp?

I would like a more detailed explanation of how macro expansion works, at least in Emacs Lisp but an overview of other Lisps would be appreciated. The way I usually see it explained is that the arguments of the macro are passed unevaluated to the body, which is then executed and returns a new LISP form. However, if I do
(defun check-one (arg)
(eq arg 1))
(defmacro check-foo (checker foo)
(if (checker 1)
`(,foo "yes")
`(,foo "no")))
I would expect
(check-foo check-one print)
to first expand to
(if (check-one 1)
`(print "yes")
`(print "no))
and then finally to
(print "yes")
but instead I get a "checker" function is void error. On the other hand, if I had defined
(defmacro check-foo (checker foo)
(if (funcall checker 1)
`(,foo "yes")
`(,foo "no")))
then I would have the expected behavior. So the expressions do get replaced in the body unevaluated, but for some reason functions do not work? What is the step-by-step procedure the interpreter follows when macroexpanding? Is there a good text-book that explains this rigorously?
Macros are functions ...
A good way to think about macros is that they are simply functions, like any other function.
... which operate on source code
But they are functions whose arguments are source code, and whose value is also source code.
Looking at macro functions
Macros being functions is not quite explicit in elisp: some of the lower-level functionality is, I think, not exposed. But in Common Lisp this is quite literally how macros are implemented: a macro has an associated function, and this function gets called to expand the macro, with its value being the new source code. For instance, if you are so minded you could write macros in Common Lisp like this.
(defun expand-fn (form environment)
;; not talking about environment
(declare (ignore environment))
(let ((name (second form))
(arglist (third form))
(body (cdddr form)))
`(function (lambda ,arglist
(block ,name
,#body)))))
(setf (macro-function 'fn) #'expand-fn)
And now fn is a macro which will construct a function which 'knows its name', so you could write
(fn foo (x) ... (return-from foo x) ...)
which turns into
(function (lambda (x) (block foo ... (return-from foo x))))
In Common Lisp, defmacro is then itself a macro which arranges for a suitable macro function to be installed and also deals with making the macro available at compile time &c.
In elisp, it looks as if this lower layer is not specified by the language, but I think it's safe to assume that things work the same way.
So then the job of a macro is to take a bunch of source code and compute from it another bunch of source code which is the expansion of the macro. And of course the really neat trick is that, because source code (both arguments and values) is represented as s-expressions, Lisp is a superb language for manipulating s-expressions, you can write macros in Lisp itself.
Macroexpansion
There are a fair number of fiddly corner cases here such as local macros and so on. But here is, pretty much, how this works.
Start with some form <f>:
If <f> is (<a> ...) where <a> is a symbol, check for a macro function for <a>. If it has one, call it on the whole form, and call the value it returns <f'>: now simply recurse on <f'>.
If <f> is (<a> ...) where <a> is a symbol which names a special operator (something like if) then recurse on the subforms of the special operator which its rules say must be macroexpanded. As an example, in a form like (if <x> <y> <z>) all of <x>, <y>, & <z> need to be macroexpanded, while in (setq <a> <b>), only <b> would be subject to macroexpansion, and so on: these rules are hard-wired, which is why special operators are special.
If <f> is (<a> ...) where <a> is a symbol which is neither of the above cases, then it's a function call, and the forms in the body of the form are macroexpanded, and that's it.
If <f> is ((lambda (...) ...) ...) then the forms in the body of the lambda (but not its arguments!) are macroexpanded and then the case is the same as the last one.
Finally <f> might not be a compound form: nothing to do here.
I think that's all the cases. This is not a complete description of the process because there are complications like local macros and so on. But it's enough I think.
Order of macroexpansion
Note that macroexpansion happens 'outside in': a form like (a ...) is expanded until you get something which isn't a macro form, and only then is the body, perhaps, expanded. That's because, until the macro is completely expanded, you have no idea which, if any, of the subforms are even eligible for macroexpansion.
Your code
My guess is that what you want to happen is that (check-foo bog foo) should turn into (if (bog 1) (foo yes) (foo no)). So the way to get this is that this form is what the macro function needs to return. We could write this using the CL low-level facilities:
(defun check-foo-expander (form environment)
;; form is like (check-foo pred-name function-name)
(declare (ignore environment)) ;still not talking about environment
`(if (,(second form) 1)
(,(third form) "yes")
(,(third form) "no")))
And we can check:
> (check-foo-expander '(check-foo bog foo) nil)
(if (bog 1) (foo "yes") (foo "no"))
And then install it as a macro:
> (setf (macro-function 'check-foo) #'check-foo-expander)
And now
> (check-foo evenp print)
"no"
"no"
> (check-foo oddp print)
"yes"
"yes"
But it's easier to write it using defmacro:
(defmacro check-foo (predicate function)
`(if (,predicate 1)
(,function "yes")
(,function "no")))
This is the same thing (more-or-less), but easier to read.

Does any Lisp allow mutually recursive macros?

In Common Lisp, a macro definition must have been seen before the first use. This allows a macro to refer to itself, but does not allow two macros to refer to each other. The restriction is slightly awkward, but understandable; it makes the macro system quite a bit easier to implement, and to understand how the implementation works.
Is there any Lisp family language in which two macros can refer to each other?
What is a macro?
A macro is just a function which is called on code rather than data.
E.g., when you write
(defmacro report (x)
(let ((var (gensym "REPORT-")))
`(let ((,var ,x))
(format t "~&~S=<~S>~%" ',x ,var)
,var)))
you are actually defining a function which looks something like
(defun macro-report (system::<macro-form> system::<env-arg>)
(declare (cons system::<macro-form>))
(declare (ignore system::<env-arg>))
(if (not (system::list-length-in-bounds-p system::<macro-form> 2 2 nil))
(system::macro-call-error system::<macro-form>)
(let* ((x (cadr system::<macro-form>)))
(block report
(let ((var (gensym "REPORT-")))
`(let ((,var ,x)) (format t "~&~s=<~s>~%" ',x ,var) ,var))))))
I.e., when you write, say,
(report (! 12))
lisp actually passes the form (! 12) as the 1st argument to macro-report which transforms it into:
(LET ((#:REPORT-2836 (! 12)))
(FORMAT T "~&~S=<~S>~%" '(! 12) #:REPORT-2836)
#:REPORT-2836)
and only then evaluates it to print (! 12)=<479001600> and return 479001600.
Recursion in macros
There is a difference whether a macro calls itself in implementation or in expansion.
E.g., a possible implementation of the macro and is:
(defmacro my-and (&rest args)
(cond ((null args) T)
((null (cdr args)) (car args))
(t
`(if ,(car args)
(my-and ,#(cdr args))
nil))))
Note that it may expand into itself:
(macroexpand '(my-and x y z))
==> (IF X (MY-AND Y Z) NIL) ; T
As you can see, the macroexpansion contains the macro being defined.
This is not a problem, e.g., (my-and 1 2 3) correctly evaluates to 3.
However, if we try to implement a macro using itself, e.g.,
(defmacro bad-macro (code)
(1+ (bad-macro code)))
you will get an error (a stack overflow or undefined function or ...) when you try to use it, depending on the implementation.
Here's why mutually recursive macros can't work in any useful way.
Consider what a system which wants to evaluate (or compile) Lisp code for a slightly simpler Lisp than CL (so I'm avoiding some of the subtleties that happen in CL), such as the definition of a function, needs to do. It has a very small number of things it knows how to do:
it knows how to call functions;
it knows how to evaluate a few sorts of literal objects;
it has some special rules for a few sorts of forms – what CL calls 'special forms', which (again in CL-speak) are forms whose car is a special operator;
finally it knows how to look to see whether forms correspond to functions which it can call to transform the code it is trying to evaluate or compile – some of these functions are predefined but additional ones can be defined.
So the way the evaluator works is by walking over the thing it needs to evaluate looking for these source-code-transforming things, aka macros (the last case), calling their functions and then recursing on the results until it ends up with code which has none left. What's left should consist only of instances of the first three cases, which it then knows how to deal with.
So now think about what the evaluator has to do if it is evaluating the definition of the function corresponding to a macro, called a. In Cl-speak it is evaluating or compiling a's macro function (which you can get at via (macro-function 'a) in CL). Let's assume that at some point there is a form (b ...) in this code, and that b is known also to correspond to a macro.
So at some point it comes to (b ...), and it knows that in order to do this it needs to call b's macro function. It binds suitable arguments and now it needs to evaluate the definition of the body of that function ...
... and when it does this it comes across an expression like (a ...). What should it do? It needs to call a's macro function, but it can't, because it doesn't yet know what it is, because it's in the middle of working that out: it could start trying to work it out again, but this is just a loop: it's not going to get anywhere where it hasn't already been.
Well, there's a horrible trick you could do to avoid this. The infinite regress above happens because the evaluator is trying to expand all of the macros ahead of time, and so there's no base to the recursion. But let's assume that the definition of a's macro function has code which looks like this:
(if <something>
(b ...)
<something not involving b>)
Rather than doing the expand-all-the-macros-first trick, what you could do is to expand only the macros you need, just before you need their results. And if <something> turned out always to be false, then you never need to expand (b ...), so you never get into this vicious loop: the recursion bottoms out.
But this means you must always expand macros on demand: you can never do it ahead of time, and because macros expand to source code you can never compile. In other words a strategy like this is not compatible with compilation. It also means that if <something> ever turns out to be true then you'll end up in the infinite regress again.
Note that this is completely different to macros which expand to code which involves the same macro, or another macro which expands into code which uses it. Here's a definition of a macro called et which does that (it doesn't need to do this of course, this is just to see it happen):
(defmacro et (&rest forms)
(if (null forms)
't
`(et1 ,(first forms) ,(rest forms))))
(defmacro et1 (form more)
(let ((rn (make-symbol "R")))
`(let ((,rn ,form))
(if ,rn
,rn
(et ,#more)))))
Now (et a b c) expands to (et1 a (b c)) which expands to (let ((#:r a)) (if #:r #:r (et b c))) (where all the uninterned things are the same thing) and so on until you get
(let ((#:r a))
(if #:r
#:r
(let ((#:r b))
(if #:r
#:r
(let ((#:r c))
(if #:r
#:r
t))))))
Where now not all the uninterned symbols are the same
And with a plausible macro for let (let is in fact a special operator in CL) this can get turned even further into
((lambda (#:r)
(if #:r
#:r
((lambda (#:r)
(if #:r
#:r
((lambda (#:r)
(if #:r
#:r
t))
c)))
b)))
a)
And this is an example of 'things the system knows how to deal with': all that's left here is variables, lambda, a primitive conditional and function calls.
One of the nice things about CL is that, although there is a lot of useful sugar, you can still poke around in the guts of things if you like. And in particular, you still see that macros are just functions that transform source code. The following does exactly what the defmacro versions do (not quite: defmacro does the necessary cleverness to make sure the macros are available early enough: I'd need to use eval-when to do that with the below):
(setf (macro-function 'et)
(lambda (expression environment)
(declare (ignore environment))
(let ((forms (rest expression)))
(if (null forms)
't
`(et1 ,(first forms) ,(rest forms))))))
(setf (macro-function 'et1)
(lambda (expression environment)
(declare (ignore environment))
(destructuring-bind (_ form more) expression
(declare (ignore _))
(let ((rn (make-symbol "R")))
`(let ((,rn ,form))
(if ,rn
,rn
(et ,#more)))))))
There have been historic Lisp systems that allow this, at least in interpreted code.
We can allow a macro to use itself for its own definition, or two or more macros to mutually use each other, if we follow an extremely late expansion strategy.
That is to say, our macro system expands a macro call just before it is evaluated (and does that each time that same expression is evaluated).
(Such a macro expansion strategy is good for interactive development with macros. If you fix a buggy macro, then all code depending on it automatically benefits from the change, without having to be re-processed in any way.)
Under such a macro system, suppose we have a conditional like this:
(if (condition)
(macro1 ...)
(macro2 ...))
When (condition) is evaluated, then if it yields true, (macro1 ...) is evaluated, otherwise (macro2 ...). But evaluation also means expansion. Thus only one of these two macros is expanded.
This is the key to why mutual references among macros can work: we are able rely on the conditional logic to give us not only conditional evaluation, but conditional expansion also, which then allows the recursion to have ways of terminating.
For example, suppose macro A's body of code is defined with the help of macro B, and vice versa. And when a particular invocation of A is executed, it happens to hit the particular case that requires B, and so that B call is expanded by invocation of macro B. B also hits the code case that depends on A, and so it recurses into A to obtain the needed expansion. But, this time, A is called in a way that avoids requiring, again, an expansion of B; it avoids evaluating any sub-expression containing the B macro. Thus, it calculates the expansion, and returns it to B, which then calculates its expansion returns to the outermost A. A finally expands and the recursion terminates; all is well.
What blocks macros from using each other is the unconditional expansion strategy: the strategy of fully expanding entire top-level forms after they are read, so that the definitions of functions and macros contain only expanded code. In that situation there is no possibility of conditional expansion that would allow for the recursion to terminate.
Note, by the way, that a macro system which expands late doesn't recursively expand macros in a macro expansion. Suppose (mac1 x y) expands into (if x (mac2 y) (mac3 y)). Well, that's all the expansion that is done for now: the if that pops out is not a macro, so expansion stops, and evaluation proceeds. If x yields true, then mac2 is expanded, and mac3 is not.

Expand a macro form completely

I'd like to learn the internals of Lisp, so I want to see how everything is implemented.
For example,
(macroexpand '(loop for i upto 10 collect i))
gives me (in SBCL)
(BLOCK NIL
(LET ((I 0))
(DECLARE (TYPE (AND NUMBER REAL) I))
(SB-LOOP::WITH-LOOP-LIST-COLLECTION-HEAD (#:LOOP-LIST-HEAD-1026
#:LOOP-LIST-TAIL-1027)
(SB-LOOP::LOOP-BODY NIL
(NIL NIL (WHEN (> I '10) (GO SB-LOOP::END-LOOP)) NIL)
((SB-LOOP::LOOP-COLLECT-RPLACD
(#:LOOP-LIST-HEAD-1026 #:LOOP-LIST-TAIL-1027)
(LIST I)))
(NIL (SB-LOOP::LOOP-REALLY-DESETQ I (1+ I))
(WHEN (> I '10) (GO SB-LOOP::END-LOOP)) NIL)
((RETURN-FROM NIL
(SB-LOOP::LOOP-COLLECT-ANSWER
#:LOOP-LIST-HEAD-1026)))))))
But LOOP-BODY, WITH-LOOP-LIST-COLLECTION-HEAD, etc. are still macros. How can I expand a macro form completely?
To see the full expansion one needs to walk the Lisp form on all levels and expand them. For this it is necessary that this so-called code walker understands Lisp syntax (and not just s-expression syntax). For example in (lambda (a b) (setf a b)), the list (a b) is a parameter list and should not be macro expanded.
Various Common Lisp implementations provide such a tool. The answer of 6502 mentions MACROEXPAND-ALL which is provided by SBCL.
If you use a development environment, it is usually provided as a command:
SLIME: M-x slime-macroexpand-all with C-c M-m
LispWorks: menu Expression > Walk or M-x Walk Form, shorter M-Sh-m.
The other answers are excellent for you question but you say you want to see how everything is implemented.
Many macros (as you know already) are implemented using macros and whilst macroexpand-all is very useful but you can lose the context of what macro was responsible for what change.
One nice middle ground (if you are using slime) is to use slime-expand-1 (C-c Enter) which shows the expansion is another buffer. You can then useslime-expand-1 inside this new buffer to expand macros in-place.
This allows you to walk the tree expanding as you read and also to use undo to close the expansions again.
For me this has been a god-send in understanding other people's macros. Hope this helps you too, have fun!
You can try to use MACROEXPAND-ALL but what you may get is not necessarily useful.
In something like LOOP the real meat is the macro itself, not the generated code.
(Note: If you're not interested in portability, SBCL provides macroexpand-all, which will do what you're after. If you're after a portable solution, read on...)
The quick-and-dirty solution would be to macroexpand the form itself, then recursively macroexpand all but the first element of the resulting list. This is an imperfect solution; it will fail completely the moment it attempts to process a let's bindings (the first argument to let, the list of bindings, is not meant to be macroexpanded, but this code will do it anyway).
;;; Quick-and-dirty macroexpand-all
(defun macroexpand* (form)
(let ((form (macroexpand form)))
(cons (car form) (mapcar #'macroexpand (cdr form)))))
A more complete solution would consider special forms specially, not macroexpanding their unevaluated arguments. I could update with such a solution, if desired.

Evaluate the arguments of a macro form

What is the best practice for selectively passing evaluated arguments to a macro form?
To elaborate: The usefulness of macros lies in its ability to receives unevaluated parameter, unlike the default evaluation rule for function forms. However, there is a legitimate use cases for evaluating macro arguments.
Consider a contrived example:
(defparameter *func-body* '((print i) (+ i 1)))
Suppose it would be nice that *func-body* could serve as the body of a macro our-defun that is defined as:
(defmacro our-defun (fun args &body body)
`(defun ,fun ,args ,#body))
So after (our-defun foo (i) (+ 1 i)), we could say (foo 1) to get 2. However, if we use (our-defun foo (i) *func-body*), the result of (foo 1) will be ((PRINT I) (+ I 1)) (i.e., the value of *func-body*). It would be nice if we can force the evaluation of *func-body* as an argument to the macro our-defun.
Currently, I can think of a technique of using compile and funcall to do this, as in
(funcall (compile nil `(lambda () (our-defun foo (i) ,#*func-body*))))
after which (our-defun 1) will print out 1 and return 2, as intended. I can think of case of making this work with eval, but I would rather stay away from eval due to its peculiarity in scoping.
This leads to my question at the begining, is there a more straightforward or native way to do this?
P.S.,
A not-so-contrived example is in the function (UPDATE-HOOK), which uses two library macros (ADD-HOOK) and (REMOVE-HOOK) and needs to evaluate its parameters. The (funcall (compile nil `(lambda () ...))) technique above is used here.
(defun update-hook (hook hook-name &optional code)
(funcall (compile nil `(lambda () (remove-hook ,hook ',hook-name))))
(unless (null code)
(compile hook-name `(lambda () ,#code))
(funcall (compile nil `(lambda () (add-hook ,hook ',hook-name))))))
That's slightly confused. A macro does not receive unevaluated parameters.
A macro gets source code and creates source code from that. Remember also that source code in Lisp is actually provided as data. The macro creates code, which evaluates some forms and some not.
Macros need to work in a compiling system. Before runtime. During compile time. All the macro sees is source code and then it creates source code from that. Think of macros as code transformations, not about evaluating arguments or not.
It would be nice if we can force the evaluation of *func-body* as an argument to the macro our-defun
That is not very clean. In a compiled system, you would need to make sure that *func-body* actually has a useful binding and that it can be resolved at COMPILE TIME.
If you have a macro like DEFUN, it makes sense to have the source code static. If you want to insert some source code into a form, then it could make sense to do that at read time:
(defun foo (i) #.`(,#*foo*))
But that's code I usually would want to avoid.
two library macros (ADD-HOOK) and (REMOVE-HOOK) and needs to evaluate its parameters.
Why should ADD-HOOK and REMOVE-HOOK be macros? If you don't have a real reason, they simply should be functions. Already since they make reuse difficult.
If you want to make ADD-HOOK and REMOVE-HOOK macros for some reason, then UPDATE-HOOK usually should be a macro, too.
The list you are giving to your macro has the form
(Quote (...))
So the list you actually want is the CADR of the list you get.

lisp macro expand with partial eval

I have following code which confuse me now, I hope some can tell me the difference and how to fix this.
(defmacro tm(a)
`(concat ,(symbol-name a)))
(defun tf(a)
(list (quote concat) (symbol-name a)))
I just think they should be the same effect, but actually they seem not.
I try to following call:
CL-USER> (tf 'foo)
(CONCAT "FOO")
CL-USER> (tm 'foo)
value 'FOO is not of the expected type SYMBOL.
[Condition of type TYPE-ERROR]
So, what's the problem?
What i want is:
(tm 'foo) ==> (CONCAT "FOO")
The first problem is that 'foo is expanded by the reader to (quote foo), which is not a symbol, but a list. The macro tries to expand (tm (quote foo)). The list (quote foo) is passed as the parameter a to the macro expansion function, which tries to get its symbol-name. A list is not a valid argument for symbol-name. Therefore, your macro expansion fails.
The second problem is that while (tm foo) (note: no quote) does expand to (concat "FOO"), this form will then be executed by the REPL, so that this is also not the same as your tf function. This is not surprising, of course, because macros do different things than functions.
First, note that
`(concat ,(symbol-name a))
and
(list (quote concat) (symbol-name a))
do the exact same thing. They are equivalent pieces of code (backquote syntax isn't restricted to macro bodies!): Both construct a list whose first element is the symbol CONCAT and whose second element is the symbol name of whatever the variable A refers to.
Clearly, this only makes sense if A refers to a symbol, which, as Svante has pointed out, isn't the case in the macro call example.
You could, of course, extract the symbol from the list (QUOTE FOO), but that prevents you from calling the macro like this:
(let ((x 'foo))
(tm x))
which raises the question of why you would event want to force the user of the macro to explicitly quote the symbol where it needs to be a literal constant anyway.
Second, the way macros work is this: They take pieces of code (such as (QUOTE FOO)) as arguments and produce a new piece of code that, upon macroexpansion, (more or less) replaces the macro call in the source code. It is often useful to reuse macro arguments within the generated code by putting them where they are going to be evaluated later, such as in
(defmacro tm2 (a)
`(print (symbol-name ,a)))
Think about what this piece of code does and whether or not my let example above works now. That should get you on the right track.
Finally, a piece of advice: Avoid macros when a function will do. It will make life much easier for both the implementer and the user.