In Common Lisp, how would you extend the existing comparator operations to work for new types?

I'm attempting to implement a lisp version of a domain specific language that uses the binary operators and comparators on discrete probability distributions. Is there a safe(ish) and concise way to extend the functionality of these atoms < <= = >= > + - / * to work with new types, without completely bricking the language?
I realize just making new operator names would be the easiest solution, but that answer is not compelling nor fun, and certainly doesn't live up to lisp's reputation as the programmable programming language.
For example, say we have the hash-table:
((4 . 1/4) (3 . 1/4) (2 . 1/4) (1 . 1/4))
which we create with the macro (d 4), and we want to be able to compare it to a number like so
(< (d 4) 3)
which we would want to return a hash-table representing a 50% chance it's true and 50% chance it's false, say in this form:
((0 . 1/2) (1 . 1/2))
but currently it produces the error:
*** - <: #S(HASH-TABLE :TEST FASTHASH-EQL (4 . 1/4) (3 . 1/4) (2 . 1/4) (1 . 1/4)) is not a real number

There are three answers to this:
no, you can't do it because + &c are not generic functions in CL and you may not redefine symbols exported from the CL package ...
... but yes, of course you can do it if you're willing to accept a small amount of inconvenience, because Lisp is a programmable programming language ...
... but no, in fact you can't completely do it if you want to preserve the semantics of your extended language, and there isn't anything you can do about that.
So let's look at these in order.
(1): The arithmetic operators are not generic functions in CL ...
This means you can't extend them: they work on numbers. And you can't redefine them because the language forbids that.
(2): ... but of course you can get around that ...
So, the first trick is we want to define a package in which the arithmetic operators are not the CL versions of them. We can do this manually, but we'll use Tim Bradshaw's conduits system to make life easier:
(in-package :org.tfeb.clc-user)
(defpackage :cl/generic-arith
  ;; The package people will use
  (:use)
  (:extends/excluding :cl
   #:< #:<= #:= #:>= #:> #:+ #:- #:/ #:*)
  (:export
   #:< #:<= #:= #:>= #:> #:+ #:- #:/ #:*))

(defpackage :cl/generic-arith/user
  ;; User package
  (:use :cl/generic-arith))

(defpackage :cl/generic-arith/impl
  ;; Implementation package
  (:use :cl/generic-arith))
So now:
> (in-package :cl/generic-arith/user)
#<The CL/GENERIC-ARITH/USER package, 0/16 internal, 0/16 external>
> (describe '*)
* is a symbol
  name     "*"
  value    #<unbound value>
  function #<unbound function>
  plist    nil
  package  #<The CL/GENERIC-ARITH package, 0/1024 internal, 978/1024 external>
So now we can get on with defining these things. There are some problems: because what we've done is create an environment where, for instance, * is not cl:*, anything which uses * as a symbol will not work. So, for instance, cl:* is bound to the value of the last form evaluated in the REPL, and typing * isn't going to get that symbol: you'll need to type cl:*.
Similarly the + method combination will not work: you'd have to explicitly say
(defgeneric foo (...)
  ...
  (:method-combination cl:+)
  ...)
So there is some pain. But not, probably, that much pain.
And now we can start implementing things. The obvious approach is to define generic functions which are 2-arg versions of the functions we need, and then define the functions on top of them. That means we can use things like compiler macros for the actual operators without fear of breaking the underlying generic functions.
So start with:
(in-package :cl/generic-arith/impl)
(defgeneric +/2 (a b)
  (:method ((a number) (b number))
    (cl:+ a b)))
...
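From there, the variadic + can be defined on top of +/2. A minimal sketch (note how the zero-argument case has to pick an identity, which is exactly the problem discussed in (3) below):

(defun + (&rest args)
  ;; Fold the two-argument generic over the arguments.  Returning 0
  ;; for (+) hard-wires the additive identity of the field of numbers.
  (if (null args)
      0
      (reduce #'+/2 args)))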
And now you think for a bit and realise that ...
(3) ... but there are problems you can't get around, in any language
The problem is at least this: + is one of the two operators on a field, and the way it is defined in CL in terms of some two-argument function like +/2 above is something like this (this may be wrong implementationally but it's how the maths would go):
(+ a) is a;
(+ a b) is (+/2 a b);
(+ a b ...) is (+/2 a (+ b ...)), and this is fine by associativity (but beware of floats).
And that looks fine but we've missed a case:
(+) is the additive identity of the field (and similarly (*) is the multiplicative identity).
Oh, wait: what field? For cl:+ this is fine: it's the identities of the field of numbers. But now? Who knows? And there's nothing that can be done about that: if you want + to work the way it does in CL, except on other fields, you can't have that.
In general if + and * are to represent the two operations of a field, then if zero-argument versions of those operators are allowed they should return the two identities of the field (which they do in CL). But that means that all fields they are defined for must share those two identities (which, again, they do in CL: the fields of rationals & complex numbers share identities). This means in turn that (+ x 3) (from (+ x (*) (*) (*))) must be allowed (and so on). So you must either disallow the zero-argument case, or give up the idea that these operators represent the field operations, or both.

You can do this by using generic-cl, which does exactly that and is extensible.
(= "hello" y) ;; would throw an error in standard CL
You would :use :generic-cl instead of :cl in a package definition, and you would use :generic-cl-user at the REPL.
Add methods to the generic functions equalp and lessp.
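A sketch of what that might look like for the distribution type from the question (the dist struct and its alist representation are invented here for illustration; per the above, generic-cl's < should dispatch through lessp):

(defpackage :dist-user
  (:use :generic-cl))           ; note: :generic-cl, not :cl
(in-package :dist-user)

(defstruct dist alist)          ; e.g. ((4 . 1/4) (3 . 1/4) (2 . 1/4) (1 . 1/4))

(defmethod lessp ((a dist) (b real))
  ;; Sum the probability mass of the outcomes below B and return the
  ;; distribution of the truth value, as in the question.
  (let ((p (loop for (value . prob) in (dist-alist a)
                 when (cl:< value b) sum prob)))
    (make-dist :alist (list (cons 1 p) (cons 0 (cl:- 1 p))))))

With that, (< (make-dist :alist '((4 . 1/4) (3 . 1/4) (2 . 1/4) (1 . 1/4))) 3) should return a dist with alist ((1 . 1/2) (0 . 1/2)).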


Why aren't lisp macros eagerly expanded by default?

Say I have macros foo and bar. If I write (foo (bar)) my understanding is that in most (all?) lisps foo is going to be given '(bar), not whatever bar would have expanded to had it been expanded first. Some lisps have something like local-expand where the implementation of foo can explicitly request the expansion of its argument before continuing, but why isn't that the default? It seems more natural to me. Is this an accident of history or is there a strong reason to do it the way most lisps do it?
I've been noticing in Rust that I want the macros to work this way. I'd like to be able to wrap a bunch of declarations inside a macro call so that the macro can then crawl the declarations and generate reflection information. But if I use a macro to generate the definitions that I want crawled, my macro wrapping the declarations sees the macro invocations that would generate the declarations rather than the actual declarations.
If I write (foo (bar)) my understanding is that in most (all?) lisps foo is going to be given '(bar), not whatever bar would have expanded to had it been expanded first.
That would restrict Lisp such that (bar) would need to be something that can be expanded -> probably something which is written in the Lisp language.
Lisp developers would like to see macros where the inner stuff can be a completely new language with different syntax rules. For example something where FOO does not expand its subforms, but transpiles/compiles a fully/partially different language to Lisp. Something which does not have the usual prefix expression syntax:
Examples
(postfix (a b +) sin)
-> (sin (+ a b))
Here the + in the macro form is not the infix +.
or
(query-all (person name)
  where (person = "foo") in database DB)
Lisp macros don't work on language parse trees, but arbitrary, possibly nested, s-expressions. Those don't need to be valid Lisp code outside of that macro -> don't need to follow the usual syntax / semantics.
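For concreteness, the postfix example above could be implemented roughly like this (a sketch; the rewrite rules are invented to match the example):

(defmacro postfix (form &rest wrappers)
  ;; Turn (a b +) into (+ a b) recursively, then wrap the result in
  ;; each trailing operator: (postfix (a b +) sin) -> (sin (+ a b)).
  (labels ((conv (f)
             (if (consp f)
                 (cons (first (last f))
                       (mapcar #'conv (butlast f)))
                 f)))
    (reduce (lambda (acc w) (list w acc))
            wrappers
            :initial-value (conv form))))

(macroexpand-1 '(postfix (a b +) sin))  ; => (SIN (+ A B))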
Common Lisp has the function MACROEXPAND and MACROEXPAND-1, such that the outer macro can expand inner code if it wants to do it during its own macro expansion:
CL-USER 26 > (defmacro bar (a) `(* ,a ,a))
BAR
CL-USER 27 > (bar 10)
100
CL-USER 28 > (defmacro foo (a &environment e)
               (let ((f (macroexpand a e)))
                 (print (list a '-> f))
                 `(+ ,(second f) ,(third f))))
FOO
CL-USER 29 > (foo (bar 10))
((bar 10) -> (* 10 10))
20
In the above macro, if FOO saw only the expanded form, it could not print both the source and the expansion.
This works also with scoped macros. Here the BAR macro gets locally redefined and the MACROEXPAND generates different code inside FOO for the same form:
CL-USER 30 > (macrolet ((bar (a)
                          `(expt ,a ,a)))
               (foo (bar 10)))
((bar 10) -> (EXPT 10 10))
20
If foo is a macro then (foo (bar)) must pass the raw syntax (bar) to the foo macro expander. This is absolutely essential.
This is because foo can give any meaning whatsoever to bar.
Consider the defmacro macro itself:
(defmacro foo (bar) body)
Here, the argument (bar) is a parameter list ("macro lambda list") and not a form (Common Lisp jargon for to-be-evaluated expression). It says that the macro shall have a single parameter called bar. Therefore it is nonsensically wrong to try to expand (bar) before handing it to defmacro's expander.
Only if we know that an expression is going to be evaluated is it legitimate to expand it as a macro. But we don't know that about an expression which is the argument to a macro.
Other counterexamples are easy to come up with. (defstruct point (x 0) (y 0)): (x 0) isn't a call to operator x, but a slot x whose default value is 0. (dolist (x list) ...): x is a variable to be stepped over list.
That said, there are implementation choices regarding the timing of macro expansion.
A Lisp implementation can macro-expand an entire top-level form before evaluating or compiling any of it. Or it can expand incrementally, so that for instance when (+ x y) is being processed, x is macro-expanded and evaluated or compiled into some intermediate form already before y is even looked at.
A pure syntax tree interpreter for Lisp which always keeps the code in the original form and always expands (and re-expands) the code as it is evaluating has certain interactivity advantages. Any macro that you rewrite goes instantly "live" in all the existing code that you have input into the REPL, like existing function definitions. It is obviously quite inefficient in terms of execution speed, but any code that you call uses the latest definition of your macros without any hassle of telling the system to reload that code to have it expanded again. That also eliminates the risk that you're testing something that is still based on the old, buggy version of some macro that you fixed. If you're ever writing a Lisp, the range of timing choices for expansion is good to keep in mind, so that you consciously reject the choices you don't go with.
In turn, that said, there are some constraints on the timing of macro expansion. Conceivably, a Lisp interpreter or compiler, when processing an entire file, could go through all the top level forms and expand all of them at once before processing any of them. The implementor will quickly learn that this is bad, because some of the later forms depend on the side effects of the earlier forms. Such as, oh, macros being defined! If the first form defines a macro, which the second one uses, then we cannot expand the second form without evaluating the effect of the first.
It makes sense, in a Lisp, to split up physical top-level forms into logical ones. Suppose that someone writes (or uses a macro to generate) code like (progn (defmacro foo ...) (foo)). This entire progn cannot be macro expanded up-front before evaluation; it won't work! There has to be a rule such as "whenever a top-level form is based on the progn operator, then the children of the progn operator are considered top-level forms by all the processing which treats top-level forms specially, and this rule is recursively applied." The top-level entry point into the macro-expanding code walker then has to contain special case hacks to do this recognition of logical top-level forms, breaking them up and recursing into a lower level expander which doesn't do those checks any more.
I've been noticing in Rust that I want the macros to work this way. I'd like to be able to wrap a bunch of declarations inside a macro call so that the macro can then crawl the declarations and generate reflection information.
It does sound like local-expand is the right tool for that job.
However, an alternative approach would be something like this:
Suppose that wrapper is our outer macro, and that the intended syntax is:
(wrapper decl1 decl2 ...)
where each decl is a declaration that potentially uses some standard form declare.
We can let
(wrapper decl1 decl2 ...)
expand to
(let-syntax ([declare our-declare])
  decl1 decl2 ...
  (post-process-reflection-information))
where our-declare is a helper macro that expands both to the standard declaration and to some form that stores the reflection information, and post-process-reflection-information is another macro that does any needed post-processing.
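In Common Lisp the same shape can be sketched with macrolet; all names here are invented for illustration:

(defvar *reflection-info* '())

(defmacro declare-thing (name &rest slots)
  ;; The "standard" declaration form being wrapped.
  `(defstruct ,name ,@slots))

(defmacro wrapper (&body decls)
  ;; Locally shadow DECLARE-THING so that each declaration also
  ;; records reflection information next to its ordinary expansion.
  `(macrolet ((declare-thing (name &rest slots)
                `(progn
                   (push (cons ',name ',slots) *reflection-info*)
                   (defstruct ,name ,@slots))))
     ,@decls))

Then (wrapper (declare-thing point x y)) both defines the struct and records (point x y) in *reflection-info* when the form is evaluated.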
I think you are trying to use macros for something they are not designed to solve. Macros are primarily a text/code substitution mechanism, and in the case of Lisp this looks a lot like a simplified term-rewriting system (see also How does term-rewriting based evaluation work?). There are different strategies possible for how to substitute a pattern of code, and in which order, but in C/C++ preprocessor macros, in LaTeX, and in Lisp, the process is typically done by computing the expansion until the form is no longer expandable, starting from the topmost terms. This order is quite natural and because it is distinct from normal evaluation rules, it can be used to implement things the normal evaluation rules cannot.
In your case, you are interested in getting access to all the declarations of some object/type, something which falls under the introspection/reflection category (as you said yourself). But implementing reflection/introspection with macros doesn't look totally doable, since macros work on abstract syntax trees and this might be a poor way to access the metadata you want.
Typically the compiler is going to parse/analyze the struct definitions and build the definitive, canonical representation of the struct, even if there are different ways to express that syntactically; it may even use prior information not available directly as source code to compute more interesting metadata (e.g. if you had inheritance, there could be a set of properties inherited from a type defined in another module (I don't think this applies to Rust)).
I think currently Rust does not offer compile-time or runtime introspection facilities, which explains why you are going with the macro route. In Common Lisp macros are definitely not used for introspection; the actual values obtained after evaluation (at different times) are used to gain information about an object. For example, defclass expands into a set of instructions that register a class in the language, but in order to get all the slots of a class, you ask the language to give them to you, e.g:
(defclass foo () (x)) ;; define class foo with slot X
(defclass bar () (y)) ;; define class bar with slot Y
(defclass zot (foo bar) ()) ;; define class zot with foo and bar as superclasses
USER> (c2mop:class-slots (find-class 'zot))
(#<SB-MOP:STANDARD-EFFECTIVE-SLOT-DEFINITION X>
 #<SB-MOP:STANDARD-EFFECTIVE-SLOT-DEFINITION Y>)
I don't know what the solution for your problem is, but in addition to the other answers, I think it is not specifically a fault of the macro system. If a macro system is defined, as is usual, as only a term-rewriting system, it will always have difficulty performing some tasks at the semantic level. But Rust is still evolving, so there might be better ways to do things in the future.

Is it possible to define a macro with the same properties as a begin form?

More specifically, what I am interested in is the one particular property of begin where definitions in its body are added to the surrounding scope. E.g.
(begin (define a 1)
       (define b 2))
(+ a b) ; 3
It would be reasonably simple to define a new macro my-begin that is translated to a standard begin, but in my particular use case I need a list of all identifiers that are being bound in the begin, such that I can introduce/use them elsewhere.
Being more specific as to what I want to achieve: I am figuring out how to build a PAR/AND operator that evaluates two branches at the same time (or at least give the impression, given the context). The branches may contain blocking operations. The PAR/AND itself returns once both branches return. Any definitions become available in the outlying scope. For example:
(PAR/AND
  (define a (do-something))       ; branch 1
  (define b (do-something-else))) ; branch 2
(+ a b)
I'm not quite certain yet how to implement it (since I have some extra stuff to worry about), but making a and b available outside the scope of par/and is definitely something that needs to happen at some point.
I would have skipped the part about define totally and just made sure the expressions are evaluated in parallel, then, when the threads are joined, used values. In Racket you have define-values, where you can bind multiple values to global names, and it can be used to name the values you have computed:
(define-values (a b)
  (parallel/values (do-something) (do-something-else)))
(+ a b)
; ==> 1337
parallel/values needs to be syntax here, but since the expressions themselves are thunks you could in fact keep it a procedure that takes thunks:
(define-values (a b)
  (parallel/values do-something do-something-else))
(+ a b)
; ==> 1337
As you see, this does not require a customized begin at all. I expect the expressions run in parallel would introduce closures that will affect define, so this circumvents that can of worms as well. define-values is the actual define primitive in Racket if you look at fully expanded programs from the macro expander with macro hiding disabled.
There are complicated ways to do this, but it looks like there's an easy way as well: can't you just have your PAR/AND macro expand into a begin?

Why the function/macro dichotomy?

Why is the function/macro dichotomy present in Common Lisp?
What are the logical problems in allowing the same name representing both a macro (taking precedence when found in function position in compile/eval) and a function (usable for example with mapcar)?
For example, having second defined both as a macro and as a function would allow one to use
(setf (second x) 42)
and
(mapcar #'second L)
without having to create any setf trickery.
Of course it's clear that macros can do more than functions and so the analogy cannot be complete (and I don't think, of course, that every macro should also be a function), but why forbid it by making both share a single namespace when it could be potentially useful?
I hope I'm not offending anyone, but I don't really find a "Why do that?" response really pertinent... I'm looking for why this is a bad idea. Imposing an arbitrary limitation because no good use is known is IMO somewhat arrogant (it sort of assumes perfect foresight).
Or are there practical problems in allowing it?
Macros and Functions are two very different things:
macros are using source (!!!) code and are generating new source (!!!) code
functions are parameterized blocks of code.
Now we can look at this from several angles, for example:
a) how do we design a language where functions and macros are clearly identifiable and look different in our source code, so we (the human) can easily see what is what?
or
b) how do we blend macros and functions in a way that the result is most useful and has the most useful rules controlling its behavior? For the user it should not make a difference whether they use a macro or a function.
We really need to convince ourselves that b) is the way to go and we would like to use a language where macro and function usage looks the same and works according to similar principles. Take ships and cars. They look different, their use cases are mostly different, they transport people - should we now make sure that the traffic rules for them are mostly identical, should we make them different, or should we design the rules for their special usage?
For functions we have problems like: defining a function, scope of functions, life-time of functions, passing functions around, returning functions, calling functions, shadowing of functions, extension of functions, removing the definition of a function, compilation and interpretation of functions, ...
If we would make macros appear mostly similar to functions, we need to address most or all above issues for them.
In your example you mention a SETF form. SETF is a macro that analyses the enclosed form at macro expansion time and generates code for a setter. It has little to do with SECOND being a macro or not. Having SECOND being a macro would not help at all in this situation.
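For comparison, Common Lisp already lets you make an ordinary function assignable without any macro, via a setf function (my-second is invented for illustration):

(defun my-second (list)
  (second list))

(defun (setf my-second) (new-value list)
  ;; Invoked by (setf (my-second ...) ...); must return NEW-VALUE.
  (setf (second list) new-value))

(let ((l (list 1 2 3)))
  (setf (my-second l) 42)
  l)                       ; => (1 42 3)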
So, what is a problem example?
(defmacro foo (a b)
  (if (and (numberp b) (zerop b))
      a
      `(- ,a ,b)))

(defun bar (x list)
  (mapcar #'foo (list x x x x) '(1 2 3 4)))
Now what should that do? Intuitively it looks easy: map FOO over the lists. But it isn't. When Common Lisp was designed, I would guess, it was not clear what that should do and how it should work. If FOO is a function, then it was clear: Common Lisp took the ideas behind Scheme's lexically scoped first-class functions and integrated them into the language.
But first-class macros? After the design of Common Lisp a bunch of research went into this problem and investigated it. But at the time of Common Lisp's design, there was no wide-spread use of first-class macros and no experience with design approaches. Common Lisp standardized on what was known at the time and what the language users thought necessary to develop software with (the object-system CLOS is kind of novel, based on earlier experience with similar object-systems). Common Lisp was not designed to be the theoretically most pleasing Lisp dialect - it was designed to be a powerful Lisp which allows the efficient implementation of software.
We could work around this and say, passing macros is not possible. The developer would have to provide a function under the same name, which we pass around.
But then (funcall #'foo 1 2) and (foo 1 2) would invoke different machinery? In the first case the function foo, and in the second case the macro foo generating code for us? Really? Do we (as human programmers) want this? I think not - it looks like it makes programming much more complicated.
From a pragmatic point of view: Macros and the mechanism behind it are already complicated enough that most programmers have difficulties dealing with it in real code. They make debugging and code understanding much harder for a human. On the surface a macro makes code easier to read, but the price is the need to understand the code expansion process and result.
Finding a way to further integrate macros into the language design is not an easy task.
readscheme.org has some pointers to Macro-related research wrt. Scheme: Macros
What about Common Lisp
Common Lisp provides functions which can be first-class (stored, passed around, ...) and lexically scoped naming for them (DEFUN, FLET, LABELS, FUNCTION, LAMBDA).
Common Lisp provides global macros (DEFMACRO) and local macros (MACROLET).
Common Lisp provides global compiler macros (DEFINE-COMPILER-MACRO).
With compiler macros it is possible to have a function or macro for a symbol AND a compiler macro. The Lisp system can decide to prefer the compiler macro over the macro or function. It can also ignore them entirely. This mechanism is mostly used for the user to program specific optimizations. Thus it does not solve any macro related problems, but provides a pragmatic way to program global optimizations.
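For illustration, here is a minimal example of that mechanism (my-add is invented):

(defun my-add (&rest args)
  (reduce #'+ args :initial-value 0))

(define-compiler-macro my-add (&whole form &rest args)
  ;; If every argument is a literal number, fold the sum at compile
  ;; time; otherwise return FORM unchanged, declining to expand.
  (if (every #'numberp args)
      (apply #'+ args)
      form))

(my-add 1 2 3)   ; a compiler is now free to emit the constant 6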
I think that Common Lisp's two namespaces (functions and values), rather than three (macros, functions, and values), is a historical contingency.
Early Lisps (in the 1960s) represented functions and values in different ways: values as bindings on the runtime stack, and functions as properties attached to symbols in the symbol table. This difference in implementation led to the specification of two namespaces when Common Lisp was standardized in the 1980s. See Richard Gabriel's paper Technical Issues of Separation in Function Cells and Value Cells for an explanation of this decision.
Macros (and their ancestors, FEXPRs, functions which do not evaluate their arguments) were stored in many Lisp implementations in the symbol table, in the same way as functions. It would have been inconvenient for these implementations if a third namespace (for macros) had been specified, and would have caused backwards-compatibility problems for many programs.
See Kent Pitman's paper Special Forms in Lisp for more about the history of FEXPRs, macros and other special forms.
(Note: Kent Pitman's website is not working for me, so I've linked to the papers via archive.org.)
Because then the exact same name would represent two different objects, depending on the context. It makes the program unnecessarily difficult to understand.
My TXR Lisp dialect allows a symbol to be simultaneously a macro and function. Moreover, certain special operators are also backed by functions.
I put a bit of thought into the design, and haven't run into any problems. It works very well and is conceptually clean.
Common Lisp is the way it is for historic reasons.
Here is a brief rundown of the system:
When a global macro is defined for symbol X with defmacro, the symbol X does not become fboundp. Rather, what becomes fboundp is the compound function name (macro X).
The name (macro X) is then known to symbol-function, trace and in other situations. (symbol-function '(macro X)) retrieves the two-argument expander function which takes the form and an environment.
It's possible to write a macro using (defun (macro X) (form env) ...).
There are no compiler macros; regular macros do the job of compiler macros.
A regular macro can return the unexpanded form to indicate that it's declining to expand. If a lexical macrolet declines to expand, the opportunity goes to a more lexically outer macrolet, and so on up to the global defmacro. If the global defmacro declines to expand, the form is considered expanded, and thus is necessarily either a function call or special form.
If we have both a function and macro called X, we can call the function definition using (call (fun X) ...) or (call 'X ...), or else using the Lisp-1-style dwim evaluator (dwim X ...) that is almost always used through its [] syntactic sugar as [X ...].
For a sort of completeness, the functions mboundp, mmakunbound and symbol-macro are provided, which are macro analogs of fboundp, fmakunbound and symbol-function.
The special operators or, and, if and some others have function definitions also, so that code like [mapcar or '(nil 2 t) '(1 0 3)] -> (1 2 t) is possible.
Example: apply constant folding to sqrt:
1> (sqrt 4.0)
2.0
2> (defmacro sqrt (x :env e :form f)
     (if (constantp x e)
         (sqrt x)
         f))
** warning: (expr-2:1) defmacro: defining sqrt, which is also a built-in defun
sqrt
3> (sqrt 4.0)
2.0
4> (macroexpand '(sqrt 4.0))
2.0
5> (macroexpand '(sqrt x))
(sqrt x)
However, no, (set (second x) 42) is not implemented via a macro definition for second. That would not work very well. The main reason is that it would be too much of a burden. The programmer may want to have, for a given function, a macro definition which has nothing to do with implementing assignment semantics!
Moreover, if (second x) implements place semantics, what happens when it is not embedded in an assignment operation, such that the semantics is not required at all? Basically, to hit all the requirements would require concocting a scheme for writing macros whose complexity would equal or exceed that of existing logic for handling places.
TXR Lisp does, in fact, feature a special kind of macro called a "place macro". A form is only recognized as a place macro invocation when it is used as a place. However, place macros do not implement place semantics themselves; they just do a straightforward rewrite. Place macros must expand down to a form that is recognized as a place.
Example: specify that (foo x), when used as a place, behaves as (car x):
1> (define-place-macro foo (x) ^(car ,x))
foo
2> (macroexpand '(foo a)) ;; not a macro!
(foo a)
3> (macroexpand '(set (foo a) 42)) ;; just a place macro
(sys:rplaca a 42)
If foo expanded to something which is not a place, things would fail:
4> (define-place-macro foo (x) ^(bar ,x))
foo
5> (macroexpand '(foo a))
(foo a)
6> (macroexpand '(set (foo a) 42))
** (bar a) is not an assignable place

Examples of what Lisp's macros can be used for

I've heard that Lisp's macro system is very powerful. However, I find it difficult to find some practical examples of what they can be used for; things that would be difficult to achieve without them.
Can anyone give some examples?
Source code transformations. All kinds. Examples:
New control flow statements: You need a WHILE statement? Your language doesn't have one? Why wait for the benevolent dictator to maybe add one next year. Write it yourself. In five minutes. (A sketch follows at the end of this answer.)
Shorter code: You need twenty class declarations that look almost identical - only a limited number of places differ. Write a macro form that takes the differences as parameters and generates the source code for you. Want to change it later? Change the macro in one place.
Replacements in the source tree: You want to add code into the source tree? A variable really should be a function call? Wrap a macro around the code that 'walks' the source and changes the places where it finds the variable.
Postfix syntax: You want to write your code in postfix form? Use a macro that rewrites the code to the normal form (prefix in Lisp).
Compile-time effects: You need to run some code in the compiler environment to inform the development environment about definitions? Macros can generate code that runs at compile time.
Code simplifications/optimizations at compile-time: You want to simplify some code at compile time? Use a macro that does the simplification - that way you can shift work from runtime to compile time, based on the source forms.
Code generation from descriptions/configurations: You need to write a complex mix of classes. For example your window has a class, subpanes have classes, there are space constraints between panes, you have a command loop, a menu and a whole bunch of other things. Write a macro that captures the description of your window and its components and creates the classes and the commands that drive the application - from the description.
Syntax improvements: Some language syntax looks not very convenient? Write a macro that makes it more convenient for you, the application writer.
Domain specific languages: You need a language that is nearer to the domain of your application? Create the necessary language forms with a bunch of macros.
Meta-linguistic abstraction
The basic idea: everything that is on the linguistic level (new forms, new syntax, form transformations, simplification, IDE support, ...) can now be programmed by the developer piece by piece - no separate macro processing stage.
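For instance, the WHILE statement from the first item above might look like this (a minimal sketch):

(defmacro while (test &body body)
  ;; Repeat BODY while TEST is true; RETURN leaves the LOOP.
  `(loop (unless ,test (return))
         ,@body))

(let ((i 0))
  (while (< i 3)
    (print i)          ; prints 0, 1, 2
    (incf i)))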
Pick any "code generation tool". Read their examples. That's what it can do.
Except you don't need to use a different programming language, put any macro-expansion code where the macro is used, run a separate command to build, or have extra text files sitting on your hard disk that are only of value to your compiler.
For example, I believe reading the Cog example should be enough to make any Lisp programmer cry.
Anything you'd normally want to have done in a pre-processor?
One macro I wrote is for defining state machines for driving game objects. It's easier to read this code (using the macro):
(def-ai ray-ai
  (ground
   (let* ((o (object))
          (r (range o)))
     (loop for p in *players*
           if (line-of-sight-p o p r)
           do (progn
                (setf (target o) p)
                (transit seek)))))
  (seek
   (let* ((o (object))
          (target (target o))
          (r (range o))
          (losp (line-of-sight-p o target r)))
     (when losp
       (let ((dir (find-direction o target)))
         (setf (movement o) (object-speed o dir))))
     (unless losp
       (transit ground)))))
than it is to read the generated code:
(progn
  (defclass ray-ai (ai) nil (:default-initargs :current 'ground))
  (defmethod gen-act ((ai ray-ai) (state (eql 'ground)))
    (macrolet ((transit (state)
                 (list 'setf (list 'current 'ai) (list 'quote state))))
      (flet ((object ()
               (object ai)))
        (let* ((o (object)) (r (range o)))
          (loop for p in *players*
                if (line-of-sight-p o p r)
                do (progn (setf (target o) p) (transit seek)))))))
  (defmethod gen-act ((ai ray-ai) (state (eql 'seek)))
    (macrolet ((transit (state)
                 (list 'setf (list 'current 'ai) (list 'quote state))))
      (flet ((object ()
               (object ai)))
        (let* ((o (object))
               (target (target o))
               (r (range o))
               (losp (line-of-sight-p o target r)))
          (when losp
            (let ((dir (find-direction o target)))
              (setf (movement o) (object-speed o dir))))
          (unless losp (transit ground)))))))
By encapsulating the whole state-machine generation in a macro, I can also ensure that I only refer to defined states and warn if that is not the case.
With macros you can define your own syntax, thus you extend Lisp and make it suited for the programs you write.
Check out the, very good, online book Practical Common Lisp, for practical examples.
7. Macros: Standard Control Constructs
8. Macros: Defining Your Own
Besides extending the language's syntax to allow you to express yourself more clearly, it also gives you control over evaluation. Try writing your own if in your language of choice so that you can actually write my_if something my_then print "success" my_else print "failure" and not have both print statements get evaluated. In any strict language without a sufficiently powerful macro system, this is impossible. No Common Lisp programmers would find the task too challenging, though. Ditto for for-loops, foreach loops, etc. You can't express these things in C because they require special evaluation semantics (people actually tried to introduce foreach into Objective-C, but it didn't work well), but they are almost trivial in Common Lisp because of its macros.
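For instance, a Common Lisp version of that my_if exercise is only a few lines (a sketch; cond is used so the expansion doesn't just defer to if):

(defmacro my-if (test then &optional else)
  ;; Only the chosen branch is evaluated, because the macro receives
  ;; the branches as unevaluated code.
  `(cond (,test ,then)
         (t ,else)))

(my-if (= 1 1)
       (print "success")
       (print "failure"))   ; only "success" is printed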
R, the standard statistics programming language, has macros (R manual, chapter 6). You can use this to implement the function lm(), which analyzes data based on a model that you specify as code.
Here's how it works: lm(Y ~ aX + b, data) will try to find a and b parameters that best fit your data. The cool part is, you can substitute any linear equation for aX + b and it will still work. It's a brilliant feature to make statistics computation easier, and it only works so elegantly because lm() can analyze the equation it's given, which is exactly what Lisp macros do.
Just a guess -- Domain Specific Languages.
Macros are essential in providing access to language features. For instance, in TXR Lisp, I have a single function called sys:capture-cont for capturing a delimited continuation. But this is awkward to use by itself. So there are macros wrapped around it, such as suspend, or obtain and yield which provide alternative models for resumable, suspended execution. They are implemented here.
Another example is the complex macro defstruct which provides syntax for defining a structure type. It compiles its arguments into lambda-s and other material which is passed to the function make-struct-type. If programs used make-struct-type directly for defining OOP structures, they would be ugly:
1> (macroexpand '(defstruct foo bar x y (z 9) (:init (self) (setf self.x 42))))
(sys:make-struct-type 'foo 'bar '()
  '(x y z) ()
  (lambda (#:g0101)
    (let ((#:g0102 (struct-type #:g0101)))
      (unless (static-slot-p #:g0102 'z)
        (slotset #:g0101 'z 9)))
    (let ((self #:g0101))
      (setf (qref self x) 42)))
  ())
Yikes! There is a lot going on that has to be right. For instance, we don't just stick a 9 into slot z because (due to inheritance) we could actually be the base structure of a derived structure, and in the derived structure, z could be a static slot (shared by instances). We would be clobbering the value set up for z in the derived class.
In ANSI Common Lisp, a nice example of a macro is loop, which provides an entire sub-language for parallel iteration. A single loop invocation can express an entire complicated algorithm.
Macros let us think independently about the syntax we would like in a language feature, and the underlying functions or special operators required to implement it. Whatever choices we make in these two, macros will bridge them for us. I don't have to worry that make-struct-type is ugly to use, so I can focus on the technical aspects; I know that the macro can look the same regardless of how I make various trade-offs. I made the design decision that all struct initialization is going to be done by some functions registered to the type. Okay, that means that my macro has to take all the initializations in the slot-defining syntax, and compile the anonymous functions, where the slot initialization is done by code generated in the bodies.
Macros are compilers for bits of syntax, for which functions and special operators are the target language.
Sometimes people (non-Lisp people, usually) criticize macros in this way: macros don't add any capabilities, only syntactic sugar.
Firstly, syntactic sugar is a capability.
Secondly, you also have to consider macros from a "total hacker perspective": combining macros with implementation-level work. If I'm adding features to a Lisp dialect, such as structures or continuations, I am actually extending the power. The involvement of macros in that enterprise is essential. Even though macros aren't the source of the new power (it doesn't emanate from the macros themselves), they help tame and harness it, giving it expression.
If you don't have sys:capture-cont, you can't just hack up its behavior with a suspend macro. But if you don't have macros, then you have to do something awfully inconvenient to provide access to a new feature that isn't a library function, namely hard-coding some new phrase structure rules into a parser.

Can I use Common Lisp for SICP or is Scheme the only option?

Also, even if I can use Common Lisp, should I? Is Scheme better?
You have several answers here, but none is really comprehensive (and I'm not talking about having enough details or being long enough). First of all, the bottom line: you should not use Common Lisp if you want to have a good experience with SICP.
If you don't know much Common Lisp, then just take it as that. (Obviously you can disregard this advice as anything else, some people only learn the hard way.)
If you already know Common Lisp, then you might pull it off, but at considerable effort, and at considerable damage to your overall learning experience. There are some fundamental issues that separate Common Lisp and Scheme, which make trying to use the former with SICP a pretty bad idea. In fact, if you have the knowledge level to make it work, then you're likely above the level of SICP anyway. I'm not saying that it's not possible -- it is of course possible to implement the whole book in Common Lisp (for example, see Bendersky's pages) just as you can do so in C or Perl or whatever. It's just going to be harder with languages that are further apart from Scheme. (For example, ML is likely to be easier to use than Common Lisp, even when its syntax is very different.)
Here are some of these major issues, in increasing order of importance. (I'm not saying that this list is exhaustive in any way, I'm sure that there are a whole bunch of additional issues that I'm omitting here.)
NIL and related issues, and different names.
Dynamic scope.
Tail call optimization.
Separate namespace for functions and values.
I'll expand now on each of these points:
The first point is the most technical. In Common Lisp, NIL is used both as the empty list and as the false value. In itself, this is not a big issue, and in fact the first edition of SICP had a similar assumption -- where the empty list and false were the same value. However, Common Lisp's NIL is still different: it is also a symbol. So, in Scheme you have a clear separation: something is either a list, or one of the primitive types of values -- but in Common Lisp, NIL is not only false and the empty list: it is also a symbol. In addition to this, you get a host of slightly different behaviors -- for example, in Common Lisp the head and the tail (the car and cdr) of the empty list are themselves the empty list, while in Scheme you'll get a runtime error if you try that. To top it off, you have different names and naming conventions, for example -- predicates in Common Lisp end by convention with P (eg, listp) while predicates in Scheme end in a question mark (eg, list?); mutators in Common Lisp have no specific convention (some have an N prefix), while in Scheme they almost always have a suffix of !. Also, plain assignment in Common Lisp is usually setf and it can operate on combinations too (eg, (setf (car foo) 1)), while in Scheme it is set! and limited to setting bound variables only. (Note that Common Lisp has the limited version too, it's called setq. Almost nobody uses it though.)
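A few lines of Common Lisp showing the NIL behavior described above (in Scheme, (car '()) is a runtime error and '() is not a symbol):

(car '())           ; => NIL, no error in Common Lisp
(cdr '())           ; => NIL
(eq '() 'nil)       ; => T, the empty list is the symbol NIL
(if '() 'yes 'no)   ; => NO, and it is false as well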
The second point is a much deeper one, and possibly one that will lead to completely incomprehensible behavior of your code. The thing is that in Common Lisp, function arguments are lexically scoped, but variables that are declared with defvar are dynamically scoped. There is a whole range of solutions that rely on lexically scoped bindings -- and in Common Lisp they just won't work. Of course, the fact that Common Lisp has lexical scope means that you can get around this by being very careful about new bindings, and possibly using macros to get around the default dynamic scope -- but again, this requires a much more extensive knowledge than a typical newbie has. Things get even worse than that: if you declare a specific name with a defvar, then that name will be bound dynamically even where it appears as a function argument. This can lead to some extremely difficult-to-track bugs which manifest themselves in an extremely confusing way (you basically get the wrong value, and you'll have no clue why that happens). Experienced Common Lispers know about it (especially those that have been burnt by it), and will always follow the convention of using stars around dynamically scoped names (eg, *foo*). (And by the way, in Common Lisp jargon, these dynamically scoped variables are called just "special variables" -- which is another source of confusion for newbies.)
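A sketch of the kind of bug described above (names invented for illustration):

(defvar x 10)   ; X is now globally special

(defun add-x (numbers)
  (mapcar (lambda (n) (+ n x)) numbers))

(defun demo (x) ; because of the DEFVAR, this X is bound dynamically
  (add-x '(1 2 3)))

(demo 100)      ; => (101 102 103), not the (11 12 13) you might expect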
The third point was also discussed in some of the previous comments. In fact, Rainer had a pretty good summary of the different options that you have, but he didn't explain just how hard it can make things. The thing is that proper tail-call-optimization (TCO) is one of the fundamental concepts in Scheme. It is important enough that it is a language feature rather than merely an optimization. A typical loop in Scheme is expressed as a tail-calling function (for example, (define (loop) (loop))) and proper Scheme implementations are required to implement TCO, which guarantees that this is, in fact, an infinite loop rather than running for a short while until you blow up the stack space. This is all the essence of Rainer's first non-solution, and the reason he labeled it as "BAD".
His third option -- rewriting functional loops (expressed as recursive functions) as Common Lisp loops (dotimes, dolist, and the infamous loop) can work for a few simple cases, but at a very high cost: the fact that Scheme is a language that does proper TCO is not only fundamental to the language -- it is also one of the major themes in the book, so by doing so, you will have lost that point completely. In addition, there are some cases where you just cannot translate Scheme code into a Common Lisp loop construct -- for example, as you work your way through the book, you'll get to implement a meta-circular-interpreter which is an implementation of a mini-Scheme language. It takes a certain click to realize that this meta evaluator implements a language that is itself doing TCO if the language that you implement this evaluator in is itself doing TCO. (Note that I'm talking about the "simple" interpreters -- later in the book you implement this evaluator as something close to a register machine, where you kind of explicitly make it do TCO.) The bottom line to all of this is that this evaluator -- when implemented in Common Lisp -- will result in a language that is itself not doing TCO. People who are familiar with all of this should not be surprised: after all, the "circularity" of the evaluator means that you're implementing a language with semantics that are very close to the host language -- so in this case you "inherit" the Common Lisp semantics rather than the Scheme TCO semantics. However, this means that your mini-evaluator is now crippled: it has no TCO, so it has no way of doing loops! To get loops in, you will need to implement new constructs in your interpreter, which will usually use the iteration constructs in Common Lisp. But now you're going further away from what's in the book, and you're investing considerable effort in approximately implementing the ideas in SICP in a different language. Note also that all of this is related to the previous point I raised: if you follow the book, then the language that you implement will be lexically scoped, taking it further away from the Common Lisp host language. So overall, you completely lose the "circular" property in what the book calls "meta circular evaluator". (Again, this is something that might not bother you, but it will damage the overall learning experience.) All in all, very few languages get close to Scheme in being able to implement the semantics of the language inside the language as a non-trivial (eg, not using eval) evaluator that easily.
In fact, if you do go with a Common Lisp, then in my opinion, Rainer's second suggestion -- use a Common Lisp implementation that supports TCO -- is the best way to go. However, in Common Lisp this is fundamentally a compiler optimization: so you will likely need to (a) know about the knobs in the implementation that you need to turn to make TCO happen, (b) make sure that the Common Lisp implementation is actually doing proper TCO, and not just optimization of self calls (which is the much simpler case that is not nearly as important), (c) hope that the Common Lisp implementation that does TCO can do so without damaging debugging options (again, since this is considered an optimization in Common Lisp, turning this knob on might also be taken by the compiler as saying "I don't care much for debuggability").
Finally, my last point is not too hard to overcome, but it is conceptually the most important one. In Scheme, you have a uniform rule: identifiers have a value, which is determined lexically -- and that's it. It's a very simple language. In Common Lisp, in addition to the historical baggage of sometimes using dynamic scope and sometimes using lexical scope, you have symbols that have two different values -- there's the function value that is used whenever a variable appears at the head of an expression, and there is a different value that is used otherwise. For example, in (foo foo), each of the two instances of foo is interpreted differently -- the first is the function value of foo and the second is its variable value. Again, this is not hard to overcome -- there are a number of constructs that you need to know about to deal with all of this. For example, instead of writing (lambda (x) (x x)) you need to write (lambda (x) (funcall x x)), which makes the function that is being called appear in a variable position, therefore the same value will be used there; another example is (map car something) which you will need to translate to (map #'car something) (or more accurately, you will need to use mapcar which is Common Lisp's equivalent of the map function); yet another thing that you'll need to know is that let binds the value slot of the name, and labels binds the function slot (and has a very different syntax, just like defun and defvar).
But the conceptual result of all of this is that Common Lispers tend to use higher-order code much less than Schemers, and that goes all the way from the idioms that are common in each language, to what implementations will do with it. (For example, many Common Lisp compilers will never optimize this call: (funcall foo bar), while Scheme compilers will optimize (foo bar) like any function call expression, because there is no other way to call functions.)
Finally, I'll note that much of the above is very good flamewar material: throw any of these issues into a public Lisp or Scheme forum (in particular comp.lang.lisp and comp.lang.scheme), and you'll most likely see a long thread where people explain why their choice is far better than the other, or why some "so called feature" is actually an idiotic decision that was made by language designers that were clearly very drunk at the time, etc etc. But the thing is that these are just differences between the two languages, and eventually people can get their job done in either one. It just happens that if the job is "doing SICP" then Scheme will be much easier considering how it hits each of these issues from the Scheme perspective. If you want to learn Common Lisp, then going with a Common Lisp textbook will leave you much less frustrated.
Using SICP with Common Lisp is possible and fun
You can use Common Lisp for learning with SICP without many problems. The Scheme subset that is used in the book is not very sophisticated. SICP does not use macros and it uses no continuations. There are DELAY and FORCE, which can be written in Common Lisp in a few lines.
Also, for a beginner, using (function foo) and (funcall foo 1 2 3) is actually better (IMHO!), because the code gets clearer when learning the functional programming parts. You can see where variables and lambda functions are being called/passed.
Tail call optimization in Common Lisp
There is only one big area where using Common Lisp has a drawback: tail call optimization (TCO). Common Lisp does not support TCO in its standard (because of unclear interaction with the rest of the language, not all computer architectures support it directly (think JVM), not all compilers support it (some Lisp Machine), it makes some debugging/tracing/stepping harder, ...).
There are three ways to live with that:
Hope that the stack does not blow out. BAD.
Use a Common Lisp implementation that supports TCO. There are some. See below.
Rewrite the functional loops (and similar constructs) into loops (and similar constructs) using DOTIMES, DO, LOOP, ...
Personally I would recommend 2 or 3.
Common Lisp has excellent and easy to use compilers with TCO support (SBCL, LispWorks, Allegro CL, Clozure CL, ...) and as a development environment use either the built-in ones or GNU Emacs/SLIME.
For use with SICP I would recommend SBCL, since it always compiles by default, has TCO support by default, and the compiler catches a lot of coding problems (undeclared variables, wrong argument lists, a bunch of type errors, ...). This helps a lot during learning. Generally make sure the code is compiled, since Common Lisp interpreters will usually not support TCO.
Sometimes it might also be helpful to write one or two macros and provide some Scheme function names to make code look a bit more like Scheme. For example you could have a DEFINE macro in Common Lisp.
For the more advanced users, there is an old Scheme implementation written in Common Lisp (called Pseudo Scheme), that should run most of the code in SICP.
My recommendation: if you want to go the extra mile and use Common Lisp, do it.
To make it easier to understand the necessary changes, I've added a few examples - remember, it needs a Common Lisp compiler with support for tail call optimization:
Example
Let's look at this simple code from SICP:
(define (factorial n)
  (fact-iter 1 1 n))
(define (fact-iter product counter max-count)
  (if (> counter max-count)
      product
      (fact-iter (* counter product)
                 (+ counter 1)
                 max-count)))
We can use it directly in Common Lisp with a DEFINE macro:
(defmacro define ((name &rest args) &body body)
  `(defun ,name ,args ,@body))
Now you should use SBCL, CCL, Allegro CL or LispWorks. These compilers support TCO by default.
Let's use SBCL:
* (define (factorial n)
    (fact-iter 1 1 n))
; in: DEFINE (FACTORIAL N)
;     (FACT-ITER 1 1 N)
;
; caught STYLE-WARNING:
;   undefined function: FACT-ITER
;
; compilation unit finished
;   Undefined function:
;     FACT-ITER
;   caught 1 STYLE-WARNING condition
FACTORIAL
* (define (fact-iter product counter max-count)
    (if (> counter max-count)
        product
        (fact-iter (* counter product)
                   (+ counter 1)
                   max-count)))
FACT-ITER
* (factorial 1000)
40238726007709....
Another Example: symbolic differentiation
SICP has a Scheme example for differentiation:
(define (deriv exp var)
  (cond ((number? exp) 0)
        ((variable? exp)
         (if (same-variable? exp var) 1 0))
        ((sum? exp)
         (make-sum (deriv (addend exp) var)
                   (deriv (augend exp) var)))
        ((product? exp)
         (make-sum
          (make-product (multiplier exp)
                        (deriv (multiplicand exp) var))
          (make-product (deriv (multiplier exp) var)
                        (multiplicand exp))))
        (else
         (error "unknown expression type -- DERIV" exp))))
Making this code run in Common Lisp is easy:
some functions have different names, number? is numberp in CL
CL:COND uses T instead of else
CL:ERROR uses CL format strings
Let's define Scheme names for some functions. Common Lisp code:
(loop for (scheme-symbol fn) in
      '((number? numberp)
        (symbol? symbolp)
        (pair? consp)
        (eq? eq)
        (display-line print))
      do (setf (symbol-function scheme-symbol)
               (symbol-function fn)))
Our define macro from above:
(defmacro define ((name &rest args) &body body)
  `(defun ,name ,args ,@body))
The Common Lisp code:
(define (variable? x) (symbol? x))
(define (same-variable? v1 v2)
  (and (variable? v1) (variable? v2) (eq? v1 v2)))
(define (make-sum a1 a2) (list '+ a1 a2))
(define (make-product m1 m2) (list '* m1 m2))
(define (sum? x)
  (and (pair? x) (eq? (car x) '+)))
(define (addend s) (cadr s))
(define (augend s) (caddr s))
(define (product? x)
  (and (pair? x) (eq? (car x) '*)))
(define (multiplier p) (cadr p))
(define (multiplicand p) (caddr p))
(define (deriv exp var)
  (cond ((number? exp) 0)
        ((variable? exp)
         (if (same-variable? exp var) 1 0))
        ((sum? exp)
         (make-sum (deriv (addend exp) var)
                   (deriv (augend exp) var)))
        ((product? exp)
         (make-sum
          (make-product (multiplier exp)
                        (deriv (multiplicand exp) var))
          (make-product (deriv (multiplier exp) var)
                        (multiplicand exp))))
        (t
         (error "unknown expression type -- DERIV: ~a" exp))))
Let's try it in LispWorks:
CL-USER 19 > (deriv '(* (* x y) (+ x 3)) 'x)
(+ (* (* X Y) (+ 1 0)) (* (+ (* X 0) (* 1 Y)) (+ X 3)))
Streams example from SICP in Common Lisp
See the book code in chapter 3.5 in SICP. We use the additions to CL from above.
SICP mentions delay, the-empty-stream and cons-stream, but does not implement them. We provide here an implementation in Common Lisp:
(defmacro delay (expression)
  `(lambda () ,expression))
(defmacro cons-stream (a b)
  `(cons ,a (delay ,b)))
(define (force delayed-object)
  (funcall delayed-object))
(defparameter the-empty-stream (make-symbol "THE-EMPTY-STREAM"))
Now comes portable code from the book:
(define (stream-null? stream)
  (eq? stream the-empty-stream))
(define (stream-car stream) (car stream))
(define (stream-cdr stream) (force (cdr stream)))
(define (stream-enumerate-interval low high)
  (if (> low high)
      the-empty-stream
      (cons-stream
       low
       (stream-enumerate-interval (+ low 1) high))))
Now Common Lisp differs in stream-for-each:
we need to use cl:progn instead of begin
function parameters need to be called with cl:funcall
Here is a version:
(defmacro begin (&body body) `(progn ,@body))
(define (stream-for-each proc s)
  (if (stream-null? s)
      'done
      (begin (funcall proc (stream-car s))
             (stream-for-each proc (stream-cdr s)))))
We also need to pass functions using cl:function:
(define (display-stream s)
  (stream-for-each (function display-line) s))
But then the example works:
CL-USER 20 > (stream-enumerate-interval 10 20)
(10 . #<Closure 1 subfunction of STREAM-ENUMERATE-INTERVAL 40600010FC>)
CL-USER 21 > (display-stream (stream-enumerate-interval 10 1000))
10
11
12
...
997
998
999
1000
DONE
Do you already know some Common Lisp? I assume that is what you mean by 'Lisp'. In that case you might want to use it instead of Scheme. If you don't know either, and you are working through SICP solely for the learning experience, then probably you are better off with Scheme. It has much better support for new learners, and you won't have to translate from Scheme to Common Lisp.
There are differences; specifically, SICP's highly functional style is wordier in Common Lisp because you have to quote functions when passing them around and use funcall to call a function bound to a variable.
However, if you want to use Common Lisp, you can try using Eli Bendersky's Common Lisp translations of the SICP code under the tag SICP.
They are similar but not the same.
I believe if you go with Scheme it will be easier.
Edit: Nathan Sanders' comment is correct. It's clearly been a while since I last read the book, but I just checked and it does not use call/cc directly. I've upvoted Nathan's answer.
Whatever you use needs to implement continuations, which SICP uses a lot. Not even all Scheme interpreters implement them, and I'm not aware of any Common Lisp that does.