What do common lisp function/special form/macro/etc. names mean and where can I find this information? - lisp

When I was learning HTML it was very helpful for me to know that ol means ordered list, tr is table row, etc. Some of the lisp primitives/forms are easy: funcall should be function call, defmacro - define macro. Some are in the middle - incf is... increment... f??? But because common lisp is so old, this primitives/special forms/etc... don't seem to ring a bell. Can you guys, help me with figuring them out? And even more importantly: Where can I find an authoritative resource on learning the meaning/history behind each and every one of them? (I will accept an answer based on this second question)
The documentation doesn't help me too:
* (describe #'let)
#<CLOSURE (:SPECIAL LET) {10013DC6AB}>
[compiled closure]
Lambda-list: (&REST ARGS)
Derived type: (FUNCTION (&REST T) NIL)
Documentation:
T
Source file: SYS:SRC;COMPILER;INFO-FUNCTIONS.LISP
* (documentation 'let 'function)
"LET ({(var [value]) | var}*) declaration* form*
During evaluation of the FORMS, bind the VARS to the result of evaluating the
VALUE forms. The variables are bound in parallel after all of the VALUES forms
have been evaluated."
* (inspect 'let)
The object is a SYMBOL.
0. Name: "LET"
1. Package: #<PACKAGE "COMMON-LISP">
2. Value: "unbound"
3. Function: #<CLOSURE (:SPECIAL LET) {10013DC6AB}>
4. Plist: (SB-WALKER::WALKER-TEMPLATE SB-WALKER::WALK-LET)
What do the following lisp primitives/special forms/special operators/functions mean?
let, flet
progn
car
cdr
acc
setq, setf
incf
(write more in the comments so we can make a good list!)

let: LET variable-name be bound to a certain value
flet: LET Function-name be bound to a certain function
progn: execute a PROGram sequence and return the Nth value (the last value)
car: Contents of the Address Register (historic)
cdr: Contents of the Decrement Register (historic)
acc: ACCumulator
setq: SET Quote, a variant of the set function, where in setq the user doesn't quote the variable
setf: SET Function quoted, shorter name of the original name setfq. Here the function/place is not evaluated.
incf: INCrement Function quoted, similar to setf. Increments a place.
Other conventions:
Macros / Special Forms who change a place should have an f at the end: setf, psetf, incf, decf, ...
Macros who are DEFining something should have def or define in front: defun, defmethod, defclass, define-method-combination...
Functions who are destructive should have an n in front, for Non-consing: nreverse, ...
Predicates have a p or -p at the end: adjustable-array-p, alpha-char-p,...
special variables have * at the front and back: *standard-output*, ...
There are other naming conventions.

let: Well, that's a normal word, and is used like in maths ("let x = 3 in ...").
flet: I'd guess "function let", because that's what it does.
progn: This is probably related also to prog1 and prog2. I read is as "program whose nth form dictates the result value". The program has n forms, so it's the last one that forms the result value of the progn form.
car and cdr: "Contents address register" resp. "Contents decrement register". This is related to the IBM 704 which Lisp was originally implemented for.
setq: "set quote", originally an abbreviation for (set (quote *abc*) value).
setf: "set field", came up when lexical variables appeared. This is a good read on the set functions.
Where can I find an authoritative resource on learning the meaning/history behind each and every one of them?
The HyperSpec is a good place to start, and ultimately the ANSI standard. Though "Common Lisp The Language" could also shine some light on the history of some names.
(Oh, and you got defmacro wrong, that's "Definition for Mac, Read Only." ;) )

Related

Parameterizable return-from in Common Lisp

I'm learning blocks in Common lisp and did this example to see how blocks and the return-from command work:
(block b1
(print 1)
(print 2)
(print 3)
(block b2
(print 4)
(print 5)
(return-from b1)
(print 6)
)
(print 7))
It will print 1, 2, 3, 4, and 5, as expected. Changing the return-from to (return-from b2) it'll print 1, 2, 3, 4, 5, and 7, as one would expect.
Then I tried turn this into a function and paremetrize the label on the return-from:
(defun test-block (arg) (block b1
(print 1)
(print 2)
(print 3)
(block b2
(print 4)
(print 5)
(return-from (eval arg))
(print 6)
)
(print 7)))
and using (test-block 'b1) to see if it works, but it doesn't. Is there a way to do this without conditionals?
Using a conditional like CASE to select a block to return from
The recommended way to do it is using case or similar. Common Lisp does not support computed returns from blocks. It also does not support computed gos.
Using a case conditional expression:
(defun test-block (arg)
(block b1
(print 1)
(print 2)
(print 3)
(block b2
(print 4)
(print 5)
(case arg
(b1 (return-from b1))
(b2 (return-from b2)))
(print 6))
(print 7)))
One can't compute lexical go tags, return blocks or local functions from names
CLTL2 says about the restriction for the go construct:
Compatibility note: The ``computed go'' feature of MacLisp is not supported. The syntax of a computed go is idiosyncratic, and the feature is not supported by Lisp Machine Lisp, NIL (New Implementation of Lisp), or Interlisp. The computed go has been infrequently used in MacLisp anyway and is easily simulated with no loss of efficiency by using a case statement each of whose clauses performs a (non-computed) go.
Since features like go and return-from are lexically scoped constructs, computing the targets is not supported. Common Lisp has no way to access lexical environments at runtime and query those. This is for example also not supported for local functions. One can't take a name and ask for a function object with that name in some lexical environment.
Dynamic alternative: CATCH and THROW
The typically less efficient and dynamically scoped alternative is catch and throw. There the tags are computed.
I think these sorts of things boils down to the different types of namespaces bindings and environments in Common Lisp.
One first point is that a slightly more experienced novice learning Lisp might try to modify your attempted function to say (eval (list 'return-from ,arg)) instead. This seems to make more sense but still does not work.
Namespaces
A common beginner mistake in a language like scheme is having a variable called list as this shadows the top level definition of this as a function and stops the programmer from being able to make lists inside the scope for this binding. The corresponding mistake in Common Lisp is trying to use a symbol as a function when it is only bound as a variable.
In Common Lisp there are namespaces which are mappings from names to things. Some namespaces are:
The functions. To get the corresponding thing either call it: (foo a b c ...), or get the function for a static symbol (function foo) (aka #'foo) or for a dynamic symbol (fdefinition 'foo). Function names are either symbols or lists of setf and one symbol (e.g. (serf bar)). Symbols may alternatively be bound to macros in this namespace in which case function and fdefinition signal errors.
The variables. This maps symbols to the values in the corresponding variable. This also maps symbols to constants. Get the value of a variable by writing it down, foo or dynamically as (symbol-value). A symbol may also be bound as a symbol-macro in which case special macro expansion rules apply.
Go tags. This maps symbols to labels to which one can go (like goto in other languages).
Blocks. This maps symbols to places you can return from.
Catch tags. This maps objects to the places which catch them. When you throw to an object, the implementation effectively looks up the corresponding catch in this namespace and unwinds the stack to it.
classes (and structs, conditions). Every class has a name which is a symbol (so different packages may have a point class)
packages. Each package is named by a string and possibly some nicknames. This string is normally the name of a symbol and therefore usually in uppercase
types. Every type has a name which is a symbol. Naturally a class definition also defines a type.
declarations. Introduced with declare, declaim, proclaim
there might be more. These are all the ones I can think of.
The catch-tag and declarations namespaces aren’t like the others as they don’t really map symbols to things but they do have bindings and environments in the ways described below (note that I have used declarations to refer to the things that have been declared, like the optimisation policy or which variables are special, rather than the namespace in which e.g. optimize, special, and indeed declaration live which seems too small to include).
Now let’s talk about the different ways that this mapping may happen.
The binding of a name to a thing in a namespace is the way in which they are associated, in particular, how it may come to be and how it may be inspected.
The environment of a binding is the place where the binding lives. It says how long the binding lives for and where it may be accessed from. Environments are searched for to find the thing associated with some name in some namespace.
static and dynamic bindings
We say a binding is static if the name that is bound is fixed in the source code and a binding is dynamic if the name can be determined at run time. For example let, block and tags in a tagbody all introduce static bindings whereas catch and progv introduce dynamic bindings.
Note that my definition for dynamic binding is different from the one in the spec. The spec definition corresponds to my dynamic environment below.
Top level environment
This is the environment where names are searched for last and it is where toplevel definitions go to, for example defvar, defun, defclass operate at this level. This is where names are looked up last after all other applicable environments have been searched, e.g. if a function or variable binding can not be found at a closer level then this level is searched. References can sometimes be made to bindings at this level before they are defined, although they may signal warnings. That is, you may define a function bar which calls foo before you have defined foo. In other cases references are not allowed, for example you can’t try to intern or read a symbol foo::bar before the package FOO has been defined. Many namespaces only allow bindings in the top level environment. These are
constants (within the variables namespace)
classes
packages
types
Although (excepting proclaim) all bindings are static, they can effectively be made dynamic by calling eval which evaluates forms at the top level.
Functions (and [compiler] macros) and special variables (and symbol macros) may also be defined top level. Declarations can be defined toplevel either statically with the macro declaim or dynamically with the function proclaim.
Dynamic environment
A dynamic environment exists for a region of time during the programs execution. In particular, a dynamic environment begins when control flow enters some (specific type of) form and ends when control flow leaves it, either by returning normally or by some nonlocal transfer of control like a return-from or go. To look up a dynamically bound name in a namespace, the currently active dynamic environments are searched (effectively, ie a real system wouldn’t be implemented this way) from most recent to oldest for that name and the first binding wins.
Special variables and catch tags are bound in dynamic environments. Catch tags are bound dynamically using catch while special variables are bound statically using let and dynamically using progv. As we shall discuss later, let can make two different kinds of binding and it knows to treat a symbol as special if it has been defined with defvar or ‘defparameteror if it has been declared asspecial`.
Lexical environment
A lexical environment corresponds to a region of source code as it is written and a specific runtime instantiation of it. It (slightly loosely) begins at an opening parenthesis and ends at the corresponding closing parenthesis, and is instantiated when control flow hits the opening parenthesis. This description is a little complicated so let’s have an example with variables which are bound in a lexically environment (unless they are special. By convention the names special variables are wrapped in * symbols)
(defun foo ()
(let ((x 10))
(bar (lambda () x))))
(defun bar (f)
(let ((x 20))
(funcall f)))
Now what happens when we call (foo)? Well if x were bound in a dynamic environment (in foo and bar) then the anonymous function would be called in bar and the first dynamic environment with a binding for x would have it bound to 20.
But this call returns 10 because x is bound in a lexical environment so even though the anonymous function gets passed to bar, it remembers the lexical environment corresponding to the application of foo which created it and in that lexical environment, x is bound to 10. Let’s now have another example to show what I mean by ‘specific runtime instantiation’ above.
(defun baz (islast)
(let ((x (if islast 10 20)))
(let ((lx (lambda () x)))
(if islast
lx
(frob lx (baz t))))))
(defun frob (a b)
(list (funcall a) (funcall b)))
Now running (baz nil) will give us (20 10) because the first function passed to frob remembers the lexical environment for the outer call to baz (where islast is nil) whilst the second remembers the environment for the inner call.
For variables which are not special, let creates static lexical bindings. Block names (introduced statically by block), go tags (scopes inside a tagbody), functions (by felt or labels), macros (macrolet), and symbol macros (symbol-macrolet) are all bound statically in lexical environments. Bindings from a lambda list are also lexically bound. Declarations can be created lexically using (declare ...) in one of the allowed places or by using (locally (declare ...) ...) anywhere.
We note that all lexical bindings are static. The eval trick described above does not work because eval happens in the toplevel environment but references to lexical names happen in the lexical environment. This allows the compiler to optimise references to them to know exactly where they are without running code having to carry around a list of bindings or accessing global state (e.g. lexical variables can live in registers and the stack). It also allows the compiler to work out which bindings can escape or be captured in closures or not and optimise accordingly. The one exception is that the (symbol-)macro bindings can be dynamically inspected in a sense as all macros may take an &environment parameter which should be passed to macroexpand (and other expansion related functions) to allow the macroexpander to search the compile-time lexical environment for the macro definitions.
Another thing to note is that without lambda-expressions, lexical and dynamic environments would behave the same way. But note that if there were only a top level environment then recursion would not work as bindings would not be restored as control flow leaves their scope.
Closure
What happens to a lexical binding captured by an anonymous function when that function escapes the scope it was created in? Well there are two things that can happen
Trying to access the binding results in an error
The anonymous function keeps the lexical environment alive for as long as the functions referencing it are alive and they can read and write it as they please.
The second case is called a closure and happens for functions and variables. The first case happens for control flow related bindings because you can’t return from a form that has already returned. Neither happens for macro bindings as they cannot be accessed at run time.
Nonlocal control flow
In a language like Java, control (that is, program execution) flows from one statement to the next, branching for if and switch statements, looping for others with special statements like break and return for certain kinds of jumping. For functions control flow goes into the function until it eventually comes out again when the function returns. The one nonlocal way to transfer control is by using throw and try/catch where if you execute a throw then the stack is unwound piece by piece until a suitable catch is found.
In C there are is no throw or try/catch but there is goto. The structure of C programs is secretly flat with the nesting just specifying that “blocks” end in the opposite order to the order they start. What I mean by this is that it is perfectly legal to have a while loop in the middle of a switch with cases inside the loop and it is legal to goto the middle of a loop from outside of that loop. There is a way to do nonlocal control transfer in C: you use setjmp to save the current control state somewhere (with the return value indicating whether you have successfully saved the state or just nonlocally returned there) and longjmp to return control flow to a previously saved state. No real cleanup or freeing of memory happens as the stack unwinds and there needn’t be checks that you still have the function which called setjmp on the callstack so the whole thing can be quite dangerous.
In Common Lisp there’s a range of ways to do nonlocal control transfer but the rules are more strict. Lisp doesn’t really have statements but rather everything is built out of a tree of expressions and so the first rule is that you can’t nonlocally transfer control into a deeper expression, you may only transfer out. Let’s look at how these different methods of control transfer work.
block and return-from
You’ve already seen how these work inside a single function but recall that I said block names are lexically scoped. So how does this interact with anonymous functions?
Well suppose you want to search some big nested data structure for something. If you were writing this function in Java or C then you might implement a special search function to recurse through your data structure until it finds the right thing and then return it all the way up. If you were implementing it in Haskell then you would probably want to do it as some kind of fold and rely on lazy evaluation to not do too much work. In Common Lisp you might have a function which applies some other function passed as a parameter to each item in the data structure. And now you can call that with a searching function. How might you get the result out? Well just return-from to the outer block.
tagbody and go
A tagbody is like a progn but instead of evaluating single symbols in the body, they are called tags and any expression within the tagbody can go to them to transfer control to it. This is partly like goto, if you’re still in the same function but if your go expression happens inside some anonymous function then it’s like a safe lexically scoped longjmp.
catch and throw
These are most similar to the Java model. The key difference between block and catch is that block uses lexical scoping and catch uses dynamic scoping. Therefore their relationship is like that between special and regular variables.
Finally
In Java one can execute code to tidy things up if the stack has to unwind through it as an exception is thrown. This is done with try/finally. The Common Lisp equivalent is called unwind-protect which ensures a form is executed however control flow may leave it.
Errors
It’s perhaps worth looking a little at how errors work in Common Lisp. Which of these methods do they use?
Well it turns out that the answer is that errors instead of generally unwinding the stack start by calling functions. First they look up all the possible restarts (ways to deal with an error) and save them somewhere. Next they look up all applicable handlers (a list of handlers could, for example, be stored in a special variable as handlers have dynamic scope) and try each one at a time. A handler is just a function so it might return (ie not want to handle the error) or it might not return. A handler might not return if it invokes a restart. But restarts are just normal functions so why might these not return? Well restarts are created in a dynamic environment below the one where the error was raised and so they can transfer control straight out of the handler and the code that threw the error to some code to try to do something and then carry on. Restarts can transfer control using go or return-from. It is worth noting that it is important here that we have lexical scope. A recursive function could define a restart on each successive call and so it is necessary to have lexical scope for variables and tags/block names so that we can make sure we transfer control to the right level on the call stack with the right state.

Evaluating arguments of Clojure tagged literals

I have been trying to use Clojure tagged literals, and noticed that the reader does not evaluate the arguments, very much like a macro. This makes sense, but what is the appropriate solution for doing this? Explicit eval?
Example: given this function
(defn my-data
([[arg]]
(prn (symbol? arg))
:ok))
and this definition data_readers.clj
{myns/my-data my.ns/my-data}
The following behave differently:
> (let [x 1] (my.ns/my-data [x]))
false
:ok
So the x passed in is evaluated before being passed into my-data. On the other hand:
> (let [x 1] #myns/my-data [x])
true
:ok
So if I want to use the value of x inside my-data, the my-data function needs to do something about it, namely check that x is a symbol, and if so, use (eval x). That seems ugly. Is there a better approach?
Summary
There is no way to get at the value of the local x in your example, primarily because locals only get assigned values at runtime, whereas tagged literals are handled at read time. (There's also compile time in between; it is impossible to get at locals' values at compile time; therefore macros cannot get at locals' values either.)1
The better approach is to use a regular function at runtime, since after all you want to construct a value based on the runtime values of some parameters. Tagged literals really are literals and should be used as such.
Extended discussion
To illustrate the issue described above:
(binding [*data-readers* {'bar (fn [_] (java.util.Date.))}]
(eval (read-string "(defn foo [] #bar x)")))
foo will always return the same value, because the reader has only one opportunity to return a value for #bar x which is then baked into foo's bytecode.
Notice also that instead of passing it to eval directly, we could store the data structure returned by the call to read-string and compile it at an arbitrary point in the future; the value returned by foo would remain the same. Clearly there's no way for such a literal value to depend on the future values of any locals; in fact, during the reader's operation, it is not even clear which symbols will come to name locals -- that's for the compiler to determine and the result of that determination may be non-obvious in cases where macros are involved.
Of course the reader is free to return a form which looks like a function call, the name of a local etc. To show one example:
(binding [*data-readers* {'bar (fn [sym] (list sym 1 2 3 4 5))}]
(eval (read-string "#bar *")))
;= 120
;; substituting + for * in the string yields a value of 15
Here #bar f becomes equivalent to (f 1 2 3 4 5). Needless to say, this is an abuse of notation and doesn't really do what you asked for.
1 It's worth pointing out that eval has no access to locals (it always operates in the global scope), but the issue with locals not having values assigned before runtime is more fundamental.

How do keywords work in Common Lisp?

The question is not about using keywords, but actually about keyword implementation. For example, when I create some function with keyword parameters and make a call:
(defun fun (&key key-param) (print key-param)) => FUN
(find-symbol "KEY-PARAM" 'keyword) => NIL, NIL ;;keyword is not still registered
(fun :key-param 1) => 1
(find-symbol "KEY-PARAM" 'keyword) => :KEY-PARAM, :EXTERNAL
How a keyword is used to pass an argument? Keywords are symbols whos values are themselves, so how a corresponding parameter can be bound using a keyword?
Another question about keywords — keywords are used to define packages. We can define a package named with already existing keyword:
(defpackage :KEY-PARAM) => #<The KEY-PARAMETER package, 0/16 ...
(in-package :KEY-PARAM) => #<The KEY-PARAMETER package, 0/16 ...
(defun fun (&key key-param) (print key-param)) => FUN
(fun :KEY-PARAM 1) => 1
How does system distinguish the usage of :KEY-PARAM between package name and function parameter name?
Also we can make something more complicated, if we define function KEY-PARAM and export it (actually not function, but name):
(in-package :KEY-PARAM)
(defun KEY-PARAM (&key KEY-PARAM) KEY-PARAM) => KEY-PARAM
(defpackage :KEY-PARAM (:export :KEY-PARAM))
;;exporting function KEY-PARAM, :KEY-PARAM keyword is used for it
(in-package :CL-USER) => #<The COMMON-LISP-USER package, ...
(KEY-PARAM:KEY-PARAM :KEY-PARAM 1) => 1
;;calling a function KEY-PARAM from :KEY-PARAM package with :KEY-PARAM parameter...
The question is the same, how Common Lisp does distinguish the usage of keyword :KEY-PARAM here?
If there is some manual about keywords in Common Lisp with explanation of their mechanics, I would be grateful if you posted a link here, because I could find just some short articles only about usage of the keywords.
See Common Lisp Hyperspec for full details of keyword parameters. Notice that
(defun fun (&key key-param) ...)
is actually short for:
(defun fun (&key ((:key-param key-param)) ) ...)
The full syntax of a keyword parameter is:
((keyword-name var) default-value supplied-p-var)
default-value and supplied-p-var are optional. While it's conventional to use a keyword symbol as the keyword-name, it's not required; if you just specify a var instead of (keyword-name var), it defaults keyword-name to being a symbol in the keyword package with the same name as var.
So, for example, you could do:
(defun fun2 (&key ((myoption var))) (print var))
and then call it as:
(fun 'myoption 3)
The way it works internally is when the function is being called, it steps through the argument list, collecting pairs of arguments <label, value>. For each label, it looks in the parameter list for a parameter with that keyword-name, and binds the corresponding var to value.
The reason we normally use keywords is because the : prefix stands out. And these variables have been made self-evaluating so that we don't have to quote them as well, i.e. you can write :key-param instead of ':key-param (FYI, this latter notation was necessary in earlier Lisp systems, but the CL designers decided it was ugly and redundant). And we don't normally use the ability to specify a keyword with a different name from the variable because it would be confusing. It was done this way for full generality. Also, allowing regular symbols in place of keywords is useful for facilities like CLOS, where argument lists get merged and you want to avoid conflicts -- if you're extending a generic function you can add parameters whose keywords are in your own package and there won't be collisions.
The use of keyword arguments when defining packages and exporting variables is again just a convention. DEFPACKAGE, IN-PACKAGE, EXPORT, etc. only care about the names they're given, not what package it's in. You could write
(defpackage key-param)
and it would usually work just as well. The reason many programmers don't do this is because it interns a symbol in their own package, and this can sometimes cause package conflicts if this happens to have the same name as a symbol they're trying to import from another package. Using keywords divorces these parameters from the application's package, avoiding potential problems like this.
The bottom line is: when you're using a symbol and you only care about its name, not its identity, it's often safest to use a keyword.
Finally, about distinguishing keywords when they're used in different ways. A keyword is just a symbol. If it's used in a place where the function or macro just expects an ordinary parameter, the value of that parameter will be the symbol. If you're calling a function that has &key arguments, that's the only time they're used as labels to associate arguments with parameters.
The good manual is Chapter 21 of PCL.
Answering your questions briefly:
keywords are exported symbols in keyword package, so you can refer to them not only as :a, but also as keyword:a
keywords in function argument lists (called lambda-lists) are, probably, implemented in the following way. In presence of &key modifier the lambda form is expanded into something similar to this:
(let ((key-param (getf args :key-param)))
body)
when you use a keyword to name a package it is actually used as a string-designator. This is a Lisp concept that allows to pass to a certain function dealing with strings, that are going to be used as symbols later (for different names: of packages, classes, functions, etc.) not only strings, but also keywords and symbols. So, the basic way to define/use package is actually this:
(defpackage "KEY-PARAM" ...)
But you can as well use:
(defpackage :key-param ...)
and
(defpackage #:key-param ...)
(here #: is a reader macro to create uninterned symbols; and this way is the preferred one, because you don't create unneeded keywords in the process).
The latter two forms will be converted to upper-case strings. So a keyword stays a keyword, and a package gets its named as string, converted from that keyword.
To sum up, keywords have the value of themselves, as well as any other symbols. The difference is that keywords don't require explicit qualification with keyword package or its explicit usage. And as other symbols they can serve as names for objects. Like, for example, you can name a function with a keyword and it will be "magically" accessible in every package :) See #Xach's blogpost for details.
There is no need for "the system" to distinguish different uses of keywords. They are just used as names. For example, imagine two plists:
(defparameter *language-scores* '(:basic 0 :common-lisp 5 :python 3))
(defparameter *price* '(:basic 100 :fancy 500))
A function yielding the score of a language:
(defun language-score (language &optional (language-scores *language-scores*))
(getf language-scores language))
The keywords, when used with language-score, designate different programming languages:
CL-USER> (language-score :common-lisp)
5
Now, what does the system do to distinguish the keywords in *language-scores* from those in *price*? Absolutely nothing. The keywords are just names, designating different things in different data structures. They are no more distinguished than homophones are in natural language – their use determines what they mean in a given context.
In the above example, nothing prevents us from using the function with a wrong context:
(language-score :basic *prices*)
100
The language did nothing to prevent us from doing this, and the keywords for the not-so-fancy programming language and the not-so-fancy product are just the same.
There are many possibilities to prevent this: Not allowing the optional argument for language-score in the first place, putting *prices* in another package without externing it, closing over a lexical binding instead of using the globally special *language-scores* while only exposing means to add and retrieve entries. Maybe just our understanding of the code base or convention are enough to prevent us from doing that. The point is: The system's distinguishing the keywords themselves is not necessary to achieve what we want to implement.
The specific keyword uses you ask about are not different: The implementation might store the bindings of keyword-arguments in an alist, a plist, a hash-table or whatever. In the case of package names, the keywords are just used as package designators and the package name as a string (in uppercase) might just be used instead. Whether the implementation converts strings to keywords, keywords to strings, or something entirely different internally doesn't really matter. What matters is just the name, and in which context it is used.

Strange Lisp Quoting scenario - Graham's On Lisp, page 37

I'm working my way through Graham's book "On Lisp" and can't understand the following example at page 37:
If we define exclaim so that its return value
incorporates a quoted list,
(defun exclaim (expression)
(append expression ’(oh my)))
> (exclaim ’(lions and tigers and bears))
(LIONS AND TIGERS AND BEARS OH MY)
> (nconc * ’(goodness))
(LIONS AND TIGERS AND BEARS OH MY GOODNESS)
could alter the list within the function:
> (exclaim ’(fixnums and bignums and floats))
(FIXNUMS AND BIGNUMS AND FLOATS OH MY GOODNESS)
To make exclaim proof against such problems, it should be written:
(defun exclaim (expression)
(append expression (list ’oh ’my)))
Does anyone understand what's going on here? This is seriously screwing with my mental model of what quoting does.
nconc is a destructive operation that alters its first argument by changing its tail. In this case, it means that the constant list '(oh my) gets a new tail.
To hopefully make this clearer. It's a bit like this:
; Hidden variable inside exclaim
oh_my = oh → my → nil
(exclaim '(lions and tigers and bears)) =
lions → and → tigers → and → bears → oh_my
(nconc * '(goodness)) destructively appends goodness to the last result:
lions → and → tigers → and → bears → oh → my → goodness → nil
so now, oh_my = oh → my → goodness → nil
Replacing '(oh my) with (list 'oh 'my) fixes this because there is no longer a constant being shared by all and sundry. Each call to exclaim generates a new list (the list function's purpose in life is to create brand new lists).
The observation that your mental model of quoting may be flawed is an excellent one—although it may or may not apply depending on what that mental model is.
First, remember that there are various stages to program execution. A Lisp environment must first read the program text into data structures (lists, symbols, and various literal data such as strings and numbers). Next, it may or may not compile those data structures into machine code or some sort of intermediary format. Finally, the resulting code is evaluated (in the case of machine code, of course, this may simply mean jumping to the appropriate address).
Let's put the issue of compilation aside for now and focus on the reading and evaluation stages, assuming (for simplicity) that the evaluator's input is the list of data structures read by the reader.
Consider a form (QUOTE x) where x is some textual representation of an object. This may be symbol literal as in (QUOTE ABC), a list literal as in (QUOTE (A B C)), a string literal as in (QUOTE "abc"), or any other kind of literal. In the reading stage, the reader will read the form as a list (call it form1) whose first element is the symbol QUOTE and whose second element is the object x' whose textual representation is x. Note that I'm specifically saying that the object x' is stored within the list that represents the expression, i.e. in a sense, it's stored as a part of the code itself.
Now it's the evaluator's turn. The evaluator's input is form1, which is a list. So it looks at the first element of form1, and, having determined that it is the symbol QUOTE, it returns as the result of the evaluation the second element of the list. This is the crucial point. The evaluator returns the second element of the list to be evaluated, which is what the reader read in in the first execution stage (prior to compilation!). That's all it does. There's no magic to it, it's very simple, and significantly, no new objects are created and no existing ones are copied.
Therefore, whenever you modify a “quoted list”, you're modifying the code itself. Self-modifying code is a very confusing thing, and in this case, the behaviour is actually undefined (because ANSI Common Lisp permits implementations to put code in read-only memory).
Of course, the above is merely a mental model. Implementations are free to implement the model in various ways, and in fact, I know of no implementation of Common Lisp that, like my explanation, does no compilation at all. Still, this is the basic idea.
In Common Lisp.
Remember:
'(1 2 3 4)
Above is a literal list. Constant data.
(list 1 2 3 4)
LIST is a function that when call returns a fresh new list with its arguments as list elements.
Avoid modifying literal lists. The effects are not standardized. Imagine a Lisp that compiles all constant data into a read only memory area. Imagine a Lisp that takes constant lists and shares them across functions.
(defun a () '(1 2 3)
(defun b () '(1 2 3))
A Lisp compiler may create one list that is shared by both functions.
If you modify the list returned by function a
it might not be changed
it might be changed
it might be an error
it might also change the list returned by function b
Implementations have the freedom to do what they like. This leaves room for optimizations.

What makes Lisp macros so special?

Reading Paul Graham's essays on programming languages one would think that Lisp macros are the only way to go. As a busy developer, working on other platforms, I have not had the privilege of using Lisp macros. As someone who wants to understand the buzz, please explain what makes this feature so powerful.
Please also relate this to something I would understand from the worlds of Python, Java, C# or C development.
To give the short answer, macros are used for defining language syntax extensions to Common Lisp or Domain Specific Languages (DSLs). These languages are embedded right into the existing Lisp code. Now, the DSLs can have syntax similar to Lisp (like Peter Norvig's Prolog Interpreter for Common Lisp) or completely different (e.g. Infix Notation Math for Clojure).
Here is a more concrete example:Python has list comprehensions built into the language. This gives a simple syntax for a common case. The line
divisibleByTwo = [x for x in range(10) if x % 2 == 0]
yields a list containing all even numbers between 0 and 9. Back in the Python 1.5 days there was no such syntax; you'd use something more like this:
divisibleByTwo = []
for x in range( 10 ):
if x % 2 == 0:
divisibleByTwo.append( x )
These are both functionally equivalent. Let's invoke our suspension of disbelief and pretend Lisp has a very limited loop macro that just does iteration and no easy way to do the equivalent of list comprehensions.
In Lisp you could write the following. I should note this contrived example is picked to be identical to the Python code not a good example of Lisp code.
;; the following two functions just make equivalent of Python's range function
;; you can safely ignore them unless you are running this code
(defun range-helper (x)
(if (= x 0)
(list x)
(cons x (range-helper (- x 1)))))
(defun range (x)
(reverse (range-helper (- x 1))))
;; equivalent to the python example:
;; define a variable
(defvar divisibleByTwo nil)
;; loop from 0 upto and including 9
(loop for x in (range 10)
;; test for divisibility by two
if (= (mod x 2) 0)
;; append to the list
do (setq divisibleByTwo (append divisibleByTwo (list x))))
Before I go further, I should better explain what a macro is. It is a transformation performed on code by code. That is, a piece of code, read by the interpreter (or compiler), which takes in code as an argument, manipulates and the returns the result, which is then run in-place.
Of course that's a lot of typing and programmers are lazy. So we could define DSL for doing list comprehensions. In fact, we're using one macro already (the loop macro).
Lisp defines a couple of special syntax forms. The quote (') indicates the next token is a literal. The quasiquote or backtick (`) indicates the next token is a literal with escapes. Escapes are indicated by the comma operator. The literal '(1 2 3) is the equivalent of Python's [1, 2, 3]. You can assign it to another variable or use it in place. You can think of `(1 2 ,x) as the equivalent of Python's [1, 2, x] where x is a variable previously defined. This list notation is part of the magic that goes into macros. The second part is the Lisp reader which intelligently substitutes macros for code but that is best illustrated below:
So we can define a macro called lcomp (short for list comprehension). Its syntax will be exactly like the python that we used in the example [x for x in range(10) if x % 2 == 0] - (lcomp x for x in (range 10) if (= (% x 2) 0))
(defmacro lcomp (expression for var in list conditional conditional-test)
;; create a unique variable name for the result
(let ((result (gensym)))
;; the arguments are really code so we can substitute them
;; store nil in the unique variable name generated above
`(let ((,result nil))
;; var is a variable name
;; list is the list literal we are suppose to iterate over
(loop for ,var in ,list
;; conditional is if or unless
;; conditional-test is (= (mod x 2) 0) in our examples
,conditional ,conditional-test
;; and this is the action from the earlier lisp example
;; result = result + [x] in python
do (setq ,result (append ,result (list ,expression))))
;; return the result
,result)))
Now we can execute at the command line:
CL-USER> (lcomp x for x in (range 10) if (= (mod x 2) 0))
(0 2 4 6 8)
Pretty neat, huh? Now it doesn't stop there. You have a mechanism, or a paintbrush, if you like. You can have any syntax you could possibly want. Like Python or C#'s with syntax. Or .NET's LINQ syntax. In end, this is what attracts people to Lisp - ultimate flexibility.
You will find a comprehensive debate around lisp macro here.
An interesting subset of that article:
In most programming languages, syntax is complex. Macros have to take apart program syntax, analyze it, and reassemble it. They do not have access to the program's parser, so they have to depend on heuristics and best-guesses. Sometimes their cut-rate analysis is wrong, and then they break.
But Lisp is different. Lisp macros do have access to the parser, and it is a really simple parser. A Lisp macro is not handed a string, but a preparsed piece of source code in the form of a list, because the source of a Lisp program is not a string; it is a list. And Lisp programs are really good at taking apart lists and putting them back together. They do this reliably, every day.
Here is an extended example. Lisp has a macro, called "setf", that performs assignment. The simplest form of setf is
(setf x whatever)
which sets the value of the symbol "x" to the value of the expression "whatever".
Lisp also has lists; you can use the "car" and "cdr" functions to get the first element of a list or the rest of the list, respectively.
Now what if you want to replace the first element of a list with a new value? There is a standard function for doing that, and incredibly, its name is even worse than "car". It is "rplaca". But you do not have to remember "rplaca", because you can write
(setf (car somelist) whatever)
to set the car of somelist.
What is really happening here is that "setf" is a macro. At compile time, it examines its arguments, and it sees that the first one has the form (car SOMETHING). It says to itself "Oh, the programmer is trying to set the car of somthing. The function to use for that is 'rplaca'." And it quietly rewrites the code in place to:
(rplaca somelist whatever)
Common Lisp macros essentially extend the "syntactic primitives" of your code.
For example, in C, the switch/case construct only works with integral types and if you want to use it for floats or strings, you are left with nested if statements and explicit comparisons. There's also no way you can write a C macro to do the job for you.
But, since a lisp macro is (essentially) a lisp program that takes snippets of code as input and returns code to replace the "invocation" of the macro, you can extend your "primitives" repertoire as far as you want, usually ending up with a more readable program.
To do the same in C, you would have to write a custom pre-processor that eats your initial (not-quite-C) source and spits out something that a C compiler can understand. It's not a wrong way to go about it, but it's not necessarily the easiest.
Lisp macros allow you to decide when (if at all) any part or expression will be evaluated. To put a simple example, think of C's:
expr1 && expr2 && expr3 ...
What this says is: Evaluate expr1, and, should it be true, evaluate expr2, etc.
Now try to make this && into a function... thats right, you can't. Calling something like:
and(expr1, expr2, expr3)
Will evaluate all three exprs before yielding an answer regardless of whether expr1 was false!
With lisp macros you can code something like:
(defmacro && (expr1 &rest exprs)
`(if ,expr1 ;` Warning: I have not tested
(&& ,#exprs) ; this and might be wrong!
nil))
now you have an &&, which you can call just like a function and it won't evaluate any forms you pass to it unless they are all true.
To see how this is useful, contrast:
(&& (very-cheap-operation)
(very-expensive-operation)
(operation-with-serious-side-effects))
and:
and(very_cheap_operation(),
very_expensive_operation(),
operation_with_serious_side_effects());
Other things you can do with macros are creating new keywords and/or mini-languages (check out the (loop ...) macro for an example), integrating other languages into lisp, for example, you could write a macro that lets you say something like:
(setvar *rows* (sql select count(*)
from some-table
where column1 = "Yes"
and column2 like "some%string%")
And thats not even getting into Reader macros.
Hope this helps.
I don't think I've ever seen Lisp macros explained better than by this fellow: http://www.defmacro.org/ramblings/lisp.html
A lisp macro takes a program fragment as input. This program fragment is represented a data structure which can be manipulated and transformed any way you like. In the end the macro outputs another program fragment, and this fragment is what is executed at runtime.
C# does not have a macro facility, however an equivalent would be if the compiler parsed the code into a CodeDOM-tree, and passed that to a method, which transformed this into another CodeDOM, which is then compiled into IL.
This could be used to implement "sugar" syntax like the for each-statement using-clause, linq select-expressions and so on, as macros that transforms into the underlying code.
If Java had macros, you could implement Linq syntax in Java, without needing Sun to change the base language.
Here is pseudo-code for how a lisp-style macro in C# for implementing using could look:
define macro "using":
using ($type $varname = $expression) $block
into:
$type $varname;
try {
$varname = $expression;
$block;
} finally {
$varname.Dispose();
}
Since the existing answers give good concrete examples explaining what macros achieve and how, perhaps it'd help to collect together some of the thoughts on why the macro facility is a significant gain in relation to other languages; first from these answers, then a great one from elsewhere:
... in C, you would have to write a custom pre-processor [which would probably qualify as a sufficiently complicated C program] ...
—Vatine
Talk to anyone that's mastered C++ and ask them how long they spent learning all the template fudgery they need to do template metaprogramming [which is still not as powerful].
—Matt Curtis
... in Java you have to hack your way with bytecode weaving, although some frameworks like AspectJ allows you to do this using a different approach, it's fundamentally a hack.
—Miguel Ping
DOLIST is similar to Perl's foreach or Python's for. Java added a similar kind of loop construct with the "enhanced" for loop in Java 1.5, as part of JSR-201. Notice what a difference macros make. A Lisp programmer who notices a common pattern in their code can write a macro to give themselves a source-level abstraction of that pattern. A Java programmer who notices the same pattern has to convince Sun that this particular abstraction is worth adding to the language. Then Sun has to publish a JSR and convene an industry-wide "expert group" to hash everything out. That process--according to Sun--takes an average of 18 months. After that, the compiler writers all have to go upgrade their compilers to support the new feature. And even once the Java programmer's favorite compiler supports the new version of Java, they probably ''still'' can't use the new feature until they're allowed to break source compatibility with older versions of Java. So an annoyance that Common Lisp programmers can resolve for themselves within five minutes plagues Java programmers for years.
—Peter Seibel, in "Practical Common Lisp"
Think of what you can do in C or C++ with macros and templates. They're very useful tools for managing repetitive code, but they're limited in quite severe ways.
Limited macro/template syntax restricts their use. For example, you can't write a template which expands to something other than a class or a function. Macros and templates can't easily maintain internal data.
The complex, very irregular syntax of C and C++ makes it difficult to write very general macros.
Lisp and Lisp macros solve these problems.
Lisp macros are written in Lisp. You have the full power of Lisp to write the macro.
Lisp has a very regular syntax.
Talk to anyone that's mastered C++ and ask them how long they spent learning all the template fudgery they need to do template metaprogramming. Or all the crazy tricks in (excellent) books like Modern C++ Design, which are still tough to debug and (in practice) non-portable between real-world compilers even though the language has been standardised for a decade. All of that melts away if the langauge you use for metaprogramming is the same language you use for programming!
I'm not sure I can add some insight to everyone's (excellent) posts, but...
Lisp macros work great because of the Lisp syntax nature.
Lisp is an extremely regular language (think of everything is a list); macros enables you to treat data and code as the same (no string parsing or other hacks are needed to modify lisp expressions). You combine these two features and you have a very clean way to modify code.
Edit: What I was trying to say is that Lisp is homoiconic, which means that the data structure for a lisp program is written in lisp itself.
So, you end up with a way of creating your own code generator on top of the language using the language itself with all its power (eg. in Java you have to hack your way with bytecode weaving, although some frameworks like AspectJ allows you to do this using a different approach, it's fundamentally a hack).
In practice, with macros you end up building your own mini-language on top of lisp, without the need to learn additional languages or tooling, and with using the full power of the language itself.
Lisp macros represents a pattern that occurs in almost any sizeable programming project. Eventually in a large program you have a certain section of code where you realize it would be simpler and less error prone for you to write a program that outputs source code as text which you can then just paste in.
In Python objects have two methods __repr__ and __str__. __str__ is simply the human readable representation. __repr__ returns a representation that is valid Python code, which is to say, something that can be entered into the interpreter as valid Python. This way you can create little snippets of Python that generate valid code that can be pasted into your actually source.
In Lisp this whole process has been formalized by the macro system. Sure it enables you to create extensions to the syntax and do all sorts of fancy things, but it's actual usefulness is summed up by the above. Of course it helps that the Lisp macro system allows you to manipulate these "snippets" with the full power of the entire language.
In short, macros are transformations of code. They allow to introduce many new syntax constructs. E.g., consider LINQ in C#. In lisp, there are similar language extensions that are implemented by macros (e.g., built-in loop construct, iterate). Macros significantly decrease code duplication. Macros allow embedding «little languages» (e.g., where in c#/java one would use xml to configure, in lisp the same thing can be achieved with macros). Macros may hide difficulties of using libraries usage.
E.g., in lisp you can write
(iter (for (id name) in-clsql-query "select id, name from users" on-database *users-database*)
(format t "User with ID of ~A has name ~A.~%" id name))
and this hides all the database stuff (transactions, proper connection closing, fetching data, etc.) whereas in C# this requires creating SqlConnections, SqlCommands, adding SqlParameters to SqlCommands, looping on SqlDataReaders, properly closing them.
While the above all explains what macros are and even have cool examples, I think the key difference between a macro and a normal function is that LISP evaluates all the parameters first before calling the function. With a macro it's the reverse, LISP passes the parameters unevaluated to the macro. For example, if you pass (+ 1 2) to a function, the function will receive the value 3. If you pass this to a macro, it will receive a List( + 1 2). This can be used to do all kinds of incredibly useful stuff.
Adding a new control structure, e.g. loop or the deconstruction of a list
Measure the time it takes to execute a function passed in. With a function the parameter would be evaluated before control is passed to the function. With the macro, you can splice your code between the start and stop of your stopwatch. The below has the exact same code in a macro and a function and the output is very different. Note: This is a contrived example and the implementation was chosen so that it is identical to better highlight the difference.
(defmacro working-timer (b)
(let (
(start (get-universal-time))
(result (eval b))) ;; not splicing here to keep stuff simple
((- (get-universal-time) start))))
(defun my-broken-timer (b)
(let (
(start (get-universal-time))
(result (eval b))) ;; doesn't even need eval
((- (get-universal-time) start))))
(working-timer (sleep 10)) => 10
(broken-timer (sleep 10)) => 0
One-liner answer:
Minimal syntax => Macros over Expressions => Conciseness => Abstraction => Power
Lisp macros do nothing more than writing codes programmatically. That is, after expanding the macros, you got nothing more than Lisp code without macros. So, in principle, they achieve nothing new.
However, they differ from macros in other programming languages in that they write codes on the level of expressions, whereas others' macros write codes on the level of strings. This is unique to lisp thanks to their parenthesis; or put more precisely, their minimal syntax which is possible thanks to their parentheses.
As shown in many examples in this thread, and also Paul Graham's On Lisp, lisp macros can then be a tool to make your code much more concise. When conciseness reaches a point, it offers new levels of abstractions for codes to be much cleaner. Going back to the first point again, in principle they do not offer anything new, but that's like saying since paper and pencils (almost) form a Turing machine, we do not need an actual computer.
If one knows some math, think about why functors and natural transformations are useful ideas. In principle, they do not offer anything new. However by expanding what they are into lower-level math you'll see that a combination of a few simple ideas (in terms of category theory) could take 10 pages to be written down. Which one do you prefer?
I got this from the common lisp cookbook and I think it explained why lisp macros are useful.
"A macro is an ordinary piece of Lisp code that operates on another piece of putative Lisp code, translating it into (a version closer to) executable Lisp. That may sound a bit complicated, so let's give a simple example. Suppose you want a version of setq that sets two variables to the same value. So if you write
(setq2 x y (+ z 3))
when z=8 both x and y are set to 11. (I can't think of any use for this, but it's just an example.)
It should be obvious that we can't define setq2 as a function. If x=50 and y=-5, this function would receive the values 50, -5, and 11; it would have no knowledge of what variables were supposed to be set. What we really want to say is, When you (the Lisp system) see (setq2 v1 v2 e), treat it as equivalent to (progn (setq v1 e) (setq v2 e)). Actually, this isn't quite right, but it will do for now. A macro allows us to do precisely this, by specifying a program for transforming the input pattern (setq2 v1 v2 e)" into the output pattern (progn ...)."
If you thought this was nice you can keep on reading here:
http://cl-cookbook.sourceforge.net/macros.html
In python you have decorators, you basically have a function that takes another function as input. You can do what ever you want: call the function, do something else, wrap the function call in a resource acquire release, etc. but you don't get to peek inside that function. Say we wanted to make it more powerful, say your decorator received the code of the function as a list then you could not only execute the function as is but you can now execute parts of it, reorder lines of the function etc.