Lets say i have fn begin: (begin "hello"). The result of `(begin "hello") will be (my-ns/begin "hello"). It's good. But now I do this: (def x '(begin "hello")). How can I expand x with backquote to get (my-ns/begin "hello"), not just (begin "hello")?
In the first example you used ` which is a reader macro named "syntax-quote", In the second example you used ' Which is a reader macro named "quote". Syntax-quote offers a couple of features beyond what quote offers:
unquoting with ~, ~#, etc
namespace expansion of symbols for use in writing hygenic macros.
Plain old quote does neither of these. So if you want namespace expansion in both examples use ` in both places. If you want to get an un-namespaced symbol in a syntax-quooted list you can use both of them together `(println ~'symbol-name) which will evaluate to simply (println symbol-name). (in this case symbol-name needs to be defined in the namespace where this is called (this process in known as "symbol capture")). The syntax-quote evaluates first to a call to quote, when in turn evaluates to the symbol.
If you are looking to expand a symbol that already exists for some reason, or you want to expand it in several different namespaces that both define it, you can use the ns-resolve function:
(ns-resolve *ns* 'begin)
so in your example you can map this over the list to print it with the NS qualified symbols:
user> (map #(if (symbol? %) (ns-resolve *ns* %) %) x)
(#'user/begin "hello")
Though this is not exactly the desired output because it refers to the var in that namespace rather than the symbol that resolves to this var. Things get a little dicey here because ` is a reader macro and we can't build calls to it in other macros. For instance I can think of no way to write:
(defmacro expand-x [thing]
`(syntax-quote-here ~thing)
Because as far as I know syntax-quote doesn't have a name (again because it is a reader macro) so there is nothing to use in place of syntax-quote-here in the above example. Though never fear, in a language with eval nothing is impossible, things just get arbitrarily ugly:
user> (eval (read-string (str "`" x)))
(user/begin "hello")
PS: don't actually use this last example or gremlins will inhabit your code for all time
Related
AND and OR are macros and since macros aren't first class in scheme/racket they cannot be passed as arguments to other functions. A partial solution is to use and-map or or-map. Is it possible to write a function that would take arbitrary macro and turn it into a function so that it can be passed as an argument to another function? Are there any languages that have first class macros?
In general, no. Consider that let is (or could be) implemented as a macro on top of lambda:
(let ((x 1))
(foo x))
could be a macro that expands to
((lambda (x) (foo x)) 1)
Now, what would it look like to convert let to a function? Clearly it is nonsense. What would its inputs be? Its return value?
Many macros will be like this. In fact, any macro that could be routinely turned into a function without losing any functionality is a bad macro! Such a macro should have been a function to begin with.
I agree with #amalloy. If something is written as a macro, it probably does something that functions can't do (e.g., introduce bindings, change evaluation order). So automatically converting arbitrary macro into a function is a really bad idea even if it is possible.
Is it possible to write a function that would take arbitrary macro and turn it into a function so that it can be passed as an argument to another function?
No, but it is somewhat doable to write a macro that would take some macro and turn it into a function.
#lang racket
(require (for-syntax racket/list))
(define-syntax (->proc stx)
(syntax-case stx ()
[(_ mac #:arity arity)
(with-syntax ([(args ...) (generate-temporaries (range (syntax-e #'arity)))])
#'(λ (args ...) (mac args ...)))]))
((->proc and #:arity 2) 42 12)
(apply (->proc and #:arity 2) '(#f 12))
((->proc and #:arity 2) #f (error 'not-short-circuit))
You might also be interested in identifier macro, which allows us to use an identifier as a macro in some context and function in another context. This could be used to create a first class and/or which short-circuits when it's used as a macro, but could be passed as a function value in non-transformer position.
On the topic of first class macro, take a look at https://en.wikipedia.org/wiki/Fexpr. It's known to be a bad idea.
Not in the way you probably expect
To see why, here is a way of thinking about macros: A macro is a function which takes a bit of source code and turns it into another bit of source code: the expansion of the macro. In other words a macro is a function whose domain and range are source code.
Once the source code is fully expanded, then it's fed to either an evaluator or a compiler. Let's assume it's fed to a compiler because it makes the question easier to answer: a compiler itself is simply a function whose domain is source code and whose range is some sequence of instructions for a machine (which may or may not be a real machine) to execute. Those instructions might include things like 'call this function on these arguments'.
So, what you are asking is: can the 'this function' in 'call this function on these arguments' be some kind of macro? Well, yes, it could be, but whatever source code it is going to transform certainly can not be the source code of the program you are executing, because that is gone: all that's left is the sequence of instructions that was the return value of the compiler.
So you might say: OK, let's say we disallow compilers: can we do it now? Well, leaving aside that 'disallowing compilers' is kind of a serious limitation, this was, in fact, something that very old dialects of Lisp sort-of did, using a construct called a FEXPR, as mentioned in another answer. It's important to realise that FEXPRs existed because people had not yet invented macros. Pretty soon, people did invent macros, and although FEXPRs and macros coexisted for a while – mostly because people had written code which used FEXPRs which they wanted to keep running, and because writing macros was a serious pain before things like backquote existed – FEXPRs died out. And they died out because they were semantically horrible: even by the standards of 1960s Lisps they were semantically horrible.
Here's one small example of why FEXPRs are so horrible: Let's say I write this function in a language with FEXPRs:
(define (foo f g x)
(apply f (g x)))
Now: what happens when I call foo? In particular, what happens if f might be a FEXPR?. Well, the answer is that I can't compile foo at all: I have to wait until run-time and make some on-the-fly decision about what to do.
Of course this isn't what these old Lisps with FEXPRs probably did: they would just silently have assumed that f was a normal function (which they would have called an EXPR) and compiled accordingly (and yes, even very old Lisps had compilers). If you passed something which was a FEXPR you just lost: either the thing detected that, or more likely it fall over horribly or gave you some junk answer.
And this kind of horribleness is why macros were invented: macros provide a semantically sane approach to processing Lisp code which allows (eventually, this took a long time to actually happen) minor details like compilation being possible at all, code having reasonable semantics and compiled code having the same semantics as interpreted code. These are features people like in their languages, it turns out.
Incidentally, in both Racket and Common Lisp, macros are explicitly functions. In Racket they are functions which operate on special 'syntax' objects because that's how you get hygiene, but in Common Lisp, which is much less hygienic, they're just functions which operate on CL source code, where the source code is simply made up of lists, symbols &c.
Here's an example of this in Racket:
> (define foo (syntax-rules ()
[(_ x) x]))
> foo
#<procedure:foo>
OK, foo is now just an ordinary function. But it's a function whose domain & range are Racket source code: it expects a syntax object as an argument and returns another one:
> (foo 1)
; ?: bad syntax
; in: 1
; [,bt for context]
This is because 1 is not a syntax object.
> (foo #'(x 1))
#<syntax:readline-input:5:10 1>
> (syntax-e (foo #'(x 1)))
1
And in CL this is even easier to see: Here's a macro definition:
(defmacro foo (form) form)
And now I can get hold of the macro's function and call it on some CL source code:
> (macro-function 'foo)
#<Function foo 4060000B6C>
> (funcall (macro-function 'foo) '(x 1) nil)
1
In both Racket and CL, macros are, in fact, first-class (or, in the case of Racket: almost first-class, I think): they are functions which operate on source code, which itself is first-class: you can write Racket and CL programs which construct and manipulate source code in arbitrary ways: that's what macros are in these languages.
In the case of Racket I have said 'almost first-class', because I can't see a way, in Racket, to retrieve the function which sits behind a macro defined with define-syntax &c.
I've created something like this in Scheme, it's macro that return lambda that use eval to execute the macro:
(define-macro (macron m)
(let ((x (gensym)))
`(lambda (,x)
(eval `(,',m ,#,x)))))
Example usage:
;; normal eval
(define x (map (lambda (x)
(eval `(lambda ,#x)))
'(((x) (display x)) ((y) (+ y y)))))
;; using macron macro
(define x (map (macron lambda)
'(((x) (display x)) ((y) (+ y y)))))
and x in both cases is list of two functions.
another example:
(define-macro (+++ . args)
`(+ ,#args))
((macron +++) '(1 2 3))
I am trying to write a super-tiny Object-oriented system with syntax-rules, mostly just to learn it. Anyway, I am trying to introduce a "this" variable. Here is what I would like to be able to do:
(oo-class Counter
(
(attr value 0)
(attr skip 1)
)
(
(method (next) (set! value (+ value skip)) value)
(method (nextnext) (this 'next) (this 'next))
(method (set-value newval) (set! value newval))
(method (set-skip newskip) (set! skip newskip))
)
)
(define c (Counter))
((c 'set-value) 23)
((c 'next))
((c 'nextnext))
I can get everything to work except "this". It seems like syntax-rules doesn't allow variable introduction. I thought I could get it by defining it as one of the literals in syntax-rules, but this does not seem to work.
Below is my object-oriented system:
(define-syntax oo-class
(syntax-rules (attr method this)
(
(oo-class class-name
((attr attr-name initial-val) ...)
((method (meth-name meth-arg ...) body ...) ...))
(define class-name
(lambda ()
(letrec
(
(this #f)
(attr-name initial-val)
...
(funcmap
(list
(cons (quote meth-name) (cons (lambda (meth-arg ...) body ...) '()))
...
)
)
)
(set! this (lambda (methname)
(cadr (assoc methname funcmap))
))
this
)
)
)
)
)
)
This works for everything except 'nextnext, which errors out when it tries to reference "this".
Is this the right way to do this? Is there some other way to do this? I recognize that this is slightly unhygienic, but isn't that at least part of the point of specifying literals?
I've tried this in Chicken Scheme as well as DrRacket in R5RS mode (other modes get complainy about "this").
Below is the whole file. You can run it on Chicken with just "csi object.scm"
https://gist.github.com/johnnyb/211e105882248e892fa485327039cc90
I also tried to use let-syntax and use (this) as a syntax specifier to refer to the (this) variable. But, as far as I could tell, it wasn't letting me directly access a variable of my own making within the syntax rewriting.
BONUS QUESTION: What is an easy way to see the result of a syntax-rules transformation for debugging? Is there some way to get chicken (or something else) to do the transformation and spit out the result? I tried some stuff on DrRacket, but it doesn't work in R5RS mode.
I recognize that this is slightly unhygienic, but isn't that at least part of the point of specifying literals?
No, the literals exist so you can match literally on keywords, like for example the => or the else in a cond clause. It's still hygienic because if => or else is lexically bound to some value, that has precedence:
(let ((else #f))
(cond (else (display "hi!\n")))) ;; Will not print
Now, you could write a very tedious macro that matches this at any possible place and nesting level in the expansion, but that will never be complete, and it would not nest lexically, either.
It is possible to do what you're trying to do using what has become known as Petrofsky extraction, but it's a total and utter hack and abuse of syntax-rules and it does not work (consistently) in the presence of modules across implementations (for example, exactly in CHICKEN we've had a complaint that we accidentally "broke" this feature).
What I'd suggest is writing a syntax-rules macro that accepts an identifier in its input which will be bound to the current object, then write one trivial unhygienic macro that calls that other macro with the hardcoded identifier this as input.
What is an easy way to see the result of a syntax-rules transformation for debugging? Is there some way to get chicken (or something else) to do the transformation and spit out the result? I tried some stuff on DrRacket, but it doesn't work in R5RS mode.
In csi, you can use ,x (macro-call), but it will only do one level of expansion.
A common trick that works in every Scheme implementation is to change your macro definition to quote its output. So, it expands not to (foo) but to '(foo). That way, you can just call the macro in the REPL and see its result immediately.
*Note: Despite having frequented StackOverflow for a long time, this is the first question that I have posted myself. Apologies if it's a bit verbose. Constructive criticism appreciated.
When I define a struct in Common Lisp using defstruct, a predicate function is automatically generated that tests whether its argument is of the type defined by the defstruct. Eg:
(defstruct book
title
author)
(let ((huck-finn (make-book :title "The Adventures of Huckleberry Finn" :author "Mark Twain")))
(book-p huck-finn))
=> True
However, when defining a class using defclass, such functions are seemingly not generated by default (is there a way to specify this?), so I'm trying to add this functionality myself, because I'd like a) for this syntax to be consistent between structs and classes, b) to have an abbreviation of (typep obj 'classname), which I need to write very often and is visually noisy,
and c) as a programming exercise, since I'm still relatively new to Lisp.
I could write a macro that defines a predicate function given the name of a class:
(defclass book ()
((title :initarg :title
:accessor title)
(author :initarg :author
:accessor author)))
;This...
(defmacro gen-predicate (classname)
...)
;...should expand to this...
(defun book-p (obj)
(typep obj 'book))
;...when called like this:
(gen-predicate 'book)
The name that I need to pass to defun must be of the form 'classname-p. Here's where I have difficulty. To create such a symbol, I could use the "symb" function from Paul Graham's On Lisp (p. 58). When it is run on the REPL:
(symb 'book '-p)
=> BOOK-P
My gen-predicate macro looks like this so far:
(defmacro gen-predicate (classname)
`(defun ,(symb classname '-p) (obj)
(typep obj ,classname)))
(macroexpand `(gen-predicate 'book))
=>
(PROGN
(EVAL-WHEN (:COMPILE-TOPLEVEL) (SB-C:%COMPILER-DEFUN '|'BOOK-P| 'NIL T))
(SB-IMPL::%DEFUN '|'BOOK-P|
(SB-INT:NAMED-LAMBDA |'BOOK-P|
(OBJ)
(BLOCK |'BOOK-P| (TYPEP OBJ 'BOOK)))
NIL 'NIL (SB-C:SOURCE-LOCATION)))
T
It would seem that the symbol created by (symb 'book '-p) is actually considered |'BOOK-P| by the implementation (SBCL), not BOOK-P. Sure enough, this now works:
(let ((huck-finn (make-instance 'book)))
(|'BOOK-P| huck-finn))
=> True
Why is the symbol created by symb interned as |'BOOK-P|? In On Lisp (same page as above) Graham says: "Any string can be the print-name of a symbol, even a string containing lowercase letters or macro characters like parentheses. When a symbol's name contains such oddities, it is printed within vertical bars." No such oddities exist in this case, do they? And am I correct in thinking that the "print-name" of a symbol is what is actually displayed on the standard output when the symbol is printed, and is, in the case of such oddities, distinct from the form of the symbol itself?
To be able to write function-defining macros like gen-predicate - whose defined functions are named based on the arguments passed to the macro - seems to me like something that Lisp hackers have probably been doing for ages. User Kaz says here (Merging symbols in common lisp) that the "mashing-up" of symbols can often be avoided, but that would defeat the purpose of this macro.
Finally, assuming I could get gen-predicate to work how I want, what would be the best way of ensuring that it be called for each new class as they are defined? Much in the same way as initialize-instance can be customized to perform certain actions upon instantiation of a class, is there a generic function called by defclass that can perform actions upon definition of a class?
Thank you.
That's a usual problem: what gets passed to a Macro?
Compare calls like this:
(symb 'book '-p)
and
(symb ''book '-p)
Your macro form is this:
(gen-predicate 'book)
GEN-PREDICATE is a macro. classname is a parameter for this macro.
Now what is the value of classname inside the macro during code expansion? Is it book or 'book?
Actually it is the latter, because you wrote (gen-predicate 'book). Remember: macros see source code and the argument source gets passed to the macro function - not the value. The argument is 'book. Thus this gets passed. (QUOTE BOOK) is the same, only printed differently. So it is a two element list. The first element is the symbol QUOTE and the second element is the symbol BOOK.
Thus the macro now calls the function SYMB with the argument value (QUOTE BOOK) or, shorter, 'BOOK.
If you want to generate the predicate without the quote character, you need to write:
(gen-predicate book)
Alternatively you can also change the macro:
(symb classname '-p)
would be:
(symbol (if (and (consp classname)
(eq (first classname) 'quote))
(second classname)
classname))
Compare
We write
(defun foo () 'bar)
and not
(defun 'foo () 'bar) ; note the quoted FOO
DEFUN is a macro and the first argument is the function name. It's a similar problem then...
Second part of the question
I don't really know any good answer to that. I can't remember any easy way to run code (for example to define a function) after a class definition.
Maybe use the MOP, but that's ugly.
write a custom macro DEFINE-CLASS which does what you want: expands into DEFCLASS and the DEFUN.
iterate over all symbols in a package, find the classes and define the corresponding predicates
To address the second part of the question, classes are themselves objects, thanks to the MOP, so it might be possible to write an :after method on initialize-instance specialized on STANDARD-CLASS. But you should check the MOP to see whether defining such a method is allowed or not.
If it's possible, then yes, you can run code in response to the creation of a class; however, since you don't know the name of the class being created until runtime, you cannot spell it textually in the source, so you cannot use your macro (unless you use eval). You'd rather use something like
(let ((classname (class-name class)))
(compile (generate-my-predicate-symbol classname)
(lambda (x) (typep x classname))))
I think Rainer's suggestion to write your own DEFINE-CLASS macro is the way to go, I mean, the way a seasoned Lisper most likely would do it, if there aren't any other considerations at play. But I'm not really a seasoned Lisper, so I might be wrong ;)
I've written an ad hoc parser generator that creates code to convert an old and little known 7-bit character set into unicode. The call to the parser generator expands into a bunch of defuns enclosed in a progn, which then get compiled. I only want to expose one of the generated defuns--the top-level one--to the rest of the system; all the others are internal to the parser and only get called from within the dynamic scope of the top-level one. Therefore, the other defuns generated have uninterned names (created with gensym). This strategy works fine with SBCL, but I recently tested it for the first time with CLISP, and I get errors like:
*** - FUNCALL: undefined function #:G16985
It seems that CLISP can't handle functions with uninterned names. (Interestingly enough, the system compiled without a problem.) EDIT: It seems that it can handle functions with uninterned names in most cases. See the answer by Rörd below.
My questions is: Is this a problem with CLISP, or is it a limitation of Common Lisp that certain implementations (e.g. SBCL) happen to overcome?
EDIT:
For example, the macro expansion of the top-level generated function (called parse) has an expression like this:
(PRINC (#:G75735 #:G75731 #:G75733 #:G75734) #:G75732)
Evaluating this expression (by calling parse) causes an error like the one above, even though the function is definitely defined within the very same macro expansion:
(DEFUN #:G75735 (#:G75742 #:G75743 #:G75744) (DECLARE (OPTIMIZE (DEBUG 2)))
(DECLARE (LEXER #:G75742) (CONS #:G75743 #:G75744))
(MULTIPLE-VALUE-BIND (#:G75745 #:G75746) (POP-TOKEN #:G75742)
...
The two instances of #:G75735 are definitely the same symbol--not two different symbols with the same name. As I said, this works with SBCL, but not with CLISP.
EDIT:
SO user Joshua Taylor has pointed out that this is due to a long standing CLISP bug.
You don't show one of the lines that give you the error, so I can only guess, but the only thing that could cause this problem as far as I can see is that you are referring to the name of the symbol instead of the symbol itself when trying to call it.
If you were referring to the symbol itself, all your lisp implementation would have to do is lookup that symbol's symbol-function. Whether it's interned or not couldn't possibly matter.
May I ask why you haven't considered another way to hide the functions, i.e. a labels statement or defining the functions within a new package that exports only the one external function?
EDIT: The following example is copied literally from an interaction with the CLISP prompt.
As you can see, calling the function named by a gensym is working as expected.
[1]> (defmacro test ()
(let ((name (gensym)))
`(progn
(defun ,name () (format t "Hello!"))
(,name))))
TEST
[2]> (test)
Hello!
NIL
Maybe your code that's trying to call the function gets evaluated before the defun? If there's any code in the macro expansion besides the various defuns, it may be implementation-dependent what gets evaluated first, and so the behaviour of SBCL and CLISP may differ without any of them violating the standard.
EDIT 2: Some further investigation shows that CLISP's behaviour varies depending upon whether the code is interpreted directly or whether it's first compiled and then interpreted. You can see the difference by either directly loading a Lisp file in CLISP or by first calling compile-file on it and then loading the FASL.
You can see what's going on by looking at the first restart that CLISP offers. It says something like "Input a value to be used instead of (FDEFINITION '#:G3219)." So for compiled code, CLISP quotes the symbol and refers to it by name.
It seems though that this behaviour is standard-conforming. The following definition can be found in the HyperSpec:
function designator n. a designator for a function; that is, an object that denotes a function and that is one of: a symbol (denoting the function named by that symbol in the global environment), or a function (denoting itself). The consequences are undefined if a symbol is used as a function designator but it does not have a global definition as a function, or it has a global definition as a macro or a special form. See also extended function designator.
I think an uninterned symbol matches the "a symbol is used as a function designator but it does not have a global definition as a function" case for unspecified consequences.
EDIT 3: (I can agree that I'm not sure whether CLISP's behaviour is a bug or not. Someone more experienced with details of the standard's terminology should judge this. It comes down to whether the function cell of an uninterned symbol - i.e. a symbol that cannot be referred to by name, only by having a direct hold on the symbol object - would be considered a "global definition" or not)
Anyway, here's an example solution that solves the problem in CLISP by interning the symbols in a throwaway package, avoiding the matter of uninterned symbols:
(defmacro test ()
(let* ((pkg (make-package (gensym)))
(name (intern (symbol-name (gensym)) pkg)))
`(progn
(defun ,name () (format t "Hello!"))
(,name))))
(test)
EDIT 4: As Joshua Taylor notes in a comment to the question, this seems to be a case of the (10 year old) CLISP bug #180.
I've tested both workarounds suggested in that bug report and found that replacing the progn with locally actually doesn't help, but replacing it with let () does.
You can most certainly define functions whose names are uninterned symbols. For instance:
CL-USER> (defun #:foo (x)
(list x))
#:FOO
CL-USER> (defparameter *name-of-function* *)
*NAME-OF-FUNCTION*
CL-USER> *name-of-function*
#:FOO
CL-USER> (funcall *name-of-function* 3)
(3)
However, the sharpsign colon syntax introduces a new symbol each time such a form is read read:
#: introduces an uninterned symbol whose name is symbol-name. Every time this syntax is encountered, a distinct uninterned symbol is created. The symbol-name must have the syntax of a symbol with no package prefix.
This means that even though something like
CL-USER> (list '#:foo '#:foo)
;=> (#:FOO #:FOO)
shows the same printed representation, you actually have two different symbols, as the following demonstrates:
CL-USER> (eq '#:foo '#:foo)
NIL
This means that if you try to call such a function by typing #: and then the name of the symbol naming the function, you're going to have trouble:
CL-USER> (#:foo 3)
; undefined function #:foo error
So, while you can call the function using something like the first example I gave, you can't do this last one. This can be kind of confusing, because the printed representation makes it look like this is what's happening. For instance, you could write such a factorial function like this:
(defun #1=#:fact (n &optional (acc 1))
(if (zerop n) acc
(#1# (1- n) (* acc n))))
using the special reader notation #1=#:fact and #1# to later refer to the same symbol. However, look what happens when you print that same form:
CL-USER> (pprint '(defun #1=#:fact (n &optional (acc 1))
(if (zerop n) acc
(#1# (1- n) (* acc n)))))
(DEFUN #:FACT (N &OPTIONAL (ACC 1))
(IF (ZEROP N)
ACC
(#:FACT (1- N) (* ACC N))))
If you take that printed output, and try to copy and paste it as a definition, the reader creates two symbols named "FACT" when it comes to the two occurrences of #:FACT, and the function won't work (and you might even get undefined function warnings):
CL-USER> (DEFUN #:FACT (N &OPTIONAL (ACC 1))
(IF (ZEROP N)
ACC
(#:FACT (1- N) (* ACC N))))
; in: DEFUN #:FACT
; (#:FACT (1- N) (* ACC N))
;
; caught STYLE-WARNING:
; undefined function: #:FACT
;
; compilation unit finished
; Undefined function:
; #:FACT
; caught 1 STYLE-WARNING condition
I hope I get the issue right. For me it works in CLISP.
I tried it like this: using a macro for creating a function with a GENSYM-ed name.
(defmacro test ()
(let ((name (gensym)))
`(progn
(defun ,name (x) (* x x))
',name)))
Now I can get the name (setf x (test)) and call it (funcall x 2).
Yes, it is perfectly fine defining functions that have names that are unintenred symbols. The problem is that you cannot then call them "by name", since you can't fetch the uninterned symbol by name (that is what "uninterned" means, essentially).
You would need to store the uninterned symbol in some sort of data structure, to then be able to fetch the symbol. Alternatively, store the defined function in some sort of data structure.
Surprisingly, CLISP bug 180 isn't actually an ANSI CL conformance bug. Not only that, but evidently, ANSI Common Lisp is itself so broken in this regard that even the progn based workaround is a courtesy of the implementation.
Common Lisp is a language intended for compilation, and compilation produces issues regarding the identity of objects which are placed into compiled files and later loaded ("externalized" objects). ANSI Common Lisp requires that literal objects reproduced from compiled files are only similar to the original objects. (CLHS 3.2.4 Literal Objects in Compiled Files).
Firstly, according to the definition similarity (3.2.4.2.2 Definition of Similarity), the rules for uninterned symbols is that similarity is name based. If we compile code with a literal that contains an uninterned symbol, then when we load the compiled file, we get a symbol which is similar and not (necessarily) the same object: a symbol which has the same name.
What if the same uninterned symbol is inserted into two different top-level forms which are then compiled as a file? When the file is loaded, are those two similar to each other at least? No, there is no such requirement.
But it gets worse: there is also no requirement that two occurrences of the same uninterned symbol in the same form will be externalized in such a way that their relative identity is preserved: that the re-loaded version of that object will have the same symbol object in all the places where the original was. In fact, the definition of similarity contains no provision for preserving the circular structure and substructure sharing. If we have a literal like '#1=(a b . #1#), as a literal in a compiled file, there appears to be no requirement that this be reproduced as a circular object with the same graph structure as the original (a graph isomorphism). The similarity rule for conses is given as naive recursion: two conses are similar if their respective cars and cdrs are similar. (The rule can't even be evaluated for circular objects; it doesn't terminate).
That the above works is because of implementations going beyond what is required in the spec; they are providing an extension consistent with (3.2.4.3 Extensions to Similarity Rules).
Thus, purely according to ANSI CL, we cannot expect to use macros with gensyms in compiled files, at least in some ways. The expectation expressed in code like the following runs afoul of the spec:
(defmacro foo (arg)
(let ((g (gensym))
(literal '(blah ,g ,g ,arg)))
...))
(defun bar ()
(foo 42))
The bar function contains a literal with two insertions of a gensym, which according to the similarity rules for conses and symbols need not reproduce as a list containing two occurrences of the same object in the second and third positions.
If the above works as expected, it's due to "extensions to the similarity rules".
So the answer to the "Why can't CLISP ..." question is that although CLISP does provide an extension for similarity which preserves the graph structure of literal forms, it doesn't do it across the entire compiled file, only within individual top level items within that file. (It uses *print-circle* to emit the individual items.) The bug is that CLISP doesn't conform to the best possible behavior users can imagine, or at least to a better behavior exhibited by other implementations.
I'm a bit confused as to exactly when symbol capture will occur with clojure macros. Suppose that I have a macro which defines a function from keywords. In this trivial example,
(defmacro foo [keywd1 keywd2] `(defn ~(symbol (name keywd1))
[~(symbol (name keywd2))] (* 2 ~(symbol (name keywd2)))))
I call (foo :bar :baz), and this gets expanded into (defn bar [baz] (* 2 baz)).
So now the question -- can this lead to symbol capture? If so, under what circumstances?
I know that it's preferred to use gensym (e.g. bar#) to prevent symbol capture, but in some cases (not many, but still) I'd like to have a pretty macro-expansion, without the auto-generated symbols.
Bonus question: does the answer change if we are considering a macro that creates macros?
In your example symbol capture does not happen, because provide the variable parts as parameters. So the developer can choose the names himself.
Symbol capture happens when your macro introduces new locals which are not specified by the user. Consider the following (really silly and nonsensical just to show the point) example:
(defmacro foo
[name & body]
`(defn ~name
[~'bar]
(println ~'bar)
~#body))
In this case bar is captured. Now suppose the user has some code like this.
(def bar 5)
(foo baz (* 2 bar))
(baz 7)
This would not give what the would expect. Because the global bar, the user refering to gets shadowed by the local bar introduced by the macro. As you already said: in this case one should use bar# to introduce the local.
So capture is always indicated by ~'. Macro writing macros don't really change that. Just add one more level: ~~'.