Common Lisp symbol equality - lisp

In most of the examples on the Internet, symbol equality is straight-forward:
(eq 'sym 'sym)
t
In my program, I want to compare symbols in a custom package:
(defpackage #:my-package
(:use #:common-lisp)
(:export #:my-function))
(in-package #:my-package)
(defun my-function (value)
(cond
((eq value 'sym)
(format t "SYM: YES~%"))
(t
(format t "SYM: NO~%"))))
(in-package #:common-lisp)
(my-package:my-function 'sym)
But when this code is executed, it displays:
SYM: NO
It seems like the 2 symbols are different:
(eq 'sym 'my-package::sym)
nil
The reason seems easy to understand, a symbol interned in a given package is not equal to a symbol with the same name interned in another package. Fair enough! But what is the idiom to compare 2 symbols, regardless the package?
Should we convert it to a string in a first place and compare strings?
(defpackage #:my-package
(:use #:common-lisp)
(:export #:my-function))
(in-package #:my-package)
(defun my-function (value)
(cond
((string= (symbol-name value) "SYM")
(format t "SYM: YES~%"))
(t
(format t "SYM: NO~%"))))
(in-package #:common-lisp)
(my-package:my-function 'sym)
The result is better, but there is an obvious issue regarding the character case.
How to check that 2 symbols are the same regardless their package?

The usual idiom is string=, which compares the names of symbols without regard to identity.
For example:
(eq 'x:a 'y:a) => nil
(string= 'x:a 'y:a) => t

I think you are confused about what 'being the same symbol' means.
Two symbols are the same symbol iff they are eq, which means that if they are not eq they are not the same symbol. In particular if two symbols have home packages which are different (again: not eq) they are not the same symbol.
If you want to know if two symbols have the same name then simply compare their symbol-names, as strings. But two symbols which merely have the same name are not necessarily the same symbol (for instance (eq '#:foo '#:foo) is false).
There are cases where it is useful to know whether two symbols have the same name (for instance the loop macro must do this so that (loop for ...) and (loop :for ...) mean the same thing) but these cases are fairly rare.
If what you want to do is know if two symbols have the same names you quite likely should be using strings instead.
If what you actually want to know is whether two symbols accessed from different packages are really the same symbol (so (eq 'x:foo 'y:foo), say) then eq is the appropriate test, and understanding the package system also helps.

Related

sublis not substituting symbols when called from another package

In my function I am reading input from the user which is expected to be a lisp form given as a string e.g.:
(sym1 sym2 (sym3 sym4))
My goal is to substitute some of the symbols with other symbols e.g.:
(sublis '((sym1 . sym1%)
(sym2 . sym2%))
str)
Because I am getting the input as a string, I am first converting it to a lisp form. Here is how the final function looks like:
(defun sublis-when-string (str)
(sublis '((sym1 . sym1%)
(sym2 . sym2%))
(read-from-string str)))
When I compile the function and run it in the REPL with (sublis-when-string "(sym1 sym3 (sym2 sym4))") I correctly get:
(SYM1% SYM3 (SYM2% SYM4))
However when I run the whole program the substitutions do not work:
(SYM1 SYM3 (SYM2 SYM4))
This lead me to believe that the problems is with the package. When I changed the package in the REPL the substitutions were still not working.
My question is: How should I change my function so it works when called from other packages?
If you want to use your function independently from the package in which the symbols are read, you can change your definition by adding an explicit test: when the values to be checked are symbols, the string= operator is used instead of default eql. For instance:
(defun sublis-when-string (str)
(sublis '((sym1 . sym1%)
(sym2 . sym2%))
(read-from-string str)
:test (lambda (x y)
(if (and (symbolp x) (symbolp y))
(string= x y)
(eql x y)))))
See the definition of sublis.
You can define a package for your user:
(defpackage :my-user (:use) (:export #:sym1 #:sym2))
Of course, the exported symbols are the one you need to add in your substitution list. And then, you bind the *package* variable before reading:
(let ((*package* (find-package :my-user)))
(read-from-string string))
Notice that all the symbols will be read from the :my-user package.
Depending on how much you trust the source of your strings, you can also set *read-eval* to nil and define a minimalistic readtable too:
disable array notation that allocate very large arrays like #n() (bad user exhausting memory on purpose)
make : a terminating character to make qualified symbols throw an error (bad user interning symbols in other packages)

Why does lisp use gensym and other languages don't?

Correct me if I'm wrong, but there is nothing like gensym in Java, C, C++, Python, Javascript, or any of the other languages I've used, and I've never seemed to need it. Why is it necessary in Lisp and not in other langauges? For clarification, I'm learning Common Lisp.
Common Lisp has a powerful macro system. You can make new syntax patterns that behave exactly the way you want them to behave. It's even expressed in its own language, making everything in the language available to transform the code from what you want to write to something that CL actually understands. All languages with powerful macro systems provide gensym or do it implicitly in their macro implementation.
In Common Lisp you use gensym when you want to make code where the symbol shouldn't match elements used any other places in the result. Without it there is no guarantee that a user uses a symbol that the macro implementer also use and they start to interfere and the result is something different than the intended behavior. It makes sure nested expansions of the same macro don't interfere with previous expansions. With the Common Lisp macro system it's possible to make more restrictive macro systems similar to Scheme syntax-rules and syntax-case.
In Scheme there are several macro systems. One with pattern matching where new introduced symbols act automatically as if they are made with gensym. syntax-case will also by default make new symbols as if they were made with gensym and there is also a way to reduce hygiene. You can make CL defmacro with syntax-case but since Scheme doesn't have gensym you wouldn't be able to make hygienic macros with it.
Java, C, C++, Python, Javascript are all Algol dialects and none of them have other than simple template based macros. Thus they don't have gensym because they don't need it. Since the only way to introduce new syntax in these languages is to wish next version of it will provide it.
There are two Algol dialects with powerful macros that come to mind. Nemerle and Perl6. Both of them have hygienic approach, meaning variables introduced behave as if they are made with gensym.
In CL, Scheme, Nemerle, Perl6 you don't need to wait for language features. You can make them yourself! The news in both Java and PHP are easily implemented with macros in any of them should it not already be available.
Can't say which languages have an equivalent of GENSYM. Many languages don't have a first-class symbol data type (with interned and uninterned symbols) and many are not providing similar code generation (macros, ...) facilities.
An interned symbol is registered in a package. An uninterned is not. If the reader (the reader is the Lisp subsystem which takes textual s-expressions as input and returns data) sees two interned symbols in the same package and with the same name, it assumes that it is the same symbol:
CL-USER 35 > (eq 'cl:list 'cl:list)
T
If the reader sees an uninterned symbol, it creates a new one:
CL-USER 36 > (eq '#:list '#:list)
NIL
Uninterned symbols are written with #: in front of the name.
GENSYM is used in Lisp to create numbered uninterned symbols, because it is sometimes useful in code generation and then debugging this code. Note that the symbols are always new and not eq to anything else. But the symbol name could be the same as the name of another symbol. The number gives a clue to the human reader about the identity.
An example using MAKE-SYMBOL
make-symbol creates a new uninterned symbol using a string argument as its name.
Let's see this function generating some code:
CL-USER 31 > (defun make-tagbody (exp test)
(let ((start-symbol (make-symbol "start"))
(exit-symbol (make-symbol "exit")))
`(tagbody ,start-symbol
,exp
(if ,test
(go ,start-symbol)
(go ,exit-symbol))
,exit-symbol)))
MAKE-TAGBODY
CL-USER 32 > (pprint (make-tagbody '(incf i) '(< i 10)))
(TAGBODY
#:|start| (INCF I)
(IF (< I 10) (GO #:|start|) (GO #:|exit|))
#:|exit|)
Above generated code uses uninterned symbols. Both #:|start| are actually the same symbol. We would see this if we would have *print-circle* to T, since the printer then would clearly label identical objects. But here we don't get this added information. Now if you nest this code, then you would see more than the one start and one exit symbol, each which was used in two places.
An example using GENSYM
Now let's use gensym. Gensym also creates an uninterned symbol. Optionally this symbol is named by a string. A number (see the variable CL:*GENSYM-COUNTER*) is added.
CL-USER 33 > (defun make-tagbody (exp test)
(let ((start-symbol (gensym "start"))
(exit-symbol (gensym "exit")))
`(tagbody ,start-symbol
,exp
(if ,test
(go ,start-symbol)
(go ,exit-symbol))
,exit-symbol)))
MAKE-TAGBODY
CL-USER 34 > (pprint (make-tagbody '(incf i) '(< i 10)))
(TAGBODY
#:|start213051| (INCF I)
(IF (< I 10) (GO #:|start213051|) (GO #:|exit213052|))
#:|exit213052|)
Now the number is an indicator that the two uninterned #:|start213051| symbols are actually the same. When the code would be nested, the new version of the start symbol would have a different number:
CL-USER 7 > (pprint (make-tagbody `(progn
(incf i)
(setf j 0)
,(make-tagbody '(incf ij) '(< j 10)))
'(< i 10)))
(TAGBODY
#:|start2756| (PROGN
(INCF I)
(SETF J 0)
(TAGBODY
#:|start2754| (INCF IJ)
(IF (< J 10)
(GO #:|start2754|)
(GO #:|exit2755|))
#:|exit2755|))
(IF (< I 10) (GO #:|start2756|) (GO #:|exit2757|))
#:|exit2757|)
Thus it helps understanding generated code, without the need to turn *print-circle* on, which would label the identical objects:
CL-USER 8 > (let ((*print-circle* t))
(pprint (make-tagbody `(progn
(incf i)
(setf j 0)
,(make-tagbody '(incf ij) '(< j 10)))
'(< i 10))))
(TAGBODY
#3=#:|start1303| (PROGN
(INCF I)
(SETF J 0)
(TAGBODY
#1=#:|start1301| (INCF IJ)
(IF (< J 10) (GO #1#) (GO #2=#:|exit1302|))
#2#))
(IF (< I 10) (GO #3#) (GO #4=#:|exit1304|))
#4#)
Above is readable for the Lisp reader (the subsystem which reads s-expressions for textual representations), but a bit less for the human reader.
I believe that symbols (in the Lisp sense) are mostly useful in homoiconic languages (those in which the syntax of the language is representable as a data of that language).
Java, C, C++, Python, Javascript are not homoiconic.
Once you have symbols, you want some way to dynamically create them. gensym is a possibility, but you can also intern them.
BTW, MELT is a lisp-like dialect, it does not create symbols with gensym or by interning strings but with clone_symbol. (actually MELT symbols are instances of predefined CLASS_SYMBOL, ...).
gensym is available as a predicate in most of Prolog interpreters. You can find it in the eponym library.

make-symbol added unwanted formatting to generated symbols

I'm trying to write a function primeify that accepts a symbol and returns the symbol with "-prime" appended to it. My desired output is:
[1] > (primeify 'y)
Y-PRIME
(or y-prime, the case isn't the main point here, although it may become relevant later).
Here's my current (erroneous) implementation:
(defun primeify (sym)
(make-symbol (concatenate 'string (string sym) "-prime")))
However, make-symbol is mangling the output by cluttering it with additional formatting. My output is:
[1]> (primeify 'y)
#:|Y-prime|
Is there any way to avoid this additional processing done by make-symbol, or another function I could use to accomplish this? Is this even possible to accomplish in lisp?
Your symbol:
#:|Y-prime|
This is a non-interned symbol. It is in no package. #: is in front of the symbol because of that.
It is also a symbol name with mixed case. Because of that it is escaped with vertical bars. Remember, by default symbol names are internally stored in UPPERCASE.
(defun primeify (sym)
(let ((name (concatenate 'string
(string sym)
"-PRIME")))
(if (symbol-package sym)
(intern name (symbol-package sym))
(make-symbol name))))
Above function gives the new symbol the same package as the original symbol has, if any.
CL-USER 3 > (primeify 'foo)
FOO-PRIME
NIL
CL-USER 4 > (primeify '#:foo)
#:FOO-PRIME

How do I define a function that creates a function alias?

The Lisp forum thread Define macro alias? has an example of creating function alias using a form such as
(setf (symbol-function 'zero?) #'zerop)
This works fine, making zero? a valid predicate. Is it possible to parametrize this form without resorting to macros? I'd like to be able to call the following and have it create function?:
(define-predicate-alias 'functionp)`
My take was approximately:
(defun defalias (old new)
(setf (symbol-function (make-symbol new))
(symbol-function old)))
(defun define-predicate-alias (predicate-function-name)
(let ((alias (format nil "~A?" (string-right-trim "-pP" predicate-function-name))))
(defalias predicate-function-name alias)))
(define-predicate-alias 'zerop)
(zero? '())
This fails when trying to call zero? saying
The function COMMON-LISP-USER::ZERO? is undefined.
make-symbol creates an uninterned symbol. That's why zero? is undefined.
Replace your (make-symbol new) with e.g. (intern new *package*). (Or you may want to think more carefully in which package to intern your new symbol.)
Your code makes a symbol, via MAKE-SYMBOL, but you don't put it into a package.
Use the function INTERN to add a symbol to a package.
To expand on Lars' answer, choose the right package. In this case the default might be to use the same package from the aliased function:
About style:
Anything that begins with DEF should actually be a macro. If you have a function, don't use a name beginning with "DEF". If you look at the Common Lisp language, all those are macro. For example: With those defining forms, one would typically expect that they have a side-effect during compilation of files: the compiler gets informed about them. A function can't.
If I put something like this in a file
(define-predicate-alias zerop)
(zero? '())
and then compile the file, I would expect to not see any warnings about an undefined ZERO?. Thus a macro needs to expand (define-predicate-alias 'zerop) into something which makes the new ZERO? known into the compile-time environment.
I would also make the new name the first argument.
Thus use something like MAKE-PREDICATE-ALIAS instead of DEFINE-PREDICATE-ALIAS, for the function.
There are already some answers that explain how you can do this, but I'd point out:
Naming conventions, P, and -P
Common Lisp has a naming convention that is mostly adhered to (there are exceptions, even in the standard library), that if a type name is multiple words (contains a -), then its predicate is named with -P suffix, whereas if it doesn't, the suffix is just P. So we'd have keyboardp and lcd-monitor-p. It's good then, that you're using (string-right-trim "-pP" predicate-function-name)), but since the …P and …-P names in the standard, and those generated by, e.g., defstruct, will be using P, not p, you might just use (string-right-trim "-P" predicate-function-name)). Of course, even this has the possible issues with some names (e.g., pop), but I guess that just comes with the territory.
Symbol names, format, and *print-case*
More importantly, using format to create symbol names for subsequent interning is dangerous, because format doesn't always print a symbol's name with the characters in the same case that they actually appear in its name. E.g.,
(let ((*print-case* :downcase))
(list (intern (symbol-name 'foo))
(intern (format nil "~A" 'foo))))
;=> (FOO |foo|) ; first symbol has name "FOO", second has name "foo"
You may be better off using string concatenation and extracting symbol names directly. This means you could write code like (this is slightly different use case, since the other questions already explain how you can do what you're trying to do):
(defmacro defpredicate (symbol)
(flet ((predicate-name (symbol)
(let* ((name (symbol-name symbol))
(suffix (if (find #\- name) "-P" "P")))
(intern (concatenate 'string name suffix)))))
`(defun ,(predicate-name symbol) (x)
(typep x ',symbol)))) ; however you're checking the type
(macroexpand-1 '(defpredicate zero))
;=> (DEFUN ZEROP (X) (TYPEP X 'ZERO))
(macroexpand-1 '(defpredicate lcd-monitor))
;=> (DEFUN LCD-MONITOR-P (X) (TYPEP X 'LCD-MONITOR))

Why can't CLISP call certain functions with uninterned names?

I've written an ad hoc parser generator that creates code to convert an old and little known 7-bit character set into unicode. The call to the parser generator expands into a bunch of defuns enclosed in a progn, which then get compiled. I only want to expose one of the generated defuns--the top-level one--to the rest of the system; all the others are internal to the parser and only get called from within the dynamic scope of the top-level one. Therefore, the other defuns generated have uninterned names (created with gensym). This strategy works fine with SBCL, but I recently tested it for the first time with CLISP, and I get errors like:
*** - FUNCALL: undefined function #:G16985
It seems that CLISP can't handle functions with uninterned names. (Interestingly enough, the system compiled without a problem.) EDIT: It seems that it can handle functions with uninterned names in most cases. See the answer by Rörd below.
My questions is: Is this a problem with CLISP, or is it a limitation of Common Lisp that certain implementations (e.g. SBCL) happen to overcome?
EDIT:
For example, the macro expansion of the top-level generated function (called parse) has an expression like this:
(PRINC (#:G75735 #:G75731 #:G75733 #:G75734) #:G75732)
Evaluating this expression (by calling parse) causes an error like the one above, even though the function is definitely defined within the very same macro expansion:
(DEFUN #:G75735 (#:G75742 #:G75743 #:G75744) (DECLARE (OPTIMIZE (DEBUG 2)))
(DECLARE (LEXER #:G75742) (CONS #:G75743 #:G75744))
(MULTIPLE-VALUE-BIND (#:G75745 #:G75746) (POP-TOKEN #:G75742)
...
The two instances of #:G75735 are definitely the same symbol--not two different symbols with the same name. As I said, this works with SBCL, but not with CLISP.
EDIT:
SO user Joshua Taylor has pointed out that this is due to a long standing CLISP bug.
You don't show one of the lines that give you the error, so I can only guess, but the only thing that could cause this problem as far as I can see is that you are referring to the name of the symbol instead of the symbol itself when trying to call it.
If you were referring to the symbol itself, all your lisp implementation would have to do is lookup that symbol's symbol-function. Whether it's interned or not couldn't possibly matter.
May I ask why you haven't considered another way to hide the functions, i.e. a labels statement or defining the functions within a new package that exports only the one external function?
EDIT: The following example is copied literally from an interaction with the CLISP prompt.
As you can see, calling the function named by a gensym is working as expected.
[1]> (defmacro test ()
(let ((name (gensym)))
`(progn
(defun ,name () (format t "Hello!"))
(,name))))
TEST
[2]> (test)
Hello!
NIL
Maybe your code that's trying to call the function gets evaluated before the defun? If there's any code in the macro expansion besides the various defuns, it may be implementation-dependent what gets evaluated first, and so the behaviour of SBCL and CLISP may differ without any of them violating the standard.
EDIT 2: Some further investigation shows that CLISP's behaviour varies depending upon whether the code is interpreted directly or whether it's first compiled and then interpreted. You can see the difference by either directly loading a Lisp file in CLISP or by first calling compile-file on it and then loading the FASL.
You can see what's going on by looking at the first restart that CLISP offers. It says something like "Input a value to be used instead of (FDEFINITION '#:G3219)." So for compiled code, CLISP quotes the symbol and refers to it by name.
It seems though that this behaviour is standard-conforming. The following definition can be found in the HyperSpec:
function designator n. a designator for a function; that is, an object that denotes a function and that is one of: a symbol (denoting the function named by that symbol in the global environment), or a function (denoting itself). The consequences are undefined if a symbol is used as a function designator but it does not have a global definition as a function, or it has a global definition as a macro or a special form. See also extended function designator.
I think an uninterned symbol matches the "a symbol is used as a function designator but it does not have a global definition as a function" case for unspecified consequences.
EDIT 3: (I can agree that I'm not sure whether CLISP's behaviour is a bug or not. Someone more experienced with details of the standard's terminology should judge this. It comes down to whether the function cell of an uninterned symbol - i.e. a symbol that cannot be referred to by name, only by having a direct hold on the symbol object - would be considered a "global definition" or not)
Anyway, here's an example solution that solves the problem in CLISP by interning the symbols in a throwaway package, avoiding the matter of uninterned symbols:
(defmacro test ()
(let* ((pkg (make-package (gensym)))
(name (intern (symbol-name (gensym)) pkg)))
`(progn
(defun ,name () (format t "Hello!"))
(,name))))
(test)
EDIT 4: As Joshua Taylor notes in a comment to the question, this seems to be a case of the (10 year old) CLISP bug #180.
I've tested both workarounds suggested in that bug report and found that replacing the progn with locally actually doesn't help, but replacing it with let () does.
You can most certainly define functions whose names are uninterned symbols. For instance:
CL-USER> (defun #:foo (x)
(list x))
#:FOO
CL-USER> (defparameter *name-of-function* *)
*NAME-OF-FUNCTION*
CL-USER> *name-of-function*
#:FOO
CL-USER> (funcall *name-of-function* 3)
(3)
However, the sharpsign colon syntax introduces a new symbol each time such a form is read read:
#: introduces an uninterned symbol whose name is symbol-name. Every time this syntax is encountered, a distinct uninterned symbol is created. The symbol-name must have the syntax of a symbol with no package prefix.
This means that even though something like
CL-USER> (list '#:foo '#:foo)
;=> (#:FOO #:FOO)
shows the same printed representation, you actually have two different symbols, as the following demonstrates:
CL-USER> (eq '#:foo '#:foo)
NIL
This means that if you try to call such a function by typing #: and then the name of the symbol naming the function, you're going to have trouble:
CL-USER> (#:foo 3)
; undefined function #:foo error
So, while you can call the function using something like the first example I gave, you can't do this last one. This can be kind of confusing, because the printed representation makes it look like this is what's happening. For instance, you could write such a factorial function like this:
(defun #1=#:fact (n &optional (acc 1))
(if (zerop n) acc
(#1# (1- n) (* acc n))))
using the special reader notation #1=#:fact and #1# to later refer to the same symbol. However, look what happens when you print that same form:
CL-USER> (pprint '(defun #1=#:fact (n &optional (acc 1))
(if (zerop n) acc
(#1# (1- n) (* acc n)))))
(DEFUN #:FACT (N &OPTIONAL (ACC 1))
(IF (ZEROP N)
ACC
(#:FACT (1- N) (* ACC N))))
If you take that printed output, and try to copy and paste it as a definition, the reader creates two symbols named "FACT" when it comes to the two occurrences of #:FACT, and the function won't work (and you might even get undefined function warnings):
CL-USER> (DEFUN #:FACT (N &OPTIONAL (ACC 1))
(IF (ZEROP N)
ACC
(#:FACT (1- N) (* ACC N))))
; in: DEFUN #:FACT
; (#:FACT (1- N) (* ACC N))
;
; caught STYLE-WARNING:
; undefined function: #:FACT
;
; compilation unit finished
; Undefined function:
; #:FACT
; caught 1 STYLE-WARNING condition
I hope I get the issue right. For me it works in CLISP.
I tried it like this: using a macro for creating a function with a GENSYM-ed name.
(defmacro test ()
(let ((name (gensym)))
`(progn
(defun ,name (x) (* x x))
',name)))
Now I can get the name (setf x (test)) and call it (funcall x 2).
Yes, it is perfectly fine defining functions that have names that are unintenred symbols. The problem is that you cannot then call them "by name", since you can't fetch the uninterned symbol by name (that is what "uninterned" means, essentially).
You would need to store the uninterned symbol in some sort of data structure, to then be able to fetch the symbol. Alternatively, store the defined function in some sort of data structure.
Surprisingly, CLISP bug 180 isn't actually an ANSI CL conformance bug. Not only that, but evidently, ANSI Common Lisp is itself so broken in this regard that even the progn based workaround is a courtesy of the implementation.
Common Lisp is a language intended for compilation, and compilation produces issues regarding the identity of objects which are placed into compiled files and later loaded ("externalized" objects). ANSI Common Lisp requires that literal objects reproduced from compiled files are only similar to the original objects. (CLHS 3.2.4 Literal Objects in Compiled Files).
Firstly, according to the definition similarity (3.2.4.2.2 Definition of Similarity), the rules for uninterned symbols is that similarity is name based. If we compile code with a literal that contains an uninterned symbol, then when we load the compiled file, we get a symbol which is similar and not (necessarily) the same object: a symbol which has the same name.
What if the same uninterned symbol is inserted into two different top-level forms which are then compiled as a file? When the file is loaded, are those two similar to each other at least? No, there is no such requirement.
But it gets worse: there is also no requirement that two occurrences of the same uninterned symbol in the same form will be externalized in such a way that their relative identity is preserved: that the re-loaded version of that object will have the same symbol object in all the places where the original was. In fact, the definition of similarity contains no provision for preserving the circular structure and substructure sharing. If we have a literal like '#1=(a b . #1#), as a literal in a compiled file, there appears to be no requirement that this be reproduced as a circular object with the same graph structure as the original (a graph isomorphism). The similarity rule for conses is given as naive recursion: two conses are similar if their respective cars and cdrs are similar. (The rule can't even be evaluated for circular objects; it doesn't terminate).
That the above works is because of implementations going beyond what is required in the spec; they are providing an extension consistent with (3.2.4.3 Extensions to Similarity Rules).
Thus, purely according to ANSI CL, we cannot expect to use macros with gensyms in compiled files, at least in some ways. The expectation expressed in code like the following runs afoul of the spec:
(defmacro foo (arg)
(let ((g (gensym))
(literal '(blah ,g ,g ,arg)))
...))
(defun bar ()
(foo 42))
The bar function contains a literal with two insertions of a gensym, which according to the similarity rules for conses and symbols need not reproduce as a list containing two occurrences of the same object in the second and third positions.
If the above works as expected, it's due to "extensions to the similarity rules".
So the answer to the "Why can't CLISP ..." question is that although CLISP does provide an extension for similarity which preserves the graph structure of literal forms, it doesn't do it across the entire compiled file, only within individual top level items within that file. (It uses *print-circle* to emit the individual items.) The bug is that CLISP doesn't conform to the best possible behavior users can imagine, or at least to a better behavior exhibited by other implementations.