Merging symbols in common lisp - merge

I want to insert a char into a list. However, I want to merge this char with the last symbol in the list. With appends and cons the result is always two different symbols. Well, I want one merged symbol to be my result.
Example:
(XXXX 'a '5) ====> (a5)
What I would like to have, instead of:
(XXXX 'a '5) ====> (a 5)

You cannot "merge symbols" in Lisp.
First of all, 5 is not a symbol, but a number. If you want a symbol named "5" you have to type it as |5| (for example).
If a function takes the symbol A and symbol |5|, and produces the symbol A5, it has not merged symbols. It has created a new symbol whose name is the catenation of the names of those input symbols.
Properly designed Lisp programs rarely depend on how a symbol is named. They depend on symbols being unique entities.
If you're using symbols to identify things, and both 5 and A identify some entity, the best answer isn't necessarily to create a new symbol which is, in name at least, is a mashup of these two symbols. For instance, a better design might be to accept that names are multi-faceted or compound in some way. Perhaps the list (A 5) can serve as a name.
Common Lisp functions themselves can have compound names. For instance (setf foo) is a function name. Aggregates like lists can be names.
If you simply need the machine to invent unique symbols at run-time, consider using the gensym function. You can pass your own prefix to it:
(gensym "FOO") -> #:FOO0042
Of course, the prefix can be the name of some existing symbol, pulled out via symbol-name. The symbol #:FOO0042 is not unique because of the 0042 but because it is a freshly allocated object in the address space. The #: means it is not interned in any package. The name of the symbol is FOO0042.
If you still really want to, a simple way to take the printed representation of a bunch of input objects and turn it into a symbol is this:
(defun mashup-symbol (&rest objects)
(intern (format nil "~{~a~}" objects)))
Examples:
(mashup-symbol 1 2 3) -> |123|
(mashup-symbol '(a b) 'c 3) -> |(A B)C3|

Define this:
(defun symbol-append (&rest symbols)
(intern (apply #'concatenate 'string
(mapcar #'symbol-name symbols))))
and then use it as:
* (symbol-append 'a 'b 'c)
ABC
* (apply #'symbol-append '(a b c))
ABC
If you expect your arguments to contain stuff besides symbols, then replace symbol-name with:
(lambda (x)
(typecase x ...))
or a pre-existing CL function (that I've forgotten :() that stringifies anything.

The answer to the question you ask is
(defun concatenate-objects-to-symbol (&rest objects)
(intern (apply #'concatenate 'string (mapcar #'princ-to-string objects))))
(concatenate-objects 'a 'b) ==> ab
Oh, if you want a list as the result:
(defun xxxx (s1 s2) (list (concatenate-objects-to-symbol s1 s2)))
However, I am pretty sure this is not the question you actually want to ask.
Creating new symbols programmatically is not something beginners should be doing...

Related

What is atom in LISP?

I would like to have a clear understanding, what is 'Atom' in LISP?
Due to lispworks, 'atom - any object that is not a cons.'.
But this definition is not enough clear for me.
For example, in the code below:
(cadr
(caddar (cddddr L)))
Is 'L' an atom? On the one hand, L is not an atom, because it is cons, because it is the list (if we are talking about object, which is associated with the symbol L).
On the other hand, if we are talking about 'L' itself (not about its content, but about the symbol 'L'), it is an atom, because it is not a cons.
I've tried to call function 'atom',
(atom L) => NIL
(atom `L) => T
but still I have no clue... Please, help!
So the final question: in the code above, 'L' is an atom, or not?
P.S. I'm asking this question due to LISP course at my university, where we have a definition of 'simple expression' - it is an expression, which is atom or function call of one or two atomic parameters. Therefore I wonder if expression (cddddr L) is simple, which depends on whether 'L' is atomic parameter or not.
Your Lisp course's private definition of "simple expression" is almost certainly rooted purely in syntax. The idea of "atomic parameter" means that it's not a compound expression. It probably has nothing to do with the run-time value!
Thus, I'm guessing, these are simple expressions:
(+ 1 2)
42
"abc"
whereas these are not:
(+ 1 (* 3 4)) ;; (* 3 4) is not an atomic parameter
(+ a b c) ;; parameters atomic, but more than two
(foo) ;; not simple: fewer than one parameter, not "one or two"
In light of the last counterexample, it would probably behoove them to revise their definition.
On the one hand, L is not an atom, because it is cons, because it is the list (if we are talking about object, which is associated with the symbol L).
You are talking here about the meaning of the code being executed, its semantics. L here stands for a value, which is a list in your tests. At runtime you can inspect values and ask about their types.
On the other hand, if we are talking about 'L' itself (not about its content, but about the symbol 'L'), it is an atom, because it is not a cons.
Here you are looking at the types of the values that make up the syntax of your code, how it is being represented before even being evaluated (or compiled). You are manipulating a tree of symbols, one of them being L. In the source code, this is a symbol. It has no meaning by itself other than being a name.
Code is data
Lisp makes it easy to represent source code using values in the language itself, and easy to manipulate fragments of code at one point to build code that is executed later. This is often called homoiconicity, thought it is somewhat a touchy word because people don't always think the definition is precise enough to be useful. Another saying is "code is data", something that most language designers and programmers will agree to be true.
Lisp code can be built at runtime as follows (> is the prompt of the REPL, what follows is the result of evaluation):
> (list 'defun 'foo (list 'l) (list 'car 'l))
(DEFUN FOO (L) (CAR L))
The resulting form happens to be valid Common Lisp code, not just a generic list of values. If you evaluate it with (eval *), you will define a function named FOO that takes the first element of some list L.
NB. In Common Lisp the asterisk * is bound in the REPL to the last value being successfully returned.
Usually you don't build code like that, the Lisp reader turns a stream of characters into such a tree. For example:
> (read-from-string "(defun foo (l) (car l))")
(DEFUN FOO (L) (CAR L))
But the reader is called also implicitly in the REPL (that's the R in the acronym).
In such a tree of symbols, L is a symbol.
Evaluation model
When you call function FOO after it has been defined, you are evaluating the body of FOO in a context where L is bound to some value. And the rule for evaluating a symbol is to lookup the value it is bound to, and return that. This is the semantics of the code, which the runtime implements for you.
If you are using a simple interpreter, maybe the symbol L is present somewhere at runtime and its binding is looked up. Usually the code is not interpreted like that, it is possible to analyze it and transform it in an efficient way, during compilation. The result of compilation probably does not manipulate symbols anymore here, it just manipulates CPU registers and memory.
In all cases, asking for the type of L at this point is by definition of the semantics just asking the type for whatever value is bound to L in the context it appears.
An atom is anything that is not a cons cell
Really, the definition of atom is no more complex than that. A value in the language that is not a cons-cell is called an atom. This encompasses numbers, strings, everything.
Sometimes you evaluate a tree of symbols that happens to be code, but then the same rule applies.
Simple expressions
P.S. I'm asking this question due to LISP course at my university, where we have a definition of 'simple expression' - it is an expression, which is atom or function call of one or two atomic parameters. Therefore I wonder if expression (cddddr L) is simple, which depends on whether 'L' is atomic parameter or not.
In that course you are writing functions that analyze code. You are given a Lisp value and must decide if it is a simple expression or not.
You are not interested in any particular interpretation of the value being given, at no point you are going to traverse the value, see a a symbol and try to resolve it to a value: you are checking if the syntax is a valid simple expression or not.
Note also that the definitions in your course might be a bit different than the one from any particular exising flavor (this is not necessarily Common Lisp, or Scheme, but a toy LISP dialect). Follow in priority the definitions from your course.
Imagine we have a predicate which tells us if an object is not a number:
(not-number-p 3) -> NIL
(not-number-p "string") -> T
(let ((foo "another string))
(not-number-p foo)) -> T
(not-number '(1 2 3)) -> T
(not-number (first '(1 2 3)) -> NIL
We can define that as:
(defun not-number-p (object)
(not (numberp object))
Above is just the opposite of NUMBERP.
NUMBERP -> T if object is a number
NOT-NUMBER-P -> NIL if object is a number
Now imagine we have a predicate NOT-CONS-P, which tells us if an object is not a CONS cell.
(not-cons-p '(1 . 2)) -> NIL
(let ((c '(1 . 2)))
(not-cons-p c)) -> NIL
(not-cons-p 3) -> T
(let ((n 4))
(not-cons-p n)) -> T
(not-cons-p NIL) -> T
(not-cons-p 'NIL) -> T
(not-cons-p 'a-symbol) -> T
(not-cons-p #\space) -> T
The function NOT-CONS-P can be defined as:
(defun not-cons-p (object)
(if (consp object)
NIL
T))
Or shorter:
(defun not-cons-p (object)
(not (consp object))
The function NOT-CONS-P is traditionally called ATOM in Lisp.
In Common Lisp every object which is not a cons cell is called an atom. The function ATOM is a predicate.
See the Common Lisp HyperSpec: Function Atom
Your question:
(cadr (caddar (cddddr L)))
Is 'L' an atom?
How would we know that? L is a variable. What is the value of L?
(let ((L 10))
(atom L)) -> T
(let ((L (cons 1 2)))
(atom L) -> NIL
(atom l) answers this question:
-> is the value of L an atom
(atom l) does not answer this question:
-> is L an atom? L is a variable and in a function call the value of L is passed to the function ATOM.
If you want to ask if the symbol L is an atom, then you need to quote the symbol:
(atom 'L) -> T
(atom (quote L)) -> T
symbols are atoms. Actually everything is an atom, with the exception of cons cells.

Are symbols and names different?

Are symbols and names different? On Lisp by Paul Graham, which focuses on common lisp, has some discussions that seem to imply so, e.g.
Since lambda-expressions are also names of functions, they can also appear first in function calls:
((lambda (x) (* x 2) 3)
6
This makes it sound like symbols are names but names aren't symbols. But I don't understand what sort of Lisp "object" symbols are / could be.
This is also deriving from my question here on the sharp-quote (#') operator v. symbol-function. I'm suspecting the only reason these are different is because not all names are symbols but I don't have enough background to understand those answers yet (hence this question).
I'm also asking for clarification in elisp v. common lisp. I'm assuming this pertains to lexical form, which wasn't introduced in elisp until version 24 (24.1 I think).
Lambda Expressions are not names of functions. It's just that ((lambda (...) ...) ...) is allowed in Common Lisp, since it is defined in the standard as legal syntax.
The only allowed function names in Common Lisp are symbols and lists like (setf symbol).
For example one can write
(defun (setf foo) (...) ...)
Here (setf foo) is the function name.
Other function names don't exist in Common Lisp, only symbols and (setf symbol) names.
Common Lisp Hyperspec Glossary: Function Name.
function name n. 1. (in an environment) A symbol or a list (setf
symbol) that is the name of a function in that environment. 2. A
symbol or a list (setf symbol).
Note: the Common Lisp version of 1984 (as published in CLtL1) did only have symbols as function names. Thus the idea of a function name was not defined. The function to retrieve a function from a symbol was called SYMBOL-FUNCTION. In 1989 the ANSI CL standardization group decided to add setf lists as function names. It also introduced the function FDEFINITION, which is like SYMBOL-FUNCTION but also accepts other function names, besides symbols. See here: Issue FUNCTION-NAME.
I think Rainer's answer gets this right, but since the question appeared in a comment on my answer to another question, I'll include (an update to) my response from that comment.
A symbol is an actual object. You can inspect it, you can create them with make-symbol, etc. Importantly, symbols are one of the primary components of source code in Common Lisp. Function names, especially in the context that this question arose in (arguments to the function special operator) are either symbols or lists of the form (setf symbol) according to the glossary entry:
function name n. 1. (in an environment) A symbol or a list (setf symbol) that is the name of a function in that environment. 2. A
symbol or a list (setf symbol).
Function is a special operator that doesn't evaluate its arguments, so passing a symbol or setf list means something like:
(function car)
(function (setf car))
and not:
(function 'car)
(function '(setf car))
Now, lexical variables, e.g., x in (let ((x 42)) x), while represented by symbols in the source code, don't actually have any connection with the symbol at runtime. The compiled version of (let ((x 42)) x) doesn't need to know anything about the symbol x. Intuitively, this makes sense, because we'd expect the code (let ((y 42)) y) to compile to the same thing. However, when a variable is special, there is a connection with the symbol. The difference is clearest with:
(let ((x 42))
(symbol-value x))
;=> NIL
(let ((x 42))
(declare (special x)) ; or (defparameter x ...) or (defvar x ...) earlier
(symbol-value x))
;=> 42
We'd expect the same thing to be true of lexically scoped functions, e.g., the following code causes an error because there's no connection between the symbol x at runtime and the local function:
(flet ((x () 42))
(symbol-function 'x)) ; ERROR, no function value for symbol x
But even so, we can still do:
(flet ((x () 42))
(function x))
This is because function is special operator and can access the environment where is occurs. That means that (because it's special, and the implementation makes it work) it can know that x is defined as function here. It may be interesting to note, now, that since flet and labels are defined to take a function name, you can do:
(flet (((setf kar) (value kons)
...))
...)

How do I define a function that creates a function alias?

The Lisp forum thread Define macro alias? has an example of creating function alias using a form such as
(setf (symbol-function 'zero?) #'zerop)
This works fine, making zero? a valid predicate. Is it possible to parametrize this form without resorting to macros? I'd like to be able to call the following and have it create function?:
(define-predicate-alias 'functionp)`
My take was approximately:
(defun defalias (old new)
(setf (symbol-function (make-symbol new))
(symbol-function old)))
(defun define-predicate-alias (predicate-function-name)
(let ((alias (format nil "~A?" (string-right-trim "-pP" predicate-function-name))))
(defalias predicate-function-name alias)))
(define-predicate-alias 'zerop)
(zero? '())
This fails when trying to call zero? saying
The function COMMON-LISP-USER::ZERO? is undefined.
make-symbol creates an uninterned symbol. That's why zero? is undefined.
Replace your (make-symbol new) with e.g. (intern new *package*). (Or you may want to think more carefully in which package to intern your new symbol.)
Your code makes a symbol, via MAKE-SYMBOL, but you don't put it into a package.
Use the function INTERN to add a symbol to a package.
To expand on Lars' answer, choose the right package. In this case the default might be to use the same package from the aliased function:
About style:
Anything that begins with DEF should actually be a macro. If you have a function, don't use a name beginning with "DEF". If you look at the Common Lisp language, all those are macro. For example: With those defining forms, one would typically expect that they have a side-effect during compilation of files: the compiler gets informed about them. A function can't.
If I put something like this in a file
(define-predicate-alias zerop)
(zero? '())
and then compile the file, I would expect to not see any warnings about an undefined ZERO?. Thus a macro needs to expand (define-predicate-alias 'zerop) into something which makes the new ZERO? known into the compile-time environment.
I would also make the new name the first argument.
Thus use something like MAKE-PREDICATE-ALIAS instead of DEFINE-PREDICATE-ALIAS, for the function.
There are already some answers that explain how you can do this, but I'd point out:
Naming conventions, P, and -P
Common Lisp has a naming convention that is mostly adhered to (there are exceptions, even in the standard library), that if a type name is multiple words (contains a -), then its predicate is named with -P suffix, whereas if it doesn't, the suffix is just P. So we'd have keyboardp and lcd-monitor-p. It's good then, that you're using (string-right-trim "-pP" predicate-function-name)), but since the …P and …-P names in the standard, and those generated by, e.g., defstruct, will be using P, not p, you might just use (string-right-trim "-P" predicate-function-name)). Of course, even this has the possible issues with some names (e.g., pop), but I guess that just comes with the territory.
Symbol names, format, and *print-case*
More importantly, using format to create symbol names for subsequent interning is dangerous, because format doesn't always print a symbol's name with the characters in the same case that they actually appear in its name. E.g.,
(let ((*print-case* :downcase))
(list (intern (symbol-name 'foo))
(intern (format nil "~A" 'foo))))
;=> (FOO |foo|) ; first symbol has name "FOO", second has name "foo"
You may be better off using string concatenation and extracting symbol names directly. This means you could write code like (this is slightly different use case, since the other questions already explain how you can do what you're trying to do):
(defmacro defpredicate (symbol)
(flet ((predicate-name (symbol)
(let* ((name (symbol-name symbol))
(suffix (if (find #\- name) "-P" "P")))
(intern (concatenate 'string name suffix)))))
`(defun ,(predicate-name symbol) (x)
(typep x ',symbol)))) ; however you're checking the type
(macroexpand-1 '(defpredicate zero))
;=> (DEFUN ZEROP (X) (TYPEP X 'ZERO))
(macroexpand-1 '(defpredicate lcd-monitor))
;=> (DEFUN LCD-MONITOR-P (X) (TYPEP X 'LCD-MONITOR))

Why can't CLISP call certain functions with uninterned names?

I've written an ad hoc parser generator that creates code to convert an old and little known 7-bit character set into unicode. The call to the parser generator expands into a bunch of defuns enclosed in a progn, which then get compiled. I only want to expose one of the generated defuns--the top-level one--to the rest of the system; all the others are internal to the parser and only get called from within the dynamic scope of the top-level one. Therefore, the other defuns generated have uninterned names (created with gensym). This strategy works fine with SBCL, but I recently tested it for the first time with CLISP, and I get errors like:
*** - FUNCALL: undefined function #:G16985
It seems that CLISP can't handle functions with uninterned names. (Interestingly enough, the system compiled without a problem.) EDIT: It seems that it can handle functions with uninterned names in most cases. See the answer by Rörd below.
My questions is: Is this a problem with CLISP, or is it a limitation of Common Lisp that certain implementations (e.g. SBCL) happen to overcome?
EDIT:
For example, the macro expansion of the top-level generated function (called parse) has an expression like this:
(PRINC (#:G75735 #:G75731 #:G75733 #:G75734) #:G75732)
Evaluating this expression (by calling parse) causes an error like the one above, even though the function is definitely defined within the very same macro expansion:
(DEFUN #:G75735 (#:G75742 #:G75743 #:G75744) (DECLARE (OPTIMIZE (DEBUG 2)))
(DECLARE (LEXER #:G75742) (CONS #:G75743 #:G75744))
(MULTIPLE-VALUE-BIND (#:G75745 #:G75746) (POP-TOKEN #:G75742)
...
The two instances of #:G75735 are definitely the same symbol--not two different symbols with the same name. As I said, this works with SBCL, but not with CLISP.
EDIT:
SO user Joshua Taylor has pointed out that this is due to a long standing CLISP bug.
You don't show one of the lines that give you the error, so I can only guess, but the only thing that could cause this problem as far as I can see is that you are referring to the name of the symbol instead of the symbol itself when trying to call it.
If you were referring to the symbol itself, all your lisp implementation would have to do is lookup that symbol's symbol-function. Whether it's interned or not couldn't possibly matter.
May I ask why you haven't considered another way to hide the functions, i.e. a labels statement or defining the functions within a new package that exports only the one external function?
EDIT: The following example is copied literally from an interaction with the CLISP prompt.
As you can see, calling the function named by a gensym is working as expected.
[1]> (defmacro test ()
(let ((name (gensym)))
`(progn
(defun ,name () (format t "Hello!"))
(,name))))
TEST
[2]> (test)
Hello!
NIL
Maybe your code that's trying to call the function gets evaluated before the defun? If there's any code in the macro expansion besides the various defuns, it may be implementation-dependent what gets evaluated first, and so the behaviour of SBCL and CLISP may differ without any of them violating the standard.
EDIT 2: Some further investigation shows that CLISP's behaviour varies depending upon whether the code is interpreted directly or whether it's first compiled and then interpreted. You can see the difference by either directly loading a Lisp file in CLISP or by first calling compile-file on it and then loading the FASL.
You can see what's going on by looking at the first restart that CLISP offers. It says something like "Input a value to be used instead of (FDEFINITION '#:G3219)." So for compiled code, CLISP quotes the symbol and refers to it by name.
It seems though that this behaviour is standard-conforming. The following definition can be found in the HyperSpec:
function designator n. a designator for a function; that is, an object that denotes a function and that is one of: a symbol (denoting the function named by that symbol in the global environment), or a function (denoting itself). The consequences are undefined if a symbol is used as a function designator but it does not have a global definition as a function, or it has a global definition as a macro or a special form. See also extended function designator.
I think an uninterned symbol matches the "a symbol is used as a function designator but it does not have a global definition as a function" case for unspecified consequences.
EDIT 3: (I can agree that I'm not sure whether CLISP's behaviour is a bug or not. Someone more experienced with details of the standard's terminology should judge this. It comes down to whether the function cell of an uninterned symbol - i.e. a symbol that cannot be referred to by name, only by having a direct hold on the symbol object - would be considered a "global definition" or not)
Anyway, here's an example solution that solves the problem in CLISP by interning the symbols in a throwaway package, avoiding the matter of uninterned symbols:
(defmacro test ()
(let* ((pkg (make-package (gensym)))
(name (intern (symbol-name (gensym)) pkg)))
`(progn
(defun ,name () (format t "Hello!"))
(,name))))
(test)
EDIT 4: As Joshua Taylor notes in a comment to the question, this seems to be a case of the (10 year old) CLISP bug #180.
I've tested both workarounds suggested in that bug report and found that replacing the progn with locally actually doesn't help, but replacing it with let () does.
You can most certainly define functions whose names are uninterned symbols. For instance:
CL-USER> (defun #:foo (x)
(list x))
#:FOO
CL-USER> (defparameter *name-of-function* *)
*NAME-OF-FUNCTION*
CL-USER> *name-of-function*
#:FOO
CL-USER> (funcall *name-of-function* 3)
(3)
However, the sharpsign colon syntax introduces a new symbol each time such a form is read read:
#: introduces an uninterned symbol whose name is symbol-name. Every time this syntax is encountered, a distinct uninterned symbol is created. The symbol-name must have the syntax of a symbol with no package prefix.
This means that even though something like
CL-USER> (list '#:foo '#:foo)
;=> (#:FOO #:FOO)
shows the same printed representation, you actually have two different symbols, as the following demonstrates:
CL-USER> (eq '#:foo '#:foo)
NIL
This means that if you try to call such a function by typing #: and then the name of the symbol naming the function, you're going to have trouble:
CL-USER> (#:foo 3)
; undefined function #:foo error
So, while you can call the function using something like the first example I gave, you can't do this last one. This can be kind of confusing, because the printed representation makes it look like this is what's happening. For instance, you could write such a factorial function like this:
(defun #1=#:fact (n &optional (acc 1))
(if (zerop n) acc
(#1# (1- n) (* acc n))))
using the special reader notation #1=#:fact and #1# to later refer to the same symbol. However, look what happens when you print that same form:
CL-USER> (pprint '(defun #1=#:fact (n &optional (acc 1))
(if (zerop n) acc
(#1# (1- n) (* acc n)))))
(DEFUN #:FACT (N &OPTIONAL (ACC 1))
(IF (ZEROP N)
ACC
(#:FACT (1- N) (* ACC N))))
If you take that printed output, and try to copy and paste it as a definition, the reader creates two symbols named "FACT" when it comes to the two occurrences of #:FACT, and the function won't work (and you might even get undefined function warnings):
CL-USER> (DEFUN #:FACT (N &OPTIONAL (ACC 1))
(IF (ZEROP N)
ACC
(#:FACT (1- N) (* ACC N))))
; in: DEFUN #:FACT
; (#:FACT (1- N) (* ACC N))
;
; caught STYLE-WARNING:
; undefined function: #:FACT
;
; compilation unit finished
; Undefined function:
; #:FACT
; caught 1 STYLE-WARNING condition
I hope I get the issue right. For me it works in CLISP.
I tried it like this: using a macro for creating a function with a GENSYM-ed name.
(defmacro test ()
(let ((name (gensym)))
`(progn
(defun ,name (x) (* x x))
',name)))
Now I can get the name (setf x (test)) and call it (funcall x 2).
Yes, it is perfectly fine defining functions that have names that are unintenred symbols. The problem is that you cannot then call them "by name", since you can't fetch the uninterned symbol by name (that is what "uninterned" means, essentially).
You would need to store the uninterned symbol in some sort of data structure, to then be able to fetch the symbol. Alternatively, store the defined function in some sort of data structure.
Surprisingly, CLISP bug 180 isn't actually an ANSI CL conformance bug. Not only that, but evidently, ANSI Common Lisp is itself so broken in this regard that even the progn based workaround is a courtesy of the implementation.
Common Lisp is a language intended for compilation, and compilation produces issues regarding the identity of objects which are placed into compiled files and later loaded ("externalized" objects). ANSI Common Lisp requires that literal objects reproduced from compiled files are only similar to the original objects. (CLHS 3.2.4 Literal Objects in Compiled Files).
Firstly, according to the definition similarity (3.2.4.2.2 Definition of Similarity), the rules for uninterned symbols is that similarity is name based. If we compile code with a literal that contains an uninterned symbol, then when we load the compiled file, we get a symbol which is similar and not (necessarily) the same object: a symbol which has the same name.
What if the same uninterned symbol is inserted into two different top-level forms which are then compiled as a file? When the file is loaded, are those two similar to each other at least? No, there is no such requirement.
But it gets worse: there is also no requirement that two occurrences of the same uninterned symbol in the same form will be externalized in such a way that their relative identity is preserved: that the re-loaded version of that object will have the same symbol object in all the places where the original was. In fact, the definition of similarity contains no provision for preserving the circular structure and substructure sharing. If we have a literal like '#1=(a b . #1#), as a literal in a compiled file, there appears to be no requirement that this be reproduced as a circular object with the same graph structure as the original (a graph isomorphism). The similarity rule for conses is given as naive recursion: two conses are similar if their respective cars and cdrs are similar. (The rule can't even be evaluated for circular objects; it doesn't terminate).
That the above works is because of implementations going beyond what is required in the spec; they are providing an extension consistent with (3.2.4.3 Extensions to Similarity Rules).
Thus, purely according to ANSI CL, we cannot expect to use macros with gensyms in compiled files, at least in some ways. The expectation expressed in code like the following runs afoul of the spec:
(defmacro foo (arg)
(let ((g (gensym))
(literal '(blah ,g ,g ,arg)))
...))
(defun bar ()
(foo 42))
The bar function contains a literal with two insertions of a gensym, which according to the similarity rules for conses and symbols need not reproduce as a list containing two occurrences of the same object in the second and third positions.
If the above works as expected, it's due to "extensions to the similarity rules".
So the answer to the "Why can't CLISP ..." question is that although CLISP does provide an extension for similarity which preserves the graph structure of literal forms, it doesn't do it across the entire compiled file, only within individual top level items within that file. (It uses *print-circle* to emit the individual items.) The bug is that CLISP doesn't conform to the best possible behavior users can imagine, or at least to a better behavior exhibited by other implementations.

How to acquire unique object id in Emacs Lisp?

Does emacs lisp have a function that provides a unique object identifier, such as e.g. a memory address? Python has id(), which returns an integer guaranteed to be unique among presently existing objects. What about elisp?
The only reason I know for wanting a function like id() is to compare objects, and ensure that they only compare equal if they are the same (as in, in the same memory location). In Lisps, this is done a bit differently from in Python:
In most lisps, including elisp, there are several different notions of equality. The most expensive, and weakest equivalence is equal. This is not what you want, since two lists (say) are equal if they have the same elements (tested recursively with equal). As such
(equal (list 1 2) (list 1 2)) => T
is true. At the other end of the spectrum is eq, which tests "identity" rather than equality:
(eq (list 1 2) (list 1 2)) => NIL
This is what you want, I think.
So, it seems that Python works by providing one equality test, and then a function that gives you a memory location for each object, which then can be compared as integers. In Elisp (and at least Common Lisp too), on the other hand, there is more than one meaning of "equality".
Note, there is also "eql", which lies somewhere between the two.
(EDIT: My original answer probably wasn't clear enough about why the distinction between eq and equal probably solves the problem the original poster was having)
There is no such feature in Emacs Lisp, as far as I know. If you only need equality, use eq, which performs a pointer comparison behind the scenes.
If you need a printable unique identifier, use gensym from the cl package.
If you need a unique identifier to serve as an index in a data structure, use gensym (or maintain your own unique id — gensym is simpler and less error-prone).
Some languages bake a unique id into every object, but this has a cost: either every object needs extra memory to store the id, or the id is derived from the address of the object, which precludes modifying the address. Python chooses to pay the cost, Emacs chooses not to.
My whole point in asking the question was that I was looking for a way to distinguish between the printed representations of different symbols that have the same name. Thanks to the elisp manual, I've discovered the variable print-gensym, which, when non-nil, causes #: to be prepended to uninterned symbols printed. Moreover, if the same call to print prints the same uninterned symbol more than once, it will mark the first one with #N= and subsequent ones with `#N#. This is exactly the kind of functionality I was looking for. For example:
(setq print-gensym t)
==> t
(make-symbol "foo")
==> #:foo
(setq a (make-symbol "foo"))
==> #:foo
(cons a a)
==> (#1=#:foo . #1#)
(setq b (make-symbol "foo"))
==> #:foo
(cons a b)
==> (#:foo . #:foo)
The #: notation works for read as well:
(setq a '#:foo)
==> #:foo
(symbol-name a)
==> "foo"
Note the ' on '#:foo--the #: notation is a symbol-literal. Without the ', the uninterned symbol is evaluated:
(symbol-name '#:foo)
==> "foo"
(symbol-name #:foo)
==> (void-variable #:foo)