Related
contextualization: I've been doing a university project in which I have to write a parser for regular expressions and build the corresponding epsilon-NFA. I have to do this in Prolog and Lisp.
I don't know if questions like this are allowed, if not I apologize.
I heard some of my classmates talking about how they used the function gensym for that, I asked them what it did and even checked up online but I literally can't understand what this function does neither why or when is best to use it.
In particular, I'm more intrested in what it does in Lisp.
Thank you all.
GENSYM creates unique symbols. Each call creates a new symbol. The symbol usually has a name which includes a number, which is counted up. The name is also unique (the symbol itself is already unique) with a number, so that a human reader can identify different uninterned symbols in the source code.
CL-USER 39 > (gensym)
#:G1083
CL-USER 40 > (gensym)
#:G1084
CL-USER 41 > (gensym)
#:G1085
CL-USER 42 > (gensym)
#:G1086
gensym is often used in Lisp macros for code generation, when the macro needs to create new identifiers, which then don't clash with existing identifiers.
Example: we are going to double the result of a Lisp form and we are making sure that the Lisp form itself will be computed only once. We do that by saving the value in a local variable. The identifier for the local variable will be computed by gensym.
CL-USER 43 > (defmacro double-it (it)
(let ((new-identifier (gensym)))
`(let ((,new-identifier ,it))
(+ ,new-identifier ,new-identifier))))
DOUBLE-IT
CL-USER 44 > (macroexpand-1 '(double-it (cos 1.4)))
(LET ((#:G1091 (COS 1.4)))
(+ #:G1091 #:G1091))
T
CL-USER 45 > (double-it (cos 1.4))
0.33993432
a little clarification of the existing answers (as the op is not yet aware of the typical common lisp macros workflow):
consider the macro double-it, proposed by mr. Joswig. Why would we bother creating this whole bunch of let? when it can be simply:
(defmacro double-it (it)
`(+ ,it ,it))
and ok, it seems to be working:
CL-USER> (double-it 1)
;;=> 2
but look at this, we want to increment x and double it
CL-USER> (let ((x 1))
(double-it (incf x)))
;;=> 5
;; WHAT? it should be 4!
the reason can be seen in macro expansion:
(let ((x 1))
(+ (setq x (+ 1 x)) (setq x (+ 1 x))))
you see, as the macro doesn't evaluate form, just splices it into generated code, it leads to incf being executed twice.
the simple solution is to bind it somewhere, and then double the result:
(defmacro double-it (it)
`(let ((x ,it))
(+ x x)))
CL-USER> (let ((x 1))
(double-it (incf x)))
;;=> 4
;; NICE!
it seems to be ok now. really it expands like this:
(let ((x 1))
(let ((x (setq x (+ 1 x))))
(+ x x)))
ok, so what about the gensym thing?
let's say, you want to print some message, before doubling your value:
(defmacro double-it (it)
`(let* ((v "DOUBLING IT")
(val ,it))
(princ v)
(+ val val)))
CL-USER> (let ((x 1))
(double-it (incf x)))
;;=> DOUBLING IT
;;=> 4
;; still ok!
but what if you accidentally name value v instead of x:
CL-USER> (let ((v 1))
(double-it (incf v)))
;;Value of V in (+ 1 V) is "DOUBLING IT", not a NUMBER.
;; [Condition of type SIMPLE-TYPE-ERROR]
It throws this weird error! Look at the expansion:
(let ((v 1))
(let* ((v "DOUBLING IT") (val (setq v (+ 1 v))))
(princ v)
(+ val val)))
it shadows the v from the outer scope with string, and when you are trying to add 1, well it obviously can't. Too bad.
another example, say you want to call the function twice, and return 2 results as a list:
(defmacro two-funcalls (f v)
`(let ((x ,f))
(list (funcall x ,v) (funcall x ,v))))
CL-USER> (let ((y 10))
(two-funcalls (lambda (z) z) y))
;;=> (10 10)
;; OK
CL-USER> (let ((x 10))
(two-funcalls (lambda (z) z) x))
;; (#<FUNCTION (LAMBDA (Z)) {52D2D4AB}> #<FUNCTION (LAMBDA (Z)) {52D2D4AB}>)
;; NOT OK!
this class of bugs is very nasty, since you can't easily say what's happened.
What is the solution? Obviously not to name the value v inside macro. You need to generate some sophisticated name that no one would reproduce in their code, like my-super-unique-value-identifier-2019-12-27. This would probably save you, but still you can't really be sure. That's why gensym is there:
(defmacro two-funcalls (f v)
(let ((fname (gensym)))
`(let ((,fname ,f))
(list (funcall ,fname ,v) (funcall ,fname ,v)))))
expanding to:
(let ((y 10))
(let ((#:g654 (lambda (z) z)))
(list (funcall #:g654 y) (funcall #:g654 y))))
you just generate the var name for the generated code, it is guaranteed to be unique (meaning no two gensym calls would generate the same name for the runtime session),
(loop repeat 3 collect (gensym))
;;=> (#:G645 #:G646 #:G647)
it still can potentially be clashed with user var somehow, but everybody knows about the naming and doesn't call the var #:GXXXX, so you can consider it to be impossible. You can further secure it, adding prefix
(loop repeat 3 collect (gensym "MY_GUID"))
;;=> (#:MY_GUID651 #:MY_GUID652 #:MY_GUID653)
GENSYM will generate a new symbol at each call. It will be garanteed, that the symbol did not exist before it will be generated and that it will never be generated again. You may specify a symbols prefix, if you like:
CL-USER> (gensym)
#:G736
CL-USER> (gensym "SOMETHING")
#:SOMETHING737
The most common use of GENSYM is generating names for items to avoid name clashes in macro expansion.
Another common purpose is the generaton of symbols for the construction of graphs, if the only thing demand you have is to attach a property list to them, while the name of the node is not of interest.
I think, the task of NFA-generation could make good use of the second purpose.
This is a note to some of the other answers, which I think are fine. While gensym is the traditional way of making new symbols, in fact there is another way which works perfectly well and is often better I find: make-symbol:
make-symbol creates and returns a fresh, uninterned symbol whose name is the given name. The new-symbol is neither bound nor fbound and has a null property list.
So, the nice thing about make-symbol is it makes a symbol with the name you asked for, exactly, without any weird numerical suffix. This can be helpful when writing macros because it makes the macroexpansion more readable. Consider this simple list-collection macro:
(defmacro collecting (&body forms)
(let ((resultsn (make-symbol "RESULTS"))
(rtailn (make-symbol "RTAIL")))
`(let ((,resultsn '())
(,rtailn nil))
(flet ((collect (it)
(let ((new (list it)))
(if (null ,rtailn)
(setf ,resultsn new
,rtailn new)
(setf (cdr ,rtailn) new
,rtailn new)))
it))
,#forms
,resultsn))))
This needs two bindings which the body can't refer to, for the results, and the last cons of the results. It also introduces a function in a way which is intentionally 'unhygienic': inside collecting, collect means 'collect something'.
So now
> (collecting (collect 1) (collect 2) 3)
(1 2)
as we want, and we can look at the macroexpansion to see that the introduced bindings have names which make some kind of sense:
> (macroexpand '(collecting (collect 1)))
(let ((#:results 'nil) (#:rtail nil))
(flet ((collect (it)
(let ((new (list it)))
(if (null #:rtail)
(setf #:results new #:rtail new)
(setf (cdr #:rtail) new #:rtail new)))
it))
(collect 1)
#:results))
t
And we can persuade the Lisp printer to tell us that in fact all these uninterned symbols are the same:
> (let ((*print-circle* t))
(pprint (macroexpand '(collecting (collect 1)))))
(let ((#2=#:results 'nil) (#1=#:rtail nil))
(flet ((collect (it)
(let ((new (list it)))
(if (null #1#)
(setf #2# new #1# new)
(setf (cdr #1#) new #1# new)))
it))
(collect 1)
#2#))
So, for writing macros I generally find make-symbol more useful than gensym. For writing things where I just need a symbol as an object, such as naming a node in some structure, then gensym is probably more useful. Finally note that gensym can be implemented in terms of make-symbol:
(defun my-gensym (&optional (thing "G"))
;; I think this is GENSYM
(check-type thing (or string (integer 0)))
(let ((prefix (typecase thing
(string thing)
(t "G")))
(count (typecase thing
((integer 0) thing)
(t (prog1 *gensym-counter*
(incf *gensym-counter*))))))
(make-symbol (format nil "~A~D" prefix count))))
(This may be buggy.)
I'm just starting to learn the concept of macro functions.
My teacher has asked us to create a macro function that would function exactly the same way as incf.
Here is an example he has given us for pop
(defmacro mypop (nom)
(list 'prog1 (list 'car nom) (list 'setq nom (list 'cdr nom))) )
Here is the regular function I'm trying to turn into a macro:
(defun iincf (elem &optional num )
(cond
((not num) (setq elem (+ 1 elem)))
(t (setq elem (+ num elem))) ) )
Here is my attempt at turning it into a macro :
(defmacro myincf (elem &optional num )
(list 'cond
((list 'not num) (list 'setq elem (list '+ 1 elem)))
(t (list 'setq elem (list '+ num elem))) ) )
However, I get this error and I don't know why:
*** - system::%expand-form: (list 'not num) should be a lambda expression
Also, I'm not sure whether my function would actually change the value of the variable at the top level.
So here are my 2 questions:
Why do I get this error?
Is the function I'm trying to turn into a macro fine? (if successfully turning it into a macro function, would it do what I intend to?)
PS: I know this exercise would probably infringe many common rules in lisp, but this is just for practice. Thanks! :)
The reason for the error is that your syntax is invalid:
((list ...) ...)
(t (list ...))
The first element should be a function name or a lambda expression, so you would need to change it to something like
(list (list ...) ...)
(list t (list ...))
Although the macro isn't a very good one yet. First of all, the backquote syntax would make the code much more readable. It allows you to write a template where only the specified forms are evaluated. For example, the given MYPOP macro would look like
(defmacro mypop (nom)
`(prog1 (car ,nom)
(setq ,nom (cdr ,nom))))
Only the forms with a comma before them are evaluated. Same with your macro:
(defmacro myincf (elem &optional num)
`(cond
((not ,num) (setq ,elem (+ 1 ,elem)))
(t (setq ,elem (+ ,num ,elem)))))
The COND shouldn't really be part of the expansion though. It should be evaluated during macroexpansion, and only the SETQ form from one of the branches returned.
(defmacro myincf (elem &optional num)
(cond
((not num) `(setq ,elem (+ 1 ,elem)))
(t `(setq ,elem (+ ,num ,elem)))))
The only difference between the two branches is that the first one defaults to 1 for NUM. A simpler way to achieve the same would be to give NUM a default value.
(defmacro myincf (elem &optional (num 1))
`(setq ,elem (+ ,num ,elem)))
Of course, the standard INCF is a bit more complex, since it works for all sorts of places (not just variables) and ensures that the subforms of the place are evaluated only once. However, since the MYPOP example doesn't handle those, I don't think you have to either.
If you want to, a simple way to define such a macro would be
(define-modify-macro myincf (&optional (num 1)) +)
Or you could do the same manually with something like
(defmacro myincf (place &optional (num 1) &environment env)
(multiple-value-bind (dummies vals store setter getter)
(get-setf-expansion place env)
`(let* (,#(mapcar #'list dummies vals)
(,(first store) (+ ,getter ,num)))
,setter)))
But using DEFINE-MODIFY-MACRO would be preferrable in a real program (shorter code, less bugs). You could read about GET-SETF-EXPANSION and DEFINE-MODIFY-MACRO if you're interested.
In Section 12.4 of On Lisp, Paul Graham writes, "Unfortunately, we can't define a correct _f with define-modify-macro, because the operator to be applied to the generalized variable is given as an argument."
But what's wrong with something like this?
(define-modify-macro _f (op operand)
(lambda (x op operand)
(funcall op x operand)))
(let ((lst '(1 2 3)))
(_f (second lst) #'* 6)
lst)
=> (1 12 3)
Has there perhaps been a change made to define-modify-macro in ANSI Common Lisp that wasn't valid at the time On Lisp was written? Or are there reasons other than the one stated for not using define-modify-macro here?
It appears that Graham want's to be able to make a call such as
(_f * (second lst) 6)
rather than
(_f #'* (second lst) 6)
But surely that's not in keeping with a Lisp2 such as Common Lisp?
According to both Lispworks's Hyperspec and CLtL2 (look for define-modify-macro), the function is assumed to be a symbol (to a function or a macro). As far as I know, the following definition might not be conforming the specification:
(define-modify-macro _f (op operand)
(lambda (x op operand)
(funcall op x operand)))
But of course, it is possible that an implementation allows it.
To be sure you are conforming to the standard, you can define your own function, or even a macro:
(defmacro funcall-1 (val fun &rest args)
`(funcall ,fun ,val ,#args))
(define-modify-macro _ff (&rest args) funcall-1)
(let ((x (list 1 2 3 4)))
(_ff (third x) #'+ 10)
x)
If you wanted to have the function as a second argument, you could define another macro:
(defmacro ff (fun-form place &rest args)
`(_ff ,place ,fun-form ,#args))
Basically, your approach consists in wrapping funcall in define-modify-macro, and give the desired function as an argument of that function. At first sight, it looks like a hack, but as we can see below, this gives the same macroexanded code as the one in On Lisp, assuming we modify the latter a little.
The macroexpansion of the above is:
(LET ((X (LIST 1 2 3 4)))
(LET* ((#:G1164 X) (#:G1165 (FUNCALL #'+ (THIRD #:G1164) 10)))
(SB-KERNEL:%RPLACA (CDDR #:G1164) #:G1165))
X)
The version in On Lisp behaves as follows:
(defmacro _f (op place &rest args)
(multiple-value-bind (vars forms var set access)
(get-setf-expansion
place)
`(let* (,#(mapcar #'list vars forms)
(, (car var) (,op ,access ,#args)))
,set)))
(let ((x (list 1 2 3 4)))
(_f * (third x) 10)
x)
Macroexpansion:
(LET ((X (LIST 1 2 3 4)))
(LET* ((#:G1174 X) (#:G1175 (* (THIRD #:G1174) 10)))
(SB-KERNEL:%RPLACA (CDDR #:G1174) #:G1175))
X)
Here, the * is injected directly by the macroexpansion, which means that the resulting code has no possible runtime overhead (though compilers would probably handle your (funcall #'+ ...) equally well). If you pass #'+ to the macro, it fails to macroexpand. This is the major difference with your approach, but not a big limitation. In order to allow the On Lisp version to accept #'*, or even (create-closure) as an operator, it should be modified as follows:
(defmacro _f (op place &rest args)
(multiple-value-bind (vars forms var set access)
(get-setf-expansion
place)
`(let* (,#(mapcar #'list vars forms)
(, (car var) (funcall ,op ,access ,#args)))
,set)))
(see the call to funcall)
The previous example is then expanded as follows, for #'*:
(LET ((X (LIST 1 2 3 4)))
(LET* ((#:G1180 X) (#:G1181 (FUNCALL #'* (THIRD #:G1180) 10)))
(SB-KERNEL:%RPLACA (CDDR #:G1180) #:G1181))
X)
Now, it is exactly as your version. On Lisp uses _f to demonstrate how to use get-setf-expansion, and _f is a good example for that. But on the other hand, your implementation seems equally good.
On the question of whether one might prefer to pass * or #'*, we can also note that the define-modify-macro version of _f and #coredump's adapted version (with funcall) both accept lambda forms in the op position with or without #' e.g. both (lambda (x y) (* x y)) and #'(lambda (x y) (* x y)), whereas Graham's original version accepts only the former.
Interestingly in his book Let over Lambda, Doug Hoyte draws attention to a remark by Graham in his book ANSI Common Lisp that being able to omit the #' before a lambda form provides "a specious form of elegance at best" before going on to prefer to omit it.
I'm not taking a stand either way, merely pointing out that given Graham's choice for _f, the absence of the #' is no longer specious but necessary.
In Python, I am able to use yield to build up a list without having to define a temporary variable:
def get_chars_skipping_bar(word):
while word:
# Imperative logic which can't be
# replaced with a for loop.
if word[:3] == 'bar':
word = word[3:]
else:
yield foo[0]
foo = foo[1:]
In elisp, I can't see any way of doing this, either built-in or using any pre-existing libraries. I'm forced to manually build a up a list and call nreverse on it. Since this is a common pattern, I've written my own macro:
(require 'dash)
(require 'cl)
(defun replace-calls (form x func)
"Replace all calls to X (a symbol) in FORM,
calling FUNC to generate the replacement."
(--map
(cond
((consp it)
(if (eq (car it) x)
(funcall func it)
(replace-calls it x func)))
(:else it))
form))
(defmacro with-results (&rest body)
"Execute BODY, which may contain forms (yield foo).
Return a list built up from all the values passed to yield."
(let ((results (gensym "results")))
`(let ((,results (list)))
,#(replace-calls body 'yield
(lambda (form) `(push ,(second form) ,results)))
(nreverse ,results))))
Example usage:
(setq foo "barbazbarbarbiz")
(with-results
(while (not (s-equals? "" foo))
;; Imperative logic which can't be replaced with cl-loop's across.
(if (s-starts-with? "bar" foo)
(setq foo (substring foo 3))
(progn
(yield (substring foo 0 1))
(setq foo (substring foo 1))))))
There must be a better way of doing this, or an existing solution, somewhere in elisp, cl.el, or a library.
The Python function is actually a generator. In ANSI Common Lisp, we would usually reach for a lexical closure to simulate a generator, or else us a library to define generators directly, like Pygen. Maybe these approaches can be ported to Emacs Lisp.
AFAIK, people just use push+nreverse like you do. If you want to define your macro in a more robust way (e.g. so it doesn't misfire on something like (memq sym '(yield stop next))) you could do it as:
(defmacro with-results (&rest body)
"Execute BODY, which may contain forms (yield EXP).
Return a list built up from all the values passed to `yield'."
(let ((results (gensym "results")))
`(let ((,results '()))
(cl-macrolet ((yield (exp) `(push ,exp ,results)))
,#body)
(nreverse ,results))))
Maybe something like this:
(setq foo "barbaz")
(cl-loop for i from 0 to (1- (length foo))
collect (string (aref foo i)))
In any case, there's nothing wrong with push and nreverse.
Lisp is different from Python. yield is not used. I also see the use of coroutine-like constructs for this as a mistake. It's the equivalent of the come-from construct. Suddenly routines have multiple context dependent entry points.
In Lisp use functions/closures instead.
In Common Lisp, the LOOP macro allows efficient mappings over vectors. The following code can be abstracted to some mapping function, if preferred:
CL-USER 17 > (defun chars-without-substring (string substring)
(loop with i = 0
while (< i (length string))
when (and (>= (- (length string) i) (length substring))
(string= substring string
:start2 i
:end2 (+ i (length substring))))
do (incf i (length substring))
else
collect (prog1 (char string i) (incf i))))
CHARS-WITHOUT-SUBSTRING
CL-USER 18 > (chars-without-substring "barbazbarbarbiz" "bar")
(#\b #\a #\z #\b #\i #\z)
I'm reading/working through Practical Common Lisp. I'm on the chapter about building a test framework in Lisp.
I have the function "test-+" implemented as below, and it works:
(defun test-+ ()
(check
(= (+ 1 2) 3)
(= (+ 5 6) 11)
(= (+ -1 -6) -7)))
Remember, I said, it works, which is why what follows is so baffling....
Here is some code that "test-+" refers to:
(defmacro check (&body forms)
`(combine-results
,#(loop for f in forms collect `(report-result ,f ',f))))
(defmacro combine-results (&body forms)
(with-gensyms (result)
`(let ((,result t))
,#(loop for f in forms collect `(unless ,f (setf ,result nil)))
,result)))
(defmacro with-gensyms ((&rest names) &body body)
`(let ,(loop for n in names collect `(,n (gensym)))
,#body))
(defun report-result (value form)
(format t "~:[FAIL~;pass~] ... ~a~%" value form)
value)
Now, what I've been doing is using Slime to macro-expand these, step by step (using ctrl-c RET, which is mapped to macroexpand-1).
So, the "check" call of "test-+" expands to this:
(COMBINE-RESULTS
(REPORT-RESULT (= (+ 1 2) 3) '(= (+ 1 2) 3))
(REPORT-RESULT (= (+ 5 6) 11) '(= (+ 5 6) 11))
(REPORT-RESULT (= (+ -1 -6) -7) '(= (+ -1 -6) -7)))
And then that macro-expands to this:
(LET ((#:G2867 T))
(UNLESS (REPORT-RESULT (= (+ 1 2) 3) '(= (+ 1 2) 3)) (SETF #:G2867 NIL))
(UNLESS (REPORT-RESULT (= (+ 5 6) 11) '(= (+ 5 6) 11)) (SETF #:G2867 NIL))
(UNLESS (REPORT-RESULT (= (+ -1 -6) -7) '(= (+ -1 -6) -7))
(SETF #:G2867 NIL))
#:G2867)
And it is THAT code, directly above this sentence, which doesn't work. If I paste that into the REPL, I get the following error (I'm using Clozure Common Lisp):
Unbound variable: #:G2867 [Condition of type UNBOUND-VARIABLE]
Now, if I take that same code, replace the gensym with a variable name such as "x", it works just fine.
So, how can we explain the following surprises:
The "test-+" macro, which calls all of this, works fine.
The macro-expansion of the "combine-results" macro does not run.
If I remove the gensym from the macro-expansion of "combine-results", it
does work.
The only thing I can speculate is that you cannot use code the contains literal usages of gensyms. If so, why not, and how does one work around that? And if that is not the explanation, what is?
Thanks.
GENSYM creates uninterned symbols. When the macro runs normally, this isn't a problem, because the same uninterned symbol is being substituted throughout the expression.
But when you copy and paste the expression into the REPL, this doesn't happen. #: tells the reader to return an uninterned symbol. As a result, each occurrence of #:G2867 is a different symbol, and you get the unbound variable warning.
If you do (setq *print-circle* t) before doing the MACROEXPAND it will use #n= and #n# notation to link the identical symbols together.
The code, after being printed and read back, is no longer the same code. In particular, the two instances of #:G2867 in the printed representation would be read back as two separated symbols (albeit sharing the same name), while they should be the same in the original internal representation.
Try setting *PRINT-CIRCLE* to T to preserve the identity in the printed representation of the macro-expanded code.