What is the best way of combining &key and &rest in a lisp macro's lambda list? - macros

I implemented Heap's algorithm using a macro. It's working OK, but I would like to tweak it so it will generate anaphoric or non-anaphoric code on demand. In other words, I would like to have the macro either make an internal copy of the sequence it will permute or work on a sequence available outside the macro.
My utterly unsatisfactory, downright embarrassing code is:
;; Anaphoric version
;; To make it non-anaphoric, substitute (,var (copy-seq ,vec)) for (,var ,vec)
(defmacro run-permutations (var vec &rest body)
  "Executes body for all permutations of vec, which is stored in variable var"
  `(let ((,var ,vec))
     (labels ((generate (&optional (n (length ,var)))
                (if (= n 1)
                    (progn ,@body)
                    (progn
                      (loop for i from 0 below (1- n)
                            do (progn
                                 (generate (1- n))
                                 (rotatef (aref ,var (if (evenp n) i 0))
                                          (aref ,var (1- n)))))
                      (generate (1- n))))))
       (generate))))
? (run-permutations v "123" (pprint v))
"123"
"213"
"312"
"132"
"231"
"321"
?
I would like to write something that worked like this...
? (setf v "123")
? (run-permutations :anaphoric t v "123" (...do stuff...))
? v
"321"
? (setf v "123")
? (run-permutations v "123" (...do stuff...))
? v
"123"
...but I haven't found a satisfactory combination of &rest and &key or any other approach for writing the lambda list.
So my question is: is there a way of accomplishing that, preferably without writing more code to parse the macro's lambda list? Or is there another, more or less standard (and presumably more elegant) solution out there? I strongly suspect the latter.
Your input is much appreciated. As always, any other comments on the code are appreciated as well.
UPDATE
Brilliant! I opted to use a gensym for n because body is called from within the recursion and I can't see how it could be called from elsewhere—at least not without rewriting everything.
I've also added another feature and a minor optimization. In case you're curious, the updated version is:
;; The default for len is a form evaluated at run time, so vec may be any
;; expression, not only a literal sequence.
(defmacro do-permutations ((var vec &key anaphoric (len `(length ,var))) &body body)
  "Executes body for all permutations of vec, which is stored in variable var.
KEYS:
anaphoric: if defined, modifies var outside the macro, preserves it otherwise
len: number of items that will be permuted, default is the full vector"
  (let ((n (gensym)))
    `(let ((,var ,(if anaphoric vec `(copy-seq ,vec))))
       (labels ((generate (&optional (,n ,len))
                  (if (= ,n 1)
                      (progn ,@body)
                      (let ((n-1 (1- ,n)))
                        (loop for i from 0 below n-1
                              do (progn
                                   (generate n-1)
                                   (rotatef (aref ,var (if (evenp ,n) i 0))
                                            (aref ,var n-1))))
                        (generate n-1)))))
         (generate)))))
Finally, I tried to remove the progn after do but it didn't work because 2 expressions have to be evaluated at that point.

Indent your code correctly:
(defmacro run-permutations (var vec &rest body)
  "Executes body for all permutations of vec, which is stored in variable var"
  `(let ((,var ,vec))
     (labels ((generate (&optional (n (length ,var)))
                (if (= n 1)
                    (progn ,@body)
                    (progn
                      (loop for i from 0 below (1- n)
                            do (progn
                                 (generate (1- n))
                                 (rotatef (aref ,var (if (evenp n) i 0))
                                          (aref ,var (1- n)))))
                      (generate (1- n))))))
       (generate))))
Use something like:
(do-permutations (v "123" :anaphoric t)
  (some)
  (stuff))
with a macro:
(defmacro do-permutations ((var vec &key anaphoric) &body body)
  ...)
other names: doing-permutations, with-permutations, ...
Note also that the body can be declared with &body, instead of &rest. The semantics are the same, but one expects it to be indented differently. &body signals that a list of Lisp forms follows.
You also don't need a progn in a loop after do.
The body sees the variable n. You may think of another place for the body...
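A minimal sketch of that lambda list in action, assuming a hypothetical macro with-working-sequence (not part of the question) that performs only the binding step. The destructuring pattern takes the variable, the sequence form, and an optional keyword, while &body collects the remaining forms. Note that the :anaphoric flag is inspected at macroexpansion time, so it has to be a literal in the call.

```lisp
;; Hypothetical macro, for illustration only: binds VAR either to the
;; sequence itself (:anaphoric t) or to a fresh copy (the default).
(defmacro with-working-sequence ((var seq &key anaphoric) &body body)
  `(let ((,var ,(if anaphoric seq `(copy-seq ,seq))))
     ,@body))

(defparameter *digits* (copy-seq "123"))

;; Default: the body works on a copy, so *digits* is left untouched.
(with-working-sequence (v *digits*)
  (setf (aref v 0) #\9)
  v)

;; With :anaphoric t the body modifies *digits* itself.
(with-working-sequence (v *digits* :anaphoric t)
  (setf (aref v 0) #\9)
  v)
```

Because the keyword lives inside the first sublist, no extra parsing code is needed to separate the options from the body.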

Related

What does gensym do in Lisp?

contextualization: I've been doing a university project in which I have to write a parser for regular expressions and build the corresponding epsilon-NFA. I have to do this in Prolog and Lisp.
I don't know if questions like this are allowed, if not I apologize.
I heard some of my classmates talking about how they used the function gensym for that. I asked them what it did and even looked it up online, but I literally can't understand what this function does, nor why or when it is best to use it.
In particular, I'm more interested in what it does in Lisp.
Thank you all.
GENSYM creates unique symbols. Each call creates a new symbol. The symbol usually has a name which includes a number, which is counted up. The number is not what makes the symbol unique (a fresh uninterned symbol is already unique); it is there so that a human reader can tell different uninterned symbols apart in printed code.
CL-USER 39 > (gensym)
#:G1083
CL-USER 40 > (gensym)
#:G1084
CL-USER 41 > (gensym)
#:G1085
CL-USER 42 > (gensym)
#:G1086
gensym is often used in Lisp macros for code generation, when the macro needs to create new identifiers, which then don't clash with existing identifiers.
Example: we are going to double the result of a Lisp form and we are making sure that the Lisp form itself will be computed only once. We do that by saving the value in a local variable. The identifier for the local variable will be computed by gensym.
CL-USER 43 > (defmacro double-it (it)
               (let ((new-identifier (gensym)))
                 `(let ((,new-identifier ,it))
                    (+ ,new-identifier ,new-identifier))))
DOUBLE-IT
CL-USER 44 > (macroexpand-1 '(double-it (cos 1.4)))
(LET ((#:G1091 (COS 1.4)))
  (+ #:G1091 #:G1091))
T
CL-USER 45 > (double-it (cos 1.4))
0.33993432
a little clarification of the existing answers (as the op is not yet aware of the typical common lisp macros workflow):
consider the macro double-it, proposed by mr. Joswig. Why would we bother creating this whole bunch of let? when it can be simply:
(defmacro double-it (it)
  `(+ ,it ,it))
and ok, it seems to be working:
CL-USER> (double-it 1)
;;=> 2
but look at this, we want to increment x and double it
CL-USER> (let ((x 1))
           (double-it (incf x)))
;;=> 5
;; WHAT? it should be 4!
the reason can be seen in macro expansion:
(let ((x 1))
  (+ (setq x (+ 1 x)) (setq x (+ 1 x))))
you see, as the macro doesn't evaluate form, just splices it into generated code, it leads to incf being executed twice.
the simple solution is to bind it somewhere, and then double the result:
(defmacro double-it (it)
  `(let ((x ,it))
     (+ x x)))
CL-USER> (let ((x 1))
           (double-it (incf x)))
;;=> 4
;; NICE!
it seems to be ok now. really it expands like this:
(let ((x 1))
  (let ((x (setq x (+ 1 x))))
    (+ x x)))
ok, so what about the gensym thing?
let's say, you want to print some message, before doubling your value:
(defmacro double-it (it)
  `(let* ((v "DOUBLING IT")
          (val ,it))
     (princ v)
     (+ val val)))
CL-USER> (let ((x 1))
           (double-it (incf x)))
;;=> DOUBLING IT
;;=> 4
;; still ok!
but what if you accidentally name value v instead of x:
CL-USER> (let ((v 1))
           (double-it (incf v)))
;;Value of V in (+ 1 V) is "DOUBLING IT", not a NUMBER.
;; [Condition of type SIMPLE-TYPE-ERROR]
It throws this weird error! Look at the expansion:
(let ((v 1))
  (let* ((v "DOUBLING IT") (val (setq v (+ 1 v))))
    (princ v)
    (+ val val)))
it shadows the v from the outer scope with string, and when you are trying to add 1, well it obviously can't. Too bad.
another example, say you want to call the function twice, and return 2 results as a list:
(defmacro two-funcalls (f v)
  `(let ((x ,f))
     (list (funcall x ,v) (funcall x ,v))))
CL-USER> (let ((y 10))
           (two-funcalls (lambda (z) z) y))
;;=> (10 10)
;; OK
CL-USER> (let ((x 10))
           (two-funcalls (lambda (z) z) x))
;; (#<FUNCTION (LAMBDA (Z)) {52D2D4AB}> #<FUNCTION (LAMBDA (Z)) {52D2D4AB}>)
;; NOT OK!
this class of bugs is very nasty, since you can't easily say what's happened.
What is the solution? Obviously not to name the value v inside macro. You need to generate some sophisticated name that no one would reproduce in their code, like my-super-unique-value-identifier-2019-12-27. This would probably save you, but still you can't really be sure. That's why gensym is there:
(defmacro two-funcalls (f v)
  (let ((fname (gensym)))
    `(let ((,fname ,f))
       (list (funcall ,fname ,v) (funcall ,fname ,v)))))
expanding to:
(let ((y 10))
  (let ((#:g654 (lambda (z) z)))
    (list (funcall #:g654 y) (funcall #:g654 y))))
you just generate the var name for the generated code, it is guaranteed to be unique (meaning no two gensym calls would generate the same name for the runtime session),
(loop repeat 3 collect (gensym))
;;=> (#:G645 #:G646 #:G647)
the symbol is also uninterned, so the reader never returns that same symbol for anything the user types, even a name that prints identically like #:G645; in practice a clash with a user variable is therefore not possible. You can also pass a prefix, which makes the expansion easier to read:
(loop repeat 3 collect (gensym "MY_GUID"))
;;=> (#:MY_GUID651 #:MY_GUID652 #:MY_GUID653)
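To round off the earlier example, here is a sketch of the message-printing macro rewritten with gensym. The name double-it-safely and the prefixes are made up for illustration; the point is only that both internal bindings now use generated symbols, so a user variable named v (or anything else) is no longer shadowed.

```lisp
;; Hypothetical gensym-based variant of the DOUBLE-IT macro shown above.
(defmacro double-it-safely (form)
  (let ((message (gensym "MESSAGE"))
        (value (gensym "VALUE")))
    `(let* ((,message "DOUBLING IT")
            (,value ,form))       ; FORM is evaluated exactly once
       (princ ,message)
       (+ ,value ,value))))

;; The call that failed before: the user's variable is named V.
(let ((v 1))
  (double-it-safely (incf v)))
```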
GENSYM will generate a new symbol at each call. It is guaranteed that the symbol did not exist before it was generated and that it will never be generated again. You may specify a prefix for the symbol's name, if you like:
CL-USER> (gensym)
#:G736
CL-USER> (gensym "SOMETHING")
#:SOMETHING737
The most common use of GENSYM is generating names for items to avoid name clashes in macro expansion.
Another common purpose is the generation of symbols for the construction of graphs, if the only demand you have is to attach a property list to them, while the name of the node is not of interest.
I think, the task of NFA-generation could make good use of the second purpose.
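A rough sketch of that second use, under the assumption that each NFA state is a fresh gensym and its outgoing edges are kept on the symbol's property list. The helper names make-state, add-transition and state-transitions, and the property name transitions, are invented for this example.

```lisp
;; Hypothetical helpers: states are uninterned symbols, edges live on
;; the property list under the indicator TRANSITIONS.
(defun make-state (&optional (prefix "Q"))
  (gensym prefix))

(defun add-transition (from label to)
  "Record an edge FROM --LABEL--> TO and return FROM."
  (push (cons label to) (get from 'transitions))
  from)

(defun state-transitions (state)
  (get state 'transitions))

;; Two states joined by an epsilon move.
(let ((start (make-state))
      (final (make-state)))
  (add-transition start :epsilon final)
  (state-transitions start))
```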
This is a note to some of the other answers, which I think are fine. While gensym is the traditional way of making new symbols, in fact there is another way which works perfectly well and is often better I find: make-symbol:
make-symbol creates and returns a fresh, uninterned symbol whose name is the given name. The new-symbol is neither bound nor fbound and has a null property list.
So, the nice thing about make-symbol is it makes a symbol with the name you asked for, exactly, without any weird numerical suffix. This can be helpful when writing macros because it makes the macroexpansion more readable. Consider this simple list-collection macro:
(defmacro collecting (&body forms)
  (let ((resultsn (make-symbol "RESULTS"))
        (rtailn (make-symbol "RTAIL")))
    `(let ((,resultsn '())
           (,rtailn nil))
       (flet ((collect (it)
                (let ((new (list it)))
                  (if (null ,rtailn)
                      (setf ,resultsn new
                            ,rtailn new)
                      (setf (cdr ,rtailn) new
                            ,rtailn new)))
                it))
         ,@forms
         ,resultsn))))
This needs two bindings which the body can't refer to, for the results, and the last cons of the results. It also introduces a function in a way which is intentionally 'unhygienic': inside collecting, collect means 'collect something'.
So now
> (collecting (collect 1) (collect 2) 3)
(1 2)
as we want, and we can look at the macroexpansion to see that the introduced bindings have names which make some kind of sense:
> (macroexpand '(collecting (collect 1)))
(let ((#:results 'nil) (#:rtail nil))
  (flet ((collect (it)
           (let ((new (list it)))
             (if (null #:rtail)
                 (setf #:results new #:rtail new)
                 (setf (cdr #:rtail) new #:rtail new)))
           it))
    (collect 1)
    #:results))
t
And we can persuade the Lisp printer to tell us that in fact all these uninterned symbols are the same:
> (let ((*print-circle* t))
    (pprint (macroexpand '(collecting (collect 1)))))
(let ((#2=#:results 'nil) (#1=#:rtail nil))
  (flet ((collect (it)
           (let ((new (list it)))
             (if (null #1#)
                 (setf #2# new #1# new)
                 (setf (cdr #1#) new #1# new)))
           it))
    (collect 1)
    #2#))
So, for writing macros I generally find make-symbol more useful than gensym. For writing things where I just need a symbol as an object, such as naming a node in some structure, then gensym is probably more useful. Finally note that gensym can be implemented in terms of make-symbol:
(defun my-gensym (&optional (thing "G"))
  ;; I think this is GENSYM
  (check-type thing (or string (integer 0)))
  (let ((prefix (typecase thing
                  (string thing)
                  (t "G")))
        (count (typecase thing
                 ((integer 0) thing)
                 (t (prog1 *gensym-counter*
                      (incf *gensym-counter*))))))
    (make-symbol (format nil "~A~D" prefix count))))
(This may be buggy.)

Macros That Write Macros - Compile Error

When I compile the following code, SBCL complains that g!-unit-value and g!-unit are undefined. I'm not sure how to debug this. As far as I can tell, flatten is failing.
When flatten reaches the unquoted part of defunits, it seems like the entire part is being treated as an atom. Does that sound correct?
The following uses code from the book Let over Lambda:
Paul Graham Utilities
(defun symb (&rest args)
  (values (intern (apply #'mkstr args))))
(defun mkstr (&rest args)
  (with-output-to-string (s)
    (dolist (a args) (princ a s))))
(defun group (source n)
  (if (zerop n) (error "zero length"))
  (labels ((rec (source acc)
             (let ((rest (nthcdr n source)))
               (if (consp rest)
                   (rec rest (cons (subseq source 0 n) acc))
                   (nreverse (cons source acc))))))
    (if source (rec source nil) nil)))
(defun flatten (x)
  (labels ((rec (x acc)
             (cond ((null x) acc)
                   ((atom x) (cons x acc))
                   (t (rec (car x) (rec (cdr x) acc))))))
    (rec x nil)))
Let Over Lambda Utilities - Chapter 3
(defmacro defmacro/g! (name args &rest body)
  (let ((g!-symbols (remove-duplicates
                     (remove-if-not #'g!-symbol-p
                                    (flatten body)))))
    `(defmacro ,name ,args
       (let ,(mapcar
              (lambda (g!-symbol)
                `(,g!-symbol (gensym ,(subseq
                                       (symbol-name g!-symbol)
                                       2))))
              g!-symbols)
         ,@body))))
(defun g!-symbol-p (symbol-to-test)
  (and (symbolp symbol-to-test)
       (> (length (symbol-name symbol-to-test)) 2)
       (string= (symbol-name symbol-to-test)
                "G!"
                :start1 0
                :end1 2)))
(defmacro defmacro! (name args &rest body)
  (let* ((o!-symbols (remove-if-not #'o!-symbol-p args))
         (g!-symbols (mapcar #'o!-symbol-to-g!-symbol o!-symbols)))
    `(defmacro/g! ,name ,args
       `(let ,(mapcar #'list (list ,@g!-symbols) (list ,@o!-symbols))
          ,(progn ,@body)))))
(defun o!-symbol-p (symbol-to-test)
  (and (symbolp symbol-to-test)
       (> (length (symbol-name symbol-to-test)) 2)
       (string= (symbol-name symbol-to-test)
                "O!"
                :start1 0
                :end1 2)))
(defun o!-symbol-to-g!-symbol (o!-symbol)
  (symb "G!" (subseq (symbol-name o!-symbol) 2)))
Let Over Lambda - Chapter 5
(defun defunits-chaining (u units prev)
  (if (member u prev)
      (error "~{ ~a~^ depends on~}"
             (cons u prev)))
  (let ((spec (find u units :key #'car)))
    (if (null spec)
        (error "Unknown unit ~a" u)
        (let ((chain (second spec)))
          (if (listp chain)
              (* (car chain)
                 (defunits-chaining
                  (second chain)
                  units
                  (cons u prev)))
              chain)))))
(defmacro! defunits (quantity base-unit &rest units)
  `(defmacro ,(symb 'unit-of- quantity)
       (,g!-unit-value ,g!-unit)
     `(* ,,g!-unit-value
         ,(case ,g!-unit
            ((,base-unit) 1)
            ,@(mapcar (lambda (x)
                        `((,(car x))
                          ,(defunits-chaining
                            (car x)
                            (cons
                             `(,base-unit 1)
                             (group units 2))
                            nil)))
                      (group units 2))))))
This is kind of tricky:
Problem: you assume that backquote/comma expressions are plain lists.
You need to ask yourself this question:
What is the representation of a backquote/comma expression?
Is it a list?
Actually the full representation is unspecified. See here: CLHS: Section 2.4.6.1 Notes about Backquote
We are using SBCL. See this:
* (setf *print-pretty* nil)
NIL
* '`(a ,b)
(SB-INT:QUASIQUOTE (A #S(SB-IMPL::COMMA :EXPR B :KIND 0)))
So a comma expression is represented by a structure of type SB-IMPL::COMMA. The SBCL developers thought that this representation helps when such backquote lists need to be printed by the pretty printer.
Since your flatten treats structures as atoms, it won't look inside...
But this is the specific representation of SBCL. Clozure CL does something else and LispWorks again does something else.
Clozure CL:
? '`(a ,b)
(LIST* 'A (LIST B))
LispWorks:
CL-USER 87 > '`(a ,b)
(SYSTEM::BQ-LIST (QUOTE A) B)
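The effect on flatten can be reproduced portably. The sketch below uses an ordinary structure, wrapper, as a stand-in for SBCL's internal comma object (the struct and the variable *tree* are invented for the demonstration, and flatten is repeated from the question so the block is self-contained). The symbol hidden inside the structure never shows up in the flattened list, which is why the g!-symbol filter cannot find it.

```lisp
;; Hypothetical stand-in for an implementation-specific comma object.
(defstruct wrapper expr)

(defun flatten (x)
  (labels ((rec (x acc)
             (cond ((null x) acc)
                   ((atom x) (cons x acc))   ; structures are atoms too
                   (t (rec (car x) (rec (cdr x) acc))))))
    (rec x nil)))

(defparameter *tree*
  (list 'a (make-wrapper :expr 'g!-unit) '(b c)))

;; The result holds A, the wrapper object itself, B and C,
;; but not the symbol G!-UNIT stored inside the wrapper.
(flatten *tree*)
```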
Debugging
Since you found out that somehow flatten was involved, the next debugging steps are:
First: trace the function flatten and see with which data it is called and what it returns.
Since we are not sure what the data actually is, one can INSPECT it.
A debugging example using SBCL:
* (defun flatten (x)
    (inspect x)
    (labels ((rec (x acc)
               (cond ((null x) acc)
                     ((atom x) (cons x acc))
                     (t (rec (car x) (rec (cdr x) acc))))))
      (rec x nil)))
STYLE-WARNING: redefining COMMON-LISP-USER::FLATTEN in DEFUN
FLATTEN
Above calls INSPECT on the argument data. In Common Lisp, the Inspector usually is something where one can interactively inspect data structures.
As an example we are calling flatten with a backquote expression:
* (flatten '`(a ,b))
The object is a proper list of length 2.
0. 0: SB-INT:QUASIQUOTE
1. 1: (A ,B)
We are in the interactive Inspector. The commands now available:
> help
help for INSPECT:
Q, E - Quit the inspector.
<integer> - Inspect the numbered slot.
R - Redisplay current inspected object.
U - Move upward/backward to previous inspected object.
?, H, Help - Show this help.
<other> - Evaluate the input as an expression.
Within the inspector, the special variable SB-EXT:*INSPECTED* is bound
to the current inspected object, so that it can be referred to in
evaluated expressions.
So the command 1 walks into the data structure, here a list.
> 1
The object is a proper list of length 2.
0. 0: A
1. 1: ,B
Walk in further:
> 1
The object is a STRUCTURE-OBJECT of type SB-IMPL::COMMA.
0. EXPR: B
1. KIND: 0
Here the Inspector tells us that the object is a structure of a certain type. That's what we wanted to know.
We now leave the Inspector using the command q and the flatten function continues and returns a value:
> q
(SB-INT:QUASIQUOTE A ,B)
For anyone else who is trying to get defmacro! to work on SBCL, a temporary solution to this problem is to grope inside the unquote structure during the flatten procedure and recursively flatten its contents:
(defun flatten (x)
  (labels ((flatten-recursively (x flattening-list)
             (cond ((null x) flattening-list)
                   ((eq (type-of x) 'SB-IMPL::COMMA) (flatten-recursively (sb-impl::comma-expr x) flattening-list))
                   ((atom x) (cons x flattening-list))
                   (t (flatten-recursively (car x) (flatten-recursively (cdr x) flattening-list))))))
    (flatten-recursively x nil)))
But this is horribly platform dependent. If I find a better way, I'll post it.
In case anyone's still interested in this one, here are my three cents. My objection to the above modification of flatten is that flatten might be more naturally useful as it was originally, while the problem with representations of unquote is rather endemic to defmacro/g!. I came up with a not-too-pretty modification of defmacro/g! using feature expressions to decide what to do. Namely, when dealing with non-SBCL implementations (#-sbcl) we proceed as before, while in the case of SBCL (#+sbcl) we dig into the sb-impl::comma structure, use its expr slot when necessary and use equalp in remove-duplicates, as we are now dealing with structures, not symbols. Here's the code:
(defmacro defmacro/g! (name args &rest body)
  (let ((syms (remove-duplicates
               (remove-if-not #-sbcl #'g!-symbol-p
                              #+sbcl #'(lambda (s)
                                         (and (sb-impl::comma-p s)
                                              (g!-symbol-p (sb-impl::comma-expr s))))
                              (flatten body))
               :test #-sbcl #'eql #+sbcl #'equalp)))
    `(defmacro ,name ,args
       (let ,(mapcar
              (lambda (s)
                `(#-sbcl ,s #+sbcl ,(sb-impl::comma-expr s)
                  (gensym ,(subseq
                            #-sbcl
                            (symbol-name s)
                            #+sbcl
                            (symbol-name (sb-impl::comma-expr s))
                            2))))
              syms)
         ,@body))))
It works with SBCL. I have yet to test it thoroughly on other implementations.

expanding a parameter list in a common lisp macro

I'm trying to teach myself common lisp, and as an exercise in macro-writing, I'm trying to create a macro to define a nested-do loop of arbitrary depth. I'm working with sbcl, using emacs and slime.
To start, I wrote this double-loop macro:
(defmacro nested-do-2 (ii jj start end &body body)
  `(do ((,ii ,start (1+ ,ii)))
       ((> ,ii ,end))
     (do ((,jj ,ii (1+ ,jj)))
         ((> ,jj ,end))
       ,@body)))
which I could then use as follows:
(nested-do-2 ii jj 10 20 (print (+ ii jj)))
BTW, I originally wrote this macro using gensym to generate the loop counters (ii, jj), but then I realized that the macro was pretty useless if I couldn't access the counters in the body.
Anyway, I would like to generalize the macro to create a nested-do loop that would be nested to an arbitrary level. This is what I've got so far, but it doesn't quite work:
(defmacro nested-do ((&rest indices) start end &body body)
  `(dolist ((index ,indices))
     (do ((index ,start (1+ index)))
         ((> index ,end))
       (if (eql index (elt ,indices (elt (reverse ,indices) 0)))
           ,@body))))
which I would like to invoke as follows:
(nested-do (ii jj kk) 10 15 (print (+ ii jj kk)))
However, the list is not being expanded properly, and I end up in the debugger with this error:
error while parsing arguments to DEFMACRO DOLIST:
invalid number of elements in
((INDEX (II JJ KK)))
And in case it's not obvious, the point of the embedded if statement is to execute the body only in the innermost loop. That doesn't seem terribly elegant to me, and it's not really tested (since I haven't been able to expand the parameter list yet), but it's not really the point of this question.
How can I expand the list properly within the macro? Is the problem in the macro syntax, or in the expression of the list in the function call? Any other comments will also be appreciated.
Thanks in advance.
Here's one way to do it: build the structure from the bottom (the loop body) up, wrapping it in one do loop per index:
(defmacro nested-do ((&rest indices) start end &body body)
  (let ((rez `(progn ,@body)))
    (dolist (index (reverse indices) rez)
      (setf rez
            `(do ((,index ,start (1+ ,index)))
                 ((> ,index ,end))
               ,rez)))))
[Aside from the down votes, this actually works and it is beautiful too!]
Just to clearly illustrate the recursive nature of the macro definition, here is a Scheme implementation:
(define-syntax nested-do
  (syntax-rules ()
    ((_ ((index start end)) body)
     (do ((index start (+ 1 index)))
         ((= index end))
       body))
    ((_ ((index start end) rest ...) body)
     (do ((index start (+ 1 index)))
         ((= index end))
       (nested-do (rest ...) body)))))
Using the above, as a template, something like this gets it done:
(defmacro nested-do ((&rest indices) start end &body body)
  (let ((index (car indices)))
    `(do ((,index ,start (1+ ,index)))
         ((> ,index ,end))
       ,(if (null (cdr indices))
            `(progn ,@body)
            `(nested-do (,@(cdr indices)) ,start ,end ,@body)))))
* (nested-do (i j) 0 2 (print (list i j)))
(0 0)
(0 1)
(0 2)
(1 0)
(1 1)
(1 2)
(2 0)
(2 1)
(2 2)
NIL
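Note that, as with Common Lisp macros in general, you should use the gensym pattern wherever the expansion introduces variables of its own, to avoid variable capture. In the macro above the start and end forms are also spliced in several times, so they are evaluated more than once.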
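One possible way to apply the gensym pattern here, as a sketch rather than a definitive version: the start and end forms are bound once to generated variables outside all the loops, and the loops are then built from the innermost outward. The name nested-do-once is invented to keep it apart from the definitions above.

```lisp
;; Hypothetical variant: START and END are each evaluated exactly once.
(defmacro nested-do-once ((&rest indices) start end &body body)
  (let ((start-var (gensym "START"))
        (end-var (gensym "END"))
        (form `(progn ,@body)))
    ;; Wrap the body in one DO per index, innermost index first.
    (dolist (index (reverse indices))
      (setf form `(do ((,index ,start-var (1+ ,index)))
                      ((> ,index ,end-var))
                    ,form)))
    `(let ((,start-var ,start)
           (,end-var ,end))
       ,form)))

;; Usage: the index variables stay visible to the body.
(nested-do-once (i j) 0 2
  (print (list i j)))
```

The index variables are deliberately not gensyms, since the body is meant to refer to them.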

What is wrong with the following Common Lisp macro using gensym?

Learning Common Lisp (using GNU CLISP 2.43) .. so might be a noob mistake. Example is the 'print prime numbers between x and y'
(defun is-prime (n)
  (if (< n 2) (return-from is-prime NIL))
  (do ((i 2 (1+ i)))
      ((= i n) T)
    (if (= (mod n i) 0)
        (return NIL))))
(defun next-prime-after (n)
  (do ((i (1+ n) (1+ i)))
      ((is-prime i) i)))
(defmacro do-primes-v2 ((var start end) &body body)
  `(do ((,var (if (is-prime ,start)
                  ,start
                  (next-prime-after ,start))
              (next-prime-after ,var)))
       ((> ,var ,end))
     ,@body))
(defmacro do-primes-v3 ((var start end) &body body)
  (let ((loop-start (gensym))
        (loop-end (gensym)))
    `(do ((,loop-start ,start)
          (,loop-end ,end)
          (,var (if (is-prime ,loop-start)
                    ,loop-start
                    (next-prime-after ,loop-start))
                (next-prime-after ,var)))
         ((> ,var ,loop-end))
       ,@body)))
do-primes-v2 works perfectly.
[13]> (do-primes-v2 (p 10 25) (format t "~d " p))
11 13 17 19 23
Next I tried using gensym to avoid naming clashes in macro expansion - do-primes-v3. However I'm stuck with a
*** - EVAL: variable #:G3498 has no value
Tried using macro-expand to see if i could spot the mistake but I can't.
[16]> (macroexpand-1 `(do-primes-v3 (p 10 25) (format t "~d " p)))
(DO
((#:G3502 10) (#:G3503 25)
(P (IF (IS-PRIME #:G3502) #:G3502 (NEXT-PRIME-AFTER #:G3502))
(NEXT-PRIME-AFTER P)))
((> P #:G3503)) (FORMAT T "~d " P)) ;
Use DO* instead of DO.
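DO evaluates all the init forms before any of the variables are bound, so an init form cannot see the other loop variables (like LET). DO* binds the variables one after another, so each init form can see the variables bound before it (like LET*).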
In this particular case var needs to reference the other binding loop-start.
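A sketch of the corrected macro, assuming the is-prime and next-prime-after helpers from the question (repeated here so the block is self-contained). The name do-primes-v4 and the gensym prefixes are made up; the only substantive change from do-primes-v3 is DO* in place of DO.

```lisp
(defun is-prime (n)
  (if (< n 2) (return-from is-prime nil))
  (do ((i 2 (1+ i)))
      ((= i n) t)
    (if (= (mod n i) 0)
        (return nil))))

(defun next-prime-after (n)
  (do ((i (1+ n) (1+ i)))
      ((is-prime i) i)))

;; Hypothetical name; same as do-primes-v3 except for DO*.
(defmacro do-primes-v4 ((var start end) &body body)
  (let ((loop-start (gensym "START"))
        (loop-end (gensym "END")))
    ;; DO* binds sequentially, so VAR's init form can see LOOP-START.
    `(do* ((,loop-start ,start)
           (,loop-end ,end)
           (,var (if (is-prime ,loop-start)
                     ,loop-start
                     (next-prime-after ,loop-start))
                 (next-prime-after ,var)))
          ((> ,var ,loop-end))
       ,@body)))

(do-primes-v4 (p 10 25)
  (format t "~d " p))
```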
You don't actually need the gensym here for avoiding variable capture, because you do not introduce any variables that would be "local to the macro". When you macroexpand your do-primes-v2, you will see that no variable is introduced that didn't exist outside of the macro.
You do need it for a different thing, though: avoiding multiple evaluation.
If you call the macro like this:
(do-primes-v2 (p (* x 2) (* y 3))
  (format t "~a~%" p))
it expands to
(do ((p (if (is-prime (* x 2))
            (* x 2)
            (next-prime-after (* x 2)))
        (next-prime-after p)))
    ((> p (* y 3)))
  (format t "~a~%" p))
At best, this is inefficient, because those multiplications are done multiple times. However, if you use a function with side effects as inputs, like setf or incf, this can be a big problem.
Either move the binding of your loop-start and loop-end to an enclosing LET block or use DO*. The reason is that all loop variables in DO are bound "in parallel", so for the first binding, the (expanded) loop-start variable does not yet have a binding.
I know this doesn't really answer your question, but I do think it is relevant. In my experience, the type of macro you are attempting to write is a very common one. One problem I have with the way you have approached the problem is that it doesn't handle another common use case: functional composition.
I don't have the time to highlight some of the difficulties you will probably encounter using your macro, I will however highlight that, had you built your prime iterator geared towards functional composition, your macro turns out to be extremely simple, avoiding your question altogether.
Note: I have slightly modified some of your functions.
(defun is-prime (n)
  (cond
    ((< n 2)
     nil)
    ((= n 2)
     t)
    ((evenp n)
     nil)
    (t
     (do ((i 2 (1+ i)))
         ((= i n) t)
       (when (= (mod n i) 0)
         (return nil))))))
(defun next-prime (n)
  (do ((i n (1+ i)))
      ((is-prime i) i)))
(defun prime-iterator (start-at)
  (let ((current start-at))
    (lambda ()
      (let ((next-prime (next-prime current)))
        (setf current (1+ next-prime))
        next-prime))))
(defun map-primes/iterator (fn iterator end)
  (do ((i (funcall iterator) (funcall iterator)))
      ((>= i end) nil)
    (funcall fn i)))
(defun map-primes (fn start end)
  (let ((iterator (prime-iterator start)))
    (map-primes/iterator fn iterator end)))
(defmacro do-primes ((var start end) &body body)
  `(map-primes #'(lambda (,var)
                   ,@body)
               ,start ,end))
I too recommend that you look at Series. The generator pattern is also a very common occurrence in lisp programs. You may also want to look at Alexandria, in particular the function ALEXANDRIA:COMPOSE to see what cool stuff you can do with functional composition.
I suggest avoiding DO/DO* and macros altogether and instead going for Series (an implementation of which can be found on series.sourceforge.net).
If that's too complex then consider just generating a list of primes with recursion or a generator (for on-demand generation).

How do I memoize a recursive function in Lisp?

I'm a Lisp beginner. I'm trying to memoize a recursive function for calculating the number of terms in a Collatz sequence (for problem 14 in Project Euler). My code as of yet is:
(defun collatz-steps (n)
  (if (= 1 n) 0
      (if (evenp n)
          (1+ (collatz-steps (/ n 2)))
          (1+ (collatz-steps (1+ (* 3 n)))))))
(defun p14 ()
  (defvar m-collatz-steps (memoize #'collatz-steps))
  (let
      ((maxsteps (funcall m-collatz-steps 2))
       (n 2)
       (steps))
    (loop for i from 1 to 1000000
          do
             (setq steps (funcall m-collatz-steps i))
             (cond
               ((> steps maxsteps)
                (setq maxsteps steps)
                (setq n i))
               (t ())))
    n))
(defun memoize (fn)
  (let ((cache (make-hash-table :test #'equal)))
    #'(lambda (&rest args)
        (multiple-value-bind
              (result exists)
            (gethash args cache)
          (if exists
              result
              (setf (gethash args cache)
                    (apply fn args)))))))
The memoize function is the same as the one given in the On Lisp book.
This code doesn't actually give any speedup compared to the non-memoized version. I believe it's due to the recursive calls calling the non-memoized version of the function, which sort of defeats the purpose. In that case, what is the correct way to do the memoization here? Is there any way to have all calls to the original function call the memoized version itself, removing the need for the special m-collatz-steps symbol?
EDIT: Corrected the code to have
(defvar m-collatz-steps (memoize #'collatz-steps))
which is what I had in my code.
Before the edit I had erroneously put:
(defvar collatz-steps (memoize #'collatz-steps))
Seeing that error gave me another idea, and I tried using this last defvar itself and changing the recursive calls to
(1+ (funcall collatz-steps (/ n 2)))
(1+ (funcall collatz-steps (1+ (* 3 n))))
This does seem to perform the memoization (speedup from about 60 seconds to 1.5 seconds), but requires changing the original function. Is there a cleaner solution which doesn't involve changing the original function?
I assume you're using Common-Lisp, which has separate namespaces for variable and function names. In order to memoize the function named by a symbol, you need to change its function binding, through the accessor `fdefinition':
(setf (fdefinition 'collatz-steps) (memoize #'collatz-steps))
(defun p14 ()
  (let ((mx 0) (my 0))
    (loop for x from 1 to 1000000
          for y = (collatz-steps x)
          when (< my y) do (setf my y mx x))
    mx))
Here is a memoize function that rebinds the symbol function:
(defun memoize-function (function-name)
  (setf (symbol-function function-name)
        ;; Capture the original definition before it is replaced.
        (let ((fn (symbol-function function-name))
              (cache (make-hash-table :test #'equal)))
          #'(lambda (&rest args)
              (multiple-value-bind
                    (result exists)
                  (gethash args cache)
                (if exists
                    result
                    (setf (gethash args cache)
                          (apply fn args))))))))
You would then do something like this:
(defun collatz-steps (n)
  (if (= 1 n) 0
      (if (evenp n)
          (1+ (collatz-steps (/ n 2)))
          (1+ (collatz-steps (1+ (* 3 n)))))))
(memoize-function 'collatz-steps)
I'll leave it up to you to make an unmemoize-function.
something like this:
(setf (fdefinition 'collatz-steps)
      (memoize (lambda (n)
                 (if (= 1 n) 0
                     (if (evenp n)
                         (1+ (collatz-steps (/ n 2)))
                         (1+ (collatz-steps (1+ (* 3 n)))))))))
IOW: your original (non-memoized) function is anonymous, and you only give a name to the result of memoizing it.
Note a few things:
(defun foo (bar)
  ... (foo 3) ...)
Above is a function that has a call to itself.
In Common Lisp the file compiler can assume that FOO does not change. It will NOT call an updated FOO later. If you change the function binding of FOO, then the call of the original function will still go to the old function.
So memoizing a self recursive function will NOT work in the general case. Especially not if you are using a good compiler.
You can work around it by always going through the symbol, for example: (funcall 'foo 3)
(DEFVAR ...) is a top-level form. Don't use it inside functions. If you have declared a variable, set it with SETQ or SETF later.
For your problem, I'd just use a hash table to store the intermediate results.
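Another workaround, sketched here under the assumption that the function is defined at top level: declaring it notinline tells the compiler not to assume that a self-call refers to the definition being compiled, so the recursive calls go through the global definition and see the memoized replacement. The names memoize-into and *collatz-cache* are invented for this example; the cache is passed in only so that it can be inspected afterwards.

```lisp
;; NOTINLINE makes self-calls go through the global function definition.
(declaim (notinline collatz-steps))

(defun collatz-steps (n)
  (if (= 1 n)
      0
      (if (evenp n)
          (1+ (collatz-steps (/ n 2)))
          (1+ (collatz-steps (1+ (* 3 n)))))))

;; Hypothetical global table, kept outside the closure for inspection.
(defvar *collatz-cache* (make-hash-table :test #'equal))

(defun memoize-into (fn cache)
  "Return a function that caches the results of FN in CACHE, keyed by argument list."
  (lambda (&rest args)
    (multiple-value-bind (result exists) (gethash args cache)
      (if exists
          result
          (setf (gethash args cache) (apply fn args))))))

;; Replace the global definition; recursive calls now hit the cache.
(setf (fdefinition 'collatz-steps)
      (memoize-into #'collatz-steps *collatz-cache*))

(collatz-steps 27)
```

After the call, the table also holds entries for the intermediate values of the sequence, which is the sign that the inner calls went through the memoized function.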
Changing the "original" function is necessary, because, as you say, there's no other way for the recursive call(s) to be updated to call the memoized version.
Fortunately, the way lisp works is to find the function by name each time it needs to be called. This means that it is sufficient to replace the function binding with the memoized version of the function, so that recursive calls will automatically look up and reenter through the memoization.
huaiyuan's code shows the key step:
(setf (fdefinition 'collatz-steps) (memoize #'collatz-steps))
This trick also works in Perl. In a language like C, however, a memoized version of a function must be coded separately.
Some lisp implementations provide a system called "advice", which provides a standardized structure for replacing functions with enhanced versions of themselves. In addition to functional upgrades like memoization, this can be extremely useful in debugging by inserting debug prints (or completely stopping and giving a continuable prompt) without modifying the original code.
This function is exactly the one Peter Norvig gives as an example of a function that seems like a good candidate for memoization, but which is not.
See figure 3 (the function 'Hailstone') of his original paper on memoization ("Using Automatic Memoization as a Software Engineering Tool in Real-World AI Systems").
So I'm guessing, even if you get the mechanics of memoization working, it won't really speed it up in this case.
A while ago I wrote a little memoization routine for Scheme that used a chain of closures to keep track of the memoized state:
(define (memoize op)
  (letrec ((get (lambda (key) (list #f)))
           (set (lambda (key item)
                  (let ((old-get get))
                    (set! get (lambda (new-key)
                                (if (equal? key new-key) (cons #t item)
                                    (old-get new-key))))))))
    (lambda args
      (let ((ans (get args)))
        (if (car ans) (cdr ans)
            (let ((new-ans (apply op args)))
              (set args new-ans)
              new-ans))))))
This needs to be used like so:
(define fib (memoize (lambda (x)
                       (if (< x 2) x
                           (+ (fib (- x 1)) (fib (- x 2)))))))
I'm sure that this can be ported to your favorite lexically scoped Lisp flavor with ease.
I'd probably do something like:
(let ((memo (make-hash-table :test #'equal)))
  (defun collatz-steps (n)
    (or (gethash n memo)
        (setf (gethash n memo)
              (cond ((= n 1) 0)
                    ((oddp n) (1+ (collatz-steps (+ 1 n n n))))
                    (t (1+ (collatz-steps (/ n 2)))))))))
It's not Nice and Functional, but, then, it's not much hassle and it does work. Downside is that you don't get a handy unmemoized version to test with and clearing the cache is bordering on "very difficult".