Why isn't let* the default let? - emacs

As probably all experienced elispers have found at some point, code like is broken:
(let ((a 3)
(b 4)
(c (+ a b)))
c)
One should use the let* form instead when referring to a just-binded variable within the binding clauses.
I just wonder - why is a seemingly wrong behavior the default? Are there any risks on choosing always let* regardless of how is one gonna use it?

Any let* forms can be easily (and mechanically) expressed using let. For example, if we didn't have the let* special form, we could express
(let* ((a 3)
(b 4)
(c (+ a b)))
c)
as:
(let ((a 3))
(let ((b 4))
(let ((c (+ a b)))
c)))
On the other hand, some let forms are quite difficult, if possible at all, to express using let*. The following let form cannot be expressed using let* without introducing an additional temporary variable or esoteric bitwise manipulation.
(let ((x y) (y x)) ; swap x and y
...)
So, in this sense, I think let is more fundamental than let*.

I was taught that the reason is mainly historical, but it might still hold: Since let does parallel assigment, having no data dependencies between the variables might give the compiler flexibility compile for speed or space and compile to something much faster than let*. I don't know how much the elisp compiler uses this.
A bit of googling reveals a similar question for Common Lisp: LET versus LET* in Common Lisp

it is incorrect to do so
(let ((a 10)
(b a)) <- error
b)
it is correct to do so:
(let* ((a 10)
(b a))
b) <- will evaluate to 10.
The error in the first case is due to semantics of let.
let* adds new symbols to the existing environment, evaluating them inside it.
let creates a new environment, evaluating the new symbols in the current envinronemnt, and the new environment will be disponible at the end of the evaluation of all new symbols, to evaluate the code of (let () code ).
let* is syntactic sugar for
(let ((a xxx))
(let ((b a)) ...)
while let is syntactic sugar for
((lambda (a)
...)
val_a)
The second form is much more common, so perhaps they thought to give it a shorter name...

I'm not sure I would call let broken. For instance, you may for whatever reason want to shadow a and b in the enclosed environment, while defining c with respect to the enclosing expression.
(let ((a 11)
(b 12))
(let ((a 3)
(b 4)
(c (+ a b)))
;; do stuff here will shadowed `a' and `b'
c))
> 23
As for risks, I don't know if there are any per se. But why created multiple nested environments when you don't have to? There may be a performance penalty for doing so (I don't have enough experience with elisp to say).

Related

Common Lisp: Any way to avoid defvar or defparameter?

I'm using SBCL 2.0.1.debian and Paul Graham's ANSI Common Lisp to learn Lisp.
Right in Chapter 2 though, I'm realizing that I cannot use setf like the author can! A little googling and I learn that I must use defvar or defparameter to 'introduce' my globals before I can set them with setq!
Is there any way to avoid having to introduce globals via the defvar or defparameter, either from inside SBCL or from outside via switches? Do other Lisp's too mandate this?
I understand their value-add in large codebases but right now I'm just learning by writing smallish programs, and so am finding them cumbersome. I'm used to using globals in other languages, so don't necessarily mind global- / local-variable bugs.
If you are writing programs, in the sense of things which have some persistent existence in files, then use the def* forms. Top-level setf / setq of an undefined variable has undefined semantics in CL and, even worse, has differing semantics across implementations. Typing defvar, defparameter or defconstant is not much harder than typing setf or setq and means your programs will have defined semantics, which is usually considered a good thing. So no, for programs there is no way to avoid using the def* forms, or some equivalent thereof.
If you are simply typing things at a REPL / listener to play with things, then I think just using setf at top-level is fine (no-one uses environments where things typed at the REPL are really persistent any more I think).
You say you are used to using globals in other languages. Depending on what those other languages are this quite probably means you're not used to CL's semantics for bindings defined with def* forms, which are not only global, but globally special, or globally dynamic. I don't know which other languages even have CL's special / lexical distinction, but I suspect that not that many do. For instance consider this Python (3) program:
x = 1
def foo():
x = 0
print(x)
def bar():
nonlocal x
x += 1
return x
return bar
y = foo()
print(y())
print(y())
print(x)
If you run this it will print
0
1
2
1
So consider the 'equivalent' CL program (note: never write a program with variables named like this):
(defvar x 1)
(defun foo ()
(let ((x 0))
(print x)
(lambda ()
(incf x))))
(let ((y (foo)))
(print (funcall y))
(print (funcall y))
(print x)
(values))
If you run this it will print
0
2
3
3
Which I think you can agree is very different from what the Python program printed.
That's because top-level variables in CL are special, or dynamic variables: in the above CL program x is dynamically scoped because x has been declared globally special by the defvar, and this means that the calls to the function returned by foo are altering the global value of x.
Python doesn't have dynamic bindings at all (but you can write Python modules which will simulate them as functions which access an explicit binding stack, with a bit of deviousness). Something like this is also how other languages expose dynamic bindings. For instance this is how Racket does it:
(define d (make-parameter 1))
(define (foo)
(parameterize ([d 0])
(display (d))
(λ ()
(d (+ (d) 1))
(d))))
(let ([y (foo)])
(display (y))
(display (y))
(display (d)))
will print
0
2
3
3
While this program
(define x 1)
(define (foo)
(let ([x 0])
(displayln x)
(λ ()
(set! x (+ x 1))
x)))
(let ([y (foo)])
(displayln (y))
(displayln (y))
(displayln x))
will print
0
1
2
1
as the Python one does.
If it wasn't for CL's compatibility requirements this is perhaps how CL should have done this too (that's obviously a personal opinion, and I'm happy with how CL does do it).
CL doesn't have top-level (global) lexical bindings at all (but you can write a CL program which will simulate them in a pretty convincing way using symbol macros). If you want such things I suspect there are already some on the internet or I could get on and tidy up my implementation. As an example of a program which uses such a thing:
(defglex x 1)
(defun foo ()
(let ((x 0))
(print x)
(lambda ()
(incf x))))
(let ((y (foo)))
(print (funcall y))
(print (funcall y))
(print x))
will print
0
1
2
1
CL has both lexical and dynamic (special) nonglobal bindings.
As a note: the fact that things defined with the def* forms are globally special / dynamic, which means that all bindings of them are dynamic, is why you should always distinguish the names of such variables, using the * convention: (defvar *my-var* ...) and never (defvar my-var ...). ((defconstant my-constant ...) is OK however.)

What is the difference between define vs define-syntax and let vs let-syntax when value is syntax-rules?

I'm in a process of implementing Hygienic macros in my Scheme implementation, I've just implemented syntax-rules, but I have this code:
(define odd?
(syntax-rules ()
((_ x) (not (even? x)))))
what should be the difference between that and this:
(define-syntax odd?
(syntax-rules ()
((_ x) (not (even? x)))))
from what I understand syntax-rules just return syntax transformer, why you can't just use define to assign that to symbol? Why I need to use define-syntax? What extra stuff that expression do?
Should first also work in scheme? Or only the second one?
Also what is the difference between let vs let-syntax and letrec vs letrec-syntax. Should (define|let|letrec)-syntax just typecheck if the value is syntax transformer?
EDIT:
I have this implementation, still using lisp macros:
;; -----------------------------------------------------------------------------
(define-macro (let-syntax vars . body)
`(let ,vars
,#(map (lambda (rule)
`(typecheck "let-syntax" ,(car rule) "syntax"))
vars)
,#body))
;; -----------------------------------------------------------------------------
(define-macro (letrec-syntax vars . body)
`(letrec ,vars
,#(map (lambda (rule)
`(typecheck "letrec-syntax" ,(car rule) "syntax"))
vars)
,#body))
;; -----------------------------------------------------------------------------
(define-macro (define-syntax name expr)
(let ((expr-name (gensym)))
`(define ,name
(let ((,expr-name ,expr))
(typecheck "define-syntax" ,expr-name "syntax")
,expr-name))))
This this code correct?
Should this code works?
(let ((let (lambda (x) x)))
(let-syntax ((odd? (syntax-rules ()
((_ x) (not (even? x))))))
(odd? 11)))
This question seems to imply some deep confusion about macros.
Let's imagine a language where syntax-rules returns some syntax transformer function (I am not sure this has to be true in RnRS Scheme, it is true in Racket I think), and where let and let-syntax were the same.
So let's write this function:
(define (f v)
(let ([g v])
(g e (i 10)
(if (= i 0)
i
(e (- i 1))))))
Which we can turn into this, of course:
(define (f v n)
(v e (i n)
(if (<= i 0)
i
(e (- i 1)))))
And I will tell you in addition that there is no binding for e or i in the environment.
What is the interpreter meant to do with this definition? Could it compile it? Could it safely infer that i can't possibly make any sense since it is used as a function and then as a number? Can it safely do anything at all?
The answer is that no, it can't. Until it knows what the argument to the function is it can't do anything. And this means that each time f is called it has to make that decision again. In particular, v might be:
(syntax-rules ()
[(_ name (var init) form ...)
(letrec ([name (λ (var)
form ...)])
(name init))]))
Under which the definition of f does make some kind of sense.
And things get worse: much worse. How about this?
(define (f v1 v2 n)
(let ([v v1])
(v e (i n)
...
(set! v (if (eq? v v1) v2 v1))
...)))
What this means is that a system like this wouldn't know what the code it was meant to interpret meant until, the moment it was interpreting it, or even after that point, as you can see from the second function above.
So instead of this horror, Lisps do something sane: they divide the process of evaluating bits of code into phases where each phase happens, conceptually, before the next one.
Here's a sequence for some imagined Lisp (this is kind of close to what CL does, since most of my knowledge is of that, but it is not intended to represent any particular system):
there's a phase where the code is turned from some sequence of characters to some object, possibly with the assistance of user-defined code;
there's a phase where that object is rewritten into some other object by user- and system-defined code (macros) – the result of this phase is something which is expressed in terms of functions and some small number of primitive special things, traditionally called 'special forms' which are known to the processes of stage 3 and 4;
there may be a phase where the object from phase 2 is compiled, and that phase may involve another set of user-defined macros (compiler macros);
there is a phase where the resulting code is evaluated.
And for each unit of code these phases happen in order, each phase completes before the next one begins.
This means that each phase in which the user can intervene needs its own set of defining and binding forms: it needs to be possible to say that 'this thing controls what happens at phase 2' for instance.
That's what define-syntax, let-syntax &c do: they say that 'these bindings and definitions control what happens at phase 2'. You can't, for instance, use define or let to do that, because at phase 2, these operations don't yet have meaning: they gain meaning (possibly by themselves being macros which expand to some primitive thing) only at phase 3. At phase 2 they are just bits of syntax which the macro is ingesting and spitting out.

Can destructuring-setq be defined using destructuring-bind?

There is destructuring-bind but it seems there is no destructuring-setq. Is it possible to define it using destructuring-bind?
(let (a b c d)
(destructuring-setq ((a b) (c d)) '((1 2) (3 4)))
`(,b ,d))
(destructuring-bind
((a b) (c d)) '((1 2) (3 4))
`(,b ,d))
This would be a highly nontrivial endeavor.
What you would have to do is write a lambda-list analyzer which would
Find all variables to be bound
Replace them with gensyms (or use copy-symbol for total unreadability of the macroexpansion :-) and keep a map from the old symbols to the new ones.
Return something like
(destructuring-bind (new-lambda-list)
expression
(setq old-var-1 new-gensym-1 ...))
The analyser is present in any Common Lisp implementation (see, e.g., the link above) and it is not simple.
I suggest that you ask yourself whether destructuring-bind is really not enough.

SICP: Can or be defined in lisp as a syntactic transformation without gensym?

I am trying to solve the last part of question 4.4 of the Structure and Interpretation of computer programming; the task is to implement or as a syntactic transformation. Only elementary syntactic forms are defined; quote, if, begin, cond, define, apply and lambda.
(or a b ... c) is equal to the first true value or false if no value is true.
The way I want to approach it is to transform for example (or a b c) into
(if a a (if b b (if c c false)))
the problem with this is that a, b, and c would be evaluated twice, which could give incorrect results if any of them had side-effects. So I want something like a let
(let ((syma a))
(if syma syma (let ((symb b))
(if symb symb (let ((symc c))
(if (symc symc false)) )) )) )
and this in turn could be implemented via lambda as in Exercise 4.6. The problem now is determining symbols syma, symb and symc; if for example the expression b contains a reference to the variable syma, then the let will destroy the binding. Thus we must have that syma is a symbol not in b or c.
Now we hit a snag; the only way I can see out of this hole is to have symbols that cannot have been in any expression passed to eval. (This includes symbols that might have been passed in by other syntactic transformations).
However because I don't have direct access to the environment at the expression I'm not sure if there is any reasonable way of producing such symbols; I think Common Lisp has the function gensym for this purpose (which would mean sticking state in the metacircular interpreter, endangering any concurrent use).
Am I missing something? Is there a way to implement or without using gensym? I know that Scheme has it's own hygenic macro system, but I haven't grokked how it works and I'm not sure whether it's got a gensym underneath.
I think what you might want to do here is to transform to a syntactic expansion where the evaluation of the various forms aren't nested. You could do this, e.g., by wrapping each form as a lambda function and then the approach that you're using is fine. E.g., you can do turn something like
(or a b c)
into
(let ((l1 (lambda () a))
(l2 (lambda () b))
(l3 (lambda () c)))
(let ((v1 (l1)))
(if v1 v1
(let ((v2 (l2)))
(if v2 v2
(let ((v3 (l3)))
(if v3 v3
false)))))))
(Actually, the evaluation of the lambda function calls are still nested in the ifs and lets, but the definition of the lambda functions are in a location such that calling them in the nested ifs and lets doesn't cause any difficulty with captured bindings.) This doesn't address the issue of how you get the variables l1–l3 and v1–v3, but that doesn't matter so much, none of them are in scope for the bodies of the lambda functions, so you don't need to worry about whether they appear in the body or not. In fact, you can use the same variable for all the results:
(let ((l1 (lambda () a))
(l2 (lambda () b))
(l3 (lambda () c)))
(let ((v (l1)))
(if v v
(let ((v (l2)))
(if v v
(let ((v (l3)))
(if v v
false)))))))
At this point, you're really just doing loop unrolling of a more general form like:
(define (functional-or . functions)
(if (null? functions)
false
(let ((v ((first functions))))
(if v v
(functional-or (rest functions))))))
and the expansion of (or a b c) is simply
(functional-or (lambda () a) (lambda () b) (lambda () c))
This approach is also used in an answer to Why (apply and '(1 2 3)) doesn't work while (and 1 2 3) works in R5RS?. And none of this required any GENSYMing!
In SICP you are given two ways of implementing or. One that handles them as special forms which is trivial and one as derived expressions. I'm unsure if they actually thought you would see this as a problem, but you can do it by implementing gensym or altering variable? and how you make derived variables like this:
;; a unique tag to identify special variables
(define id (vector 'id))
;; a way to make such variable
(define (make-var x)
(list id x))
;; redefine variable? to handle macro-variables
(define (variable? exp)
(or (symbol? exp)
(tagged-list? exp id)))
;; makes combinations so that you don't evaluate
;; every part twice in case of side effects (set!)
(define (or->combination terms)
(if (null? terms)
'false
(let ((tmp (make-var 'tmp)))
(list (make-lambda (list tmp)
(list (make-if tmp
tmp
(or->combination (cdr terms)))))
(car terms)))))
;; My original version
;; This might not be good since it uses backquotes not introduced
;; until chapter 5 and uses features from exercise 4.6
;; Though, might be easier to read for some so I'll leave it.
(define (or->combination terms)
(if (null? terms)
'false
(let ((tmp (make-var 'tmp)))
`(let ((,tmp ,(car terms)))
(if ,tmp
,tmp
,(or->combination (cdr terms)))))))
How it works is that make-var creates a new list every time it is called, even with the same argument. Since it has id as it's first element variable? will identify it as a variable. Since it's a list it will only match in variable lookup with eq? if it is the same list, so several nested or->combination tmp-vars will all be seen as different by lookup-variable-value since (eq? (list) (list)) => #f and special variables being lists they will never shadow any symbol in code.
This is influenced by eiod, by Al Petrofsky, which implements syntax-rules in a similar manner. Unless you look at others implementations as spoilers you should give it a read.

Implenting simultaneous bindings

I'm writing a Lisp (code at GitHub) and I want to implement local bindings. Currently I have two syntaxes:
(let <var> <val> <expr>)
for binding a single variable or function, and
(with (<var1> <val1> ... <varN> <valN>) <expr>)
to bind multiple values at once.
At present, the bindings are evaluated sequentially, and each new function binding retains a copy of the environment it was defined in, so <var2> can refer to <var1> but not vice-versa.
I would like to modify the code so that when binding multiple values at once you effectively have simultaneous binding. For example, I would like to be able to write (this is a trivial example, but it should illustrate the idea):
(define (h y)
(with ((f x) (if (eq? x 0) #t (g (- x 1)))
(g x) (if (eq? x 0) #f (f (- x 1))))
(f y))
At the moment this code doesn't run - g closes over f, but not the other way around.
Is there a canonical way to implement simultaneous binding in Lisp?
In SICP there's a section on internal definitions which covers this subject. In particular, the exercises 4.16, 4.18, 4.19 tell you how to implement different strategies for achieving simultaneous definitions.
The syntax is a bit different, but the idea in the book boils down to transforming this code:
(lambda <vars>
(define u <e1>)
(define v <e2>)
<e3>)
Into this code:
(lambda <vars>
(let ((u '*unassigned*)
(v '*unassigned*))
(set! u <e1>)
(set! v <e2>)
<e3>))
The same idea applies to your with special form. Take a look at the linked book for more implementation details.
In (with (a (+ 2 2))) we are binding a to the value of the expression (+ 2 2), so a becomes 4. But in (with ((f x) (+ x x))) we are doing something else: we are binding f to a function. This is a syntactic sugar for (with (f (lambda (x) (+ x x)))).
To handle this situation, you have to process the bindings in two passes. First collect all the variables and create the environment which contains all of them. Then evaluate the initializing experssions and store their values in the corresponding variables. The evaluation of these expressions takes place in that environment, so every expression has visibility over all the variables. The initialization is done by assignment. The variables can be initially nil or have some trap value which blows up if they are accessed.