Related
This question already has answers here:
setq and defvar in Lisp
(4 answers)
Closed 6 years ago.
I was following a tutorial on lisp and they did the following code
(set 'x 11)
(incf x 10)
and interpreter gave the following error:
; in: INCF X
; (SETQ X #:NEW671)
;
; caught WARNING:
; undefined variable: X
;
; compilation unit finished
; Undefined variable:
; X
; caught 1 WARNING condition
21
what is the proper way to increment x ?
This is indeed how you are meant to increment x, or at least one way of doing so. However it is not how you are meant to bind x. In CL you need to establish a binding for a name before you use it, and you don't do that by just assigning to it. So, for instance, this code (in a fresh CL image) is not legal CL:
(defun bad ()
(setf y 2))
Typically this will cause a compile-time warning and a run-time error, although it may do something else: its behaviour is not defined.
What you have done, in particular, is actually worse than this: you have rammed a value into the symbol-value of x (with set, which does this), and then assumed that something like (incf x) will work, which it is extremely unlikely to do. For instance consider something like this:
(defun worse ()
(let ((x 2))
(set 'x 4)
(incf x)
(values x (symbol-value 'x))))
This is (unlike bad) legal code, but it probably does not do what you want it to do.
Many CL implementations do allow assignment to previously unbound variables at the top-level, because in a conversational environment it is convenient. But the exact meaning of such assignments is outwith the language standard.
CMUCL and its derivatives, including SBCL, have historically been rather more serious about this than other implementations were at the time. I think the reason for this is that the interpreter was a bunch more serious than most others and/or they secretly compiled everything anyway and the compiler picked things up.
A further problem is that CL has slightly awkward semantics for top-level variables: if you go to the effort to establish a toplevel binding in the normal way, with defvar & friends, then you also cause the variable to be special -- dynamically scoped -- and this is a pervasive effect: it makes all bindings of that name special. That is often a quite undesirable consequence. CL, as a language, has no notion of a top-level lexical variable.
What many implementations did, therefore, was to have some kind of informal notion of a top-level binding of something which did not imply a special declaration: if you just said (setf x 3) at the toplevel then this would not contage the entire environment. But then there were all sorts of awkward questions: after doing that, what is the result of (symbol-value 'x) for instance?
Fortunately CL is a powerful language, and it is quite possible to define top-level lexical variables within the language. Here is a very hacky implementation called deflexical. Note that there are better implementations out there (including at least one by me, which I can't find right now): this is not meant to be a bullet-proof solution.
(defmacro deflexical (var &optional value)
;; Define a cheap-and-nasty global lexical variable. In this
;; implementation, global lexicals are not boundp and the global
;; lexical value is not stored in the symbol-value of the symbol.
;;
;; This implementation is *not* properly thought-through and is
;; without question problematic
`(progn
(define-symbol-macro ,var (get ',var 'lexical-value))
(let ((flag (cons nil nil)))
;; assign a value only if there is not one already, like DEFVAR
(when (eq (get ',var 'lexical-value flag) flag)
(setf (get ',var 'lexical-value) ,value))
;; Return the symbol
',var)))
This question already has answers here:
Why do some people use keywords for the clauses in the loop macro?
(3 answers)
Closed 8 years ago.
I just saw the answer of Sylwester to this question, and I thought strange that the loop has colons everywhere.
Usually, I would write
(loop for n below 10 do (princ n) (terpri))
instead of
(loop :for n :below 10 :do (princ n) (terpri))
After some tests, I see that with the first loop, the symbols for, below and do are then part of cl-user (edit : actually not do, only the other two, probably because do is also a macro in the cl package), not with the second. Likewise, a 'X alone will then be part of cl-user, not ':X. The symbol-package function tells me the latter is in the keyword package.
Now, the first loop, without colons, looks much prettier to me, so I would like to know if the preceding remarks are a good reason to use the second one instead. That the symbols become "included" in the current package looks rather inoffensive, but maybe I have overlooked the consequences.
Any idea?
You moslty already answered your own question, the difference is as you described. :some-symbol will be in :KEYWORD package, and 'SOME-SYMBOL will be in your current package CL-USER by default. In loop macro it's just a matter of taste. Some people prefer to use :for notation to get better syntax highlighting in their text editor, for example.
CL-USER 23 > (find-symbol "LOOP" "CL")
LOOP
:EXTERNAL
CL-USER 24 > (find-symbol "FOR" "CL")
NIL
NIL
LOOP is a symbol in the COMMON-LISP package and it is exported. FOR is neither. Thus in every package which does not have a FOR symbol and does not inherit one, one will add such a symbol when writing a LOOP FOR loop.
That's it. Usually that should be no problem...
I'm learning Lisp from the book 'Practical Common Lisp'. At one point, I'm supposed to enter the following bit of code:
[1] (remove-if-not #'evenp '(1 2 3 4 5 6 7 8 9 10))
(2 4 6 8 10)
I suppose the idea here is of course that remove-if-not wants a function that can return either T or NIL when an argument is provided to it, and this function is then applied to all symbols in the list, returning a list containing only those symbols where it returned NIL.
However, if I now write the following code in CLISP:
[2] (remove-if-not 'evenp '(1 2 3 4 5 6 7 8 9 10)
(2 4 6 8 10)
It still works! So my question is, does it even matter whether I use sharp-quote notation, or is just using the quote sufficient? It now seems like the additional sharp is only there to let the programmer know that "Hey, this is a function, not just some random symbol!" - but if it has any other use, I'd love to know about it.
I use GNU CLISP 2.49 (2010-07-07, sheesh that's actually pretty old).
Sharp-quote and quote do not have the same behaviour in the general case:
(defun test () 'red)
(flet ((test () 'green))
(list (funcall 'test)
(funcall #'test))) => (red green)
Calling a quoted symbol will use the function value of the quoted symbol (ie, the result of symbol-function). Calling a sharp-quoted symbol will use the value established by the lexical binding, if any, of the symbol. In the admittedly common case that there is no lexical binding the behaviour will be the same. That's what you are seeing.
You should get into the habit of using sharp-quote. Ignoring function bindings is probably not what you want, and may be confusing to anybody trying to understand your code.
This is not CLISP specific, it works in every Common Lisp implementation (I use Clozure Common Lisp here).
What happens is that if you give a symbol as a function designator then the implementation will look up the symbol-function (assuming the symbol is available in the global environment) for you:
? #'evenp
#<Compiled-function EVENP #x3000000F2D4F>
? (symbol-function 'evenp)
#<Compiled-function EVENP #x3000000F2D4F>
In general you can use either, but there's an interesting effect if you rebind the called function later. If you specify the function (#' or (function)) then the calls will still call the old function because the lookup has been done at compile time; if you use the symbol then you will call the new function because the lookup is re-done at runtime. Note that this may be implementation-specific.
As you have noticed (or read) funcall et. al. are will make an effort to convert the function argument you provide into something approprate. So as you have noticed they will take a symbol and then fetch the symbol-function of that symbol; if that works out they will then invoke that.
Recall that #'X is converted at readtime into (symbol-function x) and 'x into (quote x). It's good practice to have the symbol-function work done at compile time.
But why? Well two trival reasons it is slightly faster and it signals that you don't intend to redefine F's symbol-function after compile time. Another reason is that in a recent Pew Research study 98.3% of Lisp developers prefer it, and 62.3% will shun those that don't do this.
But there's more.
'(lambda (..) ...) is quite different v.s. #'(lambda (..) ...). The first is very likely to end up using eval, i.e. it will be slow. The first runs in a different scope v.s. the second one, i.e. only the second one can see the lexical scope it appears in.
I've learned Clojure previously and really like the language. I also love Emacs and have hacked some simple stuff with Emacs Lisp. There is one thing which prevents me mentally from doing anything more substantial with Elisp though. It's the concept of dynamic scoping. I'm just scared of it since it's so alien to me and smells like semi-global variables.
So with variable declarations I don't know which things are safe to do and which are dangerous. From what I've understood, variables set with setq fall under dynamic scoping (is that right?) What about let variables? Somewhere I've read that let allows you to do plain lexical scoping, but somewhere else I read that let vars also are dynamically scoped.
I quess my biggest worry is that my code (using setq or let) accidentally breaks some variables from platform or third-party code that I call or that after such call my local variables are messed up accidentally. How can I avoid this?
Are there a few simple rules of thumb that I can just follow and know exactly what happens with the scope without being bitten in some weird, hard-to-debug way?
It isn't that bad.
The main problems can appear with 'free variables' in functions.
(defun foo (a)
(* a b))
In above function a is a local variable. b is a free variable. In a system with dynamic binding like Emacs Lisp, b will be looked up at runtime. There are now three cases:
b is not defined -> error
b is a local variable bound by some function call in the current dynamic scope -> take that value
b is a global variable -> take that value
The problems can then be:
a bound value (global or local) is shadowed by a function call, possibly unwanted
an undefined variable is NOT shadowed -> error on access
a global variable is NOT shadowed -> picks up the global value, which might be unwanted
In a Lisp with a compiler, compiling the above function might generate a warning that there is a free variable. Typically Common Lisp compilers will do that. An interpreter won't provide that warning, one just will see the effect at runtime.
Advice:
make sure that you don't use free variables accidentally
make sure that global variables have a special name, so that they are easy to spot in source code, usually *foo-var*
Don't write
(defun foo (a b)
...
(setq c (* a b)) ; where c is a free variable
...)
Write:
(defun foo (a b)
...
(let ((c (* a b)))
...)
...)
Bind all variables you want to use and you want to make sure that they are not bound somewhere else.
That's basically it.
Since GNU Emacs version 24 lexical binding is supported in its Emacs Lisp. See: Lexical Binding, GNU Emacs Lisp Reference Manual.
In addition to the last paragraph of Gilles answer, here is how RMS argues in favor of dynamic scoping in an extensible system:
Some language designers believe that
dynamic binding should be avoided, and
explicit argument passing should be
used instead. Imagine that function A
binds the variable FOO, and calls the
function B, which calls the function
C, and C uses the value of FOO.
Supposedly A should pass the value as
an argument to B, which should pass it
as an argument to C.
This cannot be done in an extensible
system, however, because the author of
the system cannot know what all the
parameters will be. Imagine that the
functions A and C are part of a user
extension, while B is part of the
standard system. The variable FOO does
not exist in the standard system; it
is part of the extension. To use
explicit argument passing would
require adding a new argument to B,
which means rewriting B and everything
that calls B. In the most common case,
B is the editor command dispatcher
loop, which is called from an awful
number of places.
What's worse, C must also be passed an
additional argument. B doesn't refer
to C by name (C did not exist when B
was written). It probably finds a
pointer to C in the command dispatch
table. This means that the same call
which sometimes calls C might equally
well call any editor command
definition. So all the editing
commands must be rewritten to accept
and ignore the additional argument. By
now, none of the original system is
left!
Personally, I think that if there is a problem with Emacs-Lisp, it is not dynamic scoping per se, but that it is the default, and that it is not possible to achieve lexical scoping without resorting to extensions. In CL, both dynamic and lexical scoping can be used, and -- except for top-level (which is adressed by several deflex-implementations) and globally declared special variables -- the default is lexical scoping. In Clojure, too, you can use both lexical and dynamic scoping.
To quote RMS again:
It is not necessary for dynamic scope to be the only scope rule provided, just useful
for it to be available.
Are there a few simple rules of thumb that I can just follow and know exactly what happens with the scope without being bitten in some weird, hard-to-debug way?
Read Emacs Lisp Reference, you'll have many details like this one :
Special Form: setq [symbol form]...
This special form is the most common method of changing a
variable's value. Each SYMBOL is given a new value, which is the
result of evaluating the corresponding FORM. The most-local
existing binding of the symbol is changed.
Here is an example :
(defun foo () (setq tata "foo"))
(defun bar (tata) (setq tata "bar"))
(foo)
(message tata)
===> "foo"
(bar tata)
(message tata)
===> "foo"
As Peter Ajtai pointed out:
Since emacs-24.1 you can enable lexical scoping on a per file basis by putting
;; -*- lexical-binding: t -*-
on top of your elisp file.
First, elisp has separate variable and function bindings, so some pitfalls of dynamic scoping are not relevant.
Second, you can still use setq to set variables, but the value set does not survive the exit of the dynamic scope it is done in. This isn't, fundamentally, different from lexical scoping, with the difference that with dynamic scoping a setq in a function you call can affect the value you see after the function call.
There's lexical-let, a macro that (essentially) imitates lexical bindings (I believe it does this by walking the body and changing all occurrences of the lexically let variables to a gensymmed name, eventually uninterning the symbol), if you absolutely need to.
I'd say "write code as normal". There are times when the dynamic nature of elisp will bite you, but I've found that in practice that is surprisingly seldom.
Here's an example of what I was saying about setq and dynamically-bound variables (recently evaluated in a nearby scratch buffer):
(let ((a nil))
(list (let ((a nil))
(setq a 'value)
a)
a))
(value nil)
Everything that has been written here is worthwhile. I would add this: get to know Common Lisp -- if nothing else, read about it. CLTL2 presents lexical and dynamic binding well, as do other books. And Common Lisp integrates them well in a single language.
If you "get it" after some exposure to Common Lisp then things will be clearer for you for Emacs Lisp. Emacs 24 uses lexical scoping to a greater extent by default than older versions, but Common Lisp's approach will still be clearer and cleaner (IMHO). Finally, it is definitely the case that dynamic scope is important for Emacs Lisp, for the reasons that RMS and others have emphasized.
So my suggestion is to get to know how Common Lisp deals with this. Try to forget about Scheme, if that is your main mental model of Lisp -- it will limit you more than help you in understanding scoping, funargs, etc. in Emacs Lisp. Emacs Lisp, like Common Lisp, is "dirty and low-down"; it is not Scheme.
Dynamic and lexical scoping have different behaviors when a piece of code is used in a different scope than the one it was defined in. In practice, there are two patterns that cover most troublesome cases:
A function shadows a global variable, then calls another function that uses that global variable.
(defvar x 3)
(defun foo ()
x)
(defun bar (x)
(+ (foo) x))
(bar 0) ⇒ 0
This doesn't come up often in Emacs because local variables tend to have short names (often single-word) whereas global variables tend to have long names (often prefixed by packagename-). Many standard functions have names that are tempting to use as local variables like list and point, but functions and variables live in separate name spaces are local functions are not used very often.
A function is defined in one lexical context and used outside this lexical context because it's passed to a higher-order function.
(let ((cl-y 10))
(mapcar* (lambda (elt) (* cl-y elt)) '(1 2 3)))
⇒ (10 20 30)
(let ((cl-x 10))
(mapcar* (lambda (elt) (* cl-x elt)) '(1 2 3)))
⇑ (wrong-type-argument number-or-marker-p (1 2 3))
The error is due to the use of cl-x as a variable name in mapcar* (from the cl package). Note that the cl package uses cl- as a prefix even for its local variables in higher-order functions. This works reasonably well in practice, as long as you take care not to use the same variable as a global name and as a local name, and you don't need to write a recursive higher-order function.
P.S. Emacs Lisp's age isn't the only reason why it's dynamically scoped. True, in those days, lisps tended towards dynamic scoping — Scheme and Common Lisp hadn't really taken on yet. But dynamic scoping is also an asset in a language targeted towards extending a system dynamically: it lets you hook into more places without any special effort. With great power comes great rope to hang yourself: you risk accidentally hooking into a place you didn't know about.
The other answers are good at explaining the technical details on how to work with dynamic scoping, so here's my non-technical advice:
Just do it
I've been tinkering with Emacs lisp for 15+ years and don't know that I've ever been bitten by any problems due to the differences between lexical/dynamic scope.
Personally, I've not found the need for closures (I love 'em, just don't need them for Emacs). And, I generally try to avoid global variables in general (whether the scoping was lexical or dynamic).
So I suggest jumping in and writing customizations that suit your needs/desires, chances are you won't have any problems.
I entirely feel your pain. I find the lack of lexical binding in emacs rather annoying - especially not being able to use lexical closures, which seems to be a solution I think of a lot, coming from more modern languages.
While I don't have any more advice on working around the lacking features that the previous answers didn't cover yet, I'd like to point out the existance of an emacs branch called `lexbind', implementing lexical binding in a backward-compatible way. In my experience lexical closures are still a little buggy in some circumstances, but that branch appears to a promising approach.
Just don't.
Emacs-24 lets you use lexical-scope. Just run
(setq lexical-binding t)
or add
;; -*- lexical-binding: t -*-
at the beginning of your file.
So... I'm new to scheme r6rs, and am learning macros. Can somebody explain to me what is meant by 'hygiene'?
Thanks in advance.
Hygiene is often used in the context of macros. A hygienic macro doesn't use variable names that can risk interfering with the code under expansion. Here is an example. Let's say we want to define the or special form with a macro. Intuitively,
(or a b c ... d) would expand to something like (let ((tmp a)) (if tmp a (or b c ... d))). (I am omitting the empty (or) case for simplicity.)
Now, if the name tmp was actually added in the code like in the above sketched expansion, it would be not hygienic, and bad because it might interfere with another variable with the same name. Say, we wanted to evaluate
(let ((tmp 1)) (or #f tmp))
Using our intuitive expansion, this would become
(let ((tmp 1)) (let ((tmp #f)) (if tmp (or tmp)))
The tmp from the macro shadows the outer-most tmp, and so the result is #f instead of 1.
Now, if the macro was hygienic (and in Scheme, it's automatically the case when using syntax-rules), then instead of using the name tmp for the expansion, you would use a symbol that is guaranteed not to appear anywhere else in the code. You can use gensym in Common Lisp.
Paul Graham's On Lisp has advanced material on macros.
If you imagine that a macro is simply expanded into the place where it is used, then you can also imagine that if you use a variable a in your macro, there might already be a variable a defined at the place where that macro is used.
This is not the a that you want!
A macro system in which something like this cannot happen, is called hygienic.
There are several ways to deal with this problem. One way is simply to use very long, very cryptic, very unpredictable variable names in your macros.
A slightly more refined version of this is the gensym approach used by some other macro systems: instead of you, the programmer coming up with a very long, very cryptic, very unpredictable variable name, you can call the gensym function which generates a very long, very cryptic, very unpredictable and unique variable name for you.
And like I said, in a hygienic macro system, such collisions cannot happen in the first place. How to make a macro system hygienic is an interesting question in itself, and the Scheme community has spent several decades on this question, and they keep coming up with better and better ways to do it.
I'm so glad to know that this language is still being used! Hygienic code is code that when injected (via a macro) does not cause conflicts with existing variables.
There is lots of good information on Wikipedia about this: http://en.wikipedia.org/wiki/Hygienic_macro
Here's what I found. Explaining what it means is another matter altogether!
http://www.r6rs.org/final/html/r6rs-lib/r6rs-lib-Z-H-1.html#node_toc_node_sec_12.1
Macros transform code: they take one bit of code and transform it into something else. As part of that transformation, they may surround that code with more code. If the original code references a variable a, and the code that's added around it defines a new version of a, then the original code won't work as expected because it will be accessing the wrong a: if
(myfunc a)
is the original code, which expects a to be an integer, and the macro takes X and transforms it to
(let ((a nil)) X)
Then the macro will work fine for
(myfunc b)
but (myfunc a) will get transformed to
(let ((a nil)) (myfunc a))
which won't work because myfunc will be applied to nil rather than the integer it is expecting.
A hygienic macro avoids this problem of the wrong variable getting accessed (and a similar problem the other way round), by ensuring that the names used are unique.
Wikipedia has a good explanation of hygienic macros.
Apart from all the things mentioned, there is one important other thing to Scheme's hygienic macros, which follow from the lexical scope.
Say we have:
(syntax-rules () ((_ a b) (+ a b)))
As part of a macro, surely it will insert the +, it will also insert it when there's a + already there, but then another symbol which has the same meaning as +. It binds symbols to the value they had in the lexical environment in which the syntax-rules lies, not where it is applied, we are lexically scoped after all. It will most likely insert a completely new symbol there, but one which is globally bound to the same meaning as + is at the place the macro is defined. This is most handy when we use a construct like:
(let ((+ *))
; piece of code that is transformed
)
The writer, or user of the macro thus needn't be occupied with ensuring its use goes well.