What is the difference between define vs define-syntax and let vs let-syntax when value is syntax-rules? - macros

I'm in a process of implementing Hygienic macros in my Scheme implementation, I've just implemented syntax-rules, but I have this code:
(define odd?
(syntax-rules ()
((_ x) (not (even? x)))))
what should be the difference between that and this:
(define-syntax odd?
(syntax-rules ()
((_ x) (not (even? x)))))
from what I understand syntax-rules just return syntax transformer, why you can't just use define to assign that to symbol? Why I need to use define-syntax? What extra stuff that expression do?
Should first also work in scheme? Or only the second one?
Also what is the difference between let vs let-syntax and letrec vs letrec-syntax. Should (define|let|letrec)-syntax just typecheck if the value is syntax transformer?
EDIT:
I have this implementation, still using lisp macros:
;; -----------------------------------------------------------------------------
(define-macro (let-syntax vars . body)
`(let ,vars
,#(map (lambda (rule)
`(typecheck "let-syntax" ,(car rule) "syntax"))
vars)
,#body))
;; -----------------------------------------------------------------------------
(define-macro (letrec-syntax vars . body)
`(letrec ,vars
,#(map (lambda (rule)
`(typecheck "letrec-syntax" ,(car rule) "syntax"))
vars)
,#body))
;; -----------------------------------------------------------------------------
(define-macro (define-syntax name expr)
(let ((expr-name (gensym)))
`(define ,name
(let ((,expr-name ,expr))
(typecheck "define-syntax" ,expr-name "syntax")
,expr-name))))
This this code correct?
Should this code works?
(let ((let (lambda (x) x)))
(let-syntax ((odd? (syntax-rules ()
((_ x) (not (even? x))))))
(odd? 11)))

This question seems to imply some deep confusion about macros.
Let's imagine a language where syntax-rules returns some syntax transformer function (I am not sure this has to be true in RnRS Scheme, it is true in Racket I think), and where let and let-syntax were the same.
So let's write this function:
(define (f v)
(let ([g v])
(g e (i 10)
(if (= i 0)
i
(e (- i 1))))))
Which we can turn into this, of course:
(define (f v n)
(v e (i n)
(if (<= i 0)
i
(e (- i 1)))))
And I will tell you in addition that there is no binding for e or i in the environment.
What is the interpreter meant to do with this definition? Could it compile it? Could it safely infer that i can't possibly make any sense since it is used as a function and then as a number? Can it safely do anything at all?
The answer is that no, it can't. Until it knows what the argument to the function is it can't do anything. And this means that each time f is called it has to make that decision again. In particular, v might be:
(syntax-rules ()
[(_ name (var init) form ...)
(letrec ([name (λ (var)
form ...)])
(name init))]))
Under which the definition of f does make some kind of sense.
And things get worse: much worse. How about this?
(define (f v1 v2 n)
(let ([v v1])
(v e (i n)
...
(set! v (if (eq? v v1) v2 v1))
...)))
What this means is that a system like this wouldn't know what the code it was meant to interpret meant until, the moment it was interpreting it, or even after that point, as you can see from the second function above.
So instead of this horror, Lisps do something sane: they divide the process of evaluating bits of code into phases where each phase happens, conceptually, before the next one.
Here's a sequence for some imagined Lisp (this is kind of close to what CL does, since most of my knowledge is of that, but it is not intended to represent any particular system):
there's a phase where the code is turned from some sequence of characters to some object, possibly with the assistance of user-defined code;
there's a phase where that object is rewritten into some other object by user- and system-defined code (macros) – the result of this phase is something which is expressed in terms of functions and some small number of primitive special things, traditionally called 'special forms' which are known to the processes of stage 3 and 4;
there may be a phase where the object from phase 2 is compiled, and that phase may involve another set of user-defined macros (compiler macros);
there is a phase where the resulting code is evaluated.
And for each unit of code these phases happen in order, each phase completes before the next one begins.
This means that each phase in which the user can intervene needs its own set of defining and binding forms: it needs to be possible to say that 'this thing controls what happens at phase 2' for instance.
That's what define-syntax, let-syntax &c do: they say that 'these bindings and definitions control what happens at phase 2'. You can't, for instance, use define or let to do that, because at phase 2, these operations don't yet have meaning: they gain meaning (possibly by themselves being macros which expand to some primitive thing) only at phase 3. At phase 2 they are just bits of syntax which the macro is ingesting and spitting out.

Related

In Racket, can we print the source code of a function returned by another high-order one? [duplicate]

In one variant of Common Lisp (I think it was CMUCL, but I might be wrong—I can't find it any more) there was a function that was (I think) called function-lambda-expression. If it got a procedure, it would print out the lambda expression that had generated it. Example:
(let ((my-thunk (lambda () (+ 1 2))))
(write my-thunk)
(write (function-lambda-expression my-thunk)))
This would print out something like:
#<PROCEDURE>
(LAMBDA () (+ 1 2))
It was terribly useful for debugging and exploring the language.
I'm looking for a function like this in Racket. I've looked through the Racket Documentation but I can't find anything like it. (I wouldn't be surprised if I overlooked it, however.) Is there an equivalent in Racket?
No. Racket's lambda produces a closure that does not remember its S-expression (or syntax object) form. It does typically remember its name (or its abbreviated source location, if no name can be inferred), and that's often enough to help with debugging. (See object-name.)
You can build your own variant of lambda that has this feature, using Racket's applicable structures and a simple macro. Here's a basic example:
#lang racket
(struct exp-closure (f exp)
#:property prop:procedure (struct-field-index f))
(define-syntax-rule (exp-lambda formals . body)
(exp-closure (lambda formals . body)
(quote (exp-lambda formals . body))))
(let ([my-thunk (exp-lambda () (+ 1 2))])
(printf "fun is ~v\n" my-thunk)
(printf "exp is ~v\n" (exp-closure-exp my-thunk))
(printf "result is ~v\n" (my-thunk)))
This produces
fun is #<procedure:...tmp/lambda.rkt:11:19>
exp is '(exp-lambda () (+ 1 2))
result is 3
A better version of this macro might propagate the source location of the macro use to the lambda expression it creates, or the inferred name (see syntax-local-infer-name), or both.

Is it possible for a variable transformer to work with non-literal tokens?

make-variable-transformer (or make-set!-transformer, as it is called in Racket) can work with identifiers hardcoded in the literal list of the macro definition. The following example with set! invariably comes up when variable transformers are mentioned:
(make-variable-transformer
(λ (stx)
(syntax-case stx (set!)
((set! id _) ...)
(id ...))))
This is nice and all, useful for transparently integrating foreign structures with primitive operations known ahead of time, and it's a bonus that it can work through identifier syntax and rename transformers.
But what I'm wondering is if it's possible to work with syntax dynamically like:
(let-syntax ((# (make-variable-transformer
(λ (stx)
(syntax-case stx ()
((v # i) (vector? #'v) #'(vector-ref v i)))))))
(#(0 1 2) # 1))
=> 1
This doesn't work because the macro call doesn't match the template as syntax-case expects # to be in the initial position since there's no v in the literal list (and it probably assigns # to v pattern variable).
In short: Is it possible to write a syntax transformer that can accomplish this without reader extensions or overriding application, perhaps through a metamacro that rewrites the literal token list of an inner syntax-case (à la Petrofsky extraction)?
NB: The vector example itself is unimportant and I'm not interested in alternative solutions to this exact use-case.
since there's no v in the literal list (and it probably assigns # to v pattern variable).
Not really. set! is a special case that the macro expander handles specifically to make it cooperate with make-variable-transformer. But for other literals, they will fail. E.g.,
(let-syntax ((# (make-variable-transformer
(λ (stx)
(syntax-case stx (v)
((v # i) #'1))))))
(v # 1))
fails with v: unbound identifier.
The second issue with your above code is the side condition (vector? #'v). #'v is a syntax object, so (vector? #'v) will always result in #f. It's unclear what is the right behavior. For example, do you intend for:
(define v (vector 1 2 3))
(v # 1)
to work? If so, a compile-time side condition would be inappropriate, because it's not known if v is a vector at compile-time.
For your main question, the answer is no. It's not possible under the constraints that you imposed. The expansion steps are detailed here, and none of the steps looks beyond the head of the list.
But if we don't care about the constraints. I.e., overriding #%app is OK. It could work.
An issue that you need to think about is, suppose you have (a b c) where b is your kind of macro and a is a regular macro. Who should get the control first? If a should get the control first, you can override #%app to implement this kind of macro. Here's my quick implementation.
#lang racket
(require syntax/parse/define
(only-in racket [#%app racket:#%app])
(for-syntax syntax/apply-transformer))
(begin-for-syntax
(struct my-transformer (t)))
(define-syntax-parser #%app
[(_ x ...)
(define transformer
(for/first ([operand (attribute x)]
#:when (and (identifier? operand)
(my-transformer?
(syntax-local-value operand (λ () #f)))))
(syntax-local-value operand)))
(cond
[transformer (local-apply-transformer
(my-transformer-t transformer)
#'(x ...)
'expression)]
[else #'(racket:#%app x ...)])])
(define-syntax #
(my-transformer
(syntax-parser
[(v _ i) #'(vector-ref v i)])))
(define v (vector 42 1337 1729))
(v # 1) ;=> 1337
Finally, you can always override #%module-begin and simulate the macro expander. It's an overkill solution, but could be appropriate if you want more advanced features, like allowing users to customize precedence so that b is expanded before a.

SICP: Can or be defined in lisp as a syntactic transformation without gensym?

I am trying to solve the last part of question 4.4 of the Structure and Interpretation of computer programming; the task is to implement or as a syntactic transformation. Only elementary syntactic forms are defined; quote, if, begin, cond, define, apply and lambda.
(or a b ... c) is equal to the first true value or false if no value is true.
The way I want to approach it is to transform for example (or a b c) into
(if a a (if b b (if c c false)))
the problem with this is that a, b, and c would be evaluated twice, which could give incorrect results if any of them had side-effects. So I want something like a let
(let ((syma a))
(if syma syma (let ((symb b))
(if symb symb (let ((symc c))
(if (symc symc false)) )) )) )
and this in turn could be implemented via lambda as in Exercise 4.6. The problem now is determining symbols syma, symb and symc; if for example the expression b contains a reference to the variable syma, then the let will destroy the binding. Thus we must have that syma is a symbol not in b or c.
Now we hit a snag; the only way I can see out of this hole is to have symbols that cannot have been in any expression passed to eval. (This includes symbols that might have been passed in by other syntactic transformations).
However because I don't have direct access to the environment at the expression I'm not sure if there is any reasonable way of producing such symbols; I think Common Lisp has the function gensym for this purpose (which would mean sticking state in the metacircular interpreter, endangering any concurrent use).
Am I missing something? Is there a way to implement or without using gensym? I know that Scheme has it's own hygenic macro system, but I haven't grokked how it works and I'm not sure whether it's got a gensym underneath.
I think what you might want to do here is to transform to a syntactic expansion where the evaluation of the various forms aren't nested. You could do this, e.g., by wrapping each form as a lambda function and then the approach that you're using is fine. E.g., you can do turn something like
(or a b c)
into
(let ((l1 (lambda () a))
(l2 (lambda () b))
(l3 (lambda () c)))
(let ((v1 (l1)))
(if v1 v1
(let ((v2 (l2)))
(if v2 v2
(let ((v3 (l3)))
(if v3 v3
false)))))))
(Actually, the evaluation of the lambda function calls are still nested in the ifs and lets, but the definition of the lambda functions are in a location such that calling them in the nested ifs and lets doesn't cause any difficulty with captured bindings.) This doesn't address the issue of how you get the variables l1–l3 and v1–v3, but that doesn't matter so much, none of them are in scope for the bodies of the lambda functions, so you don't need to worry about whether they appear in the body or not. In fact, you can use the same variable for all the results:
(let ((l1 (lambda () a))
(l2 (lambda () b))
(l3 (lambda () c)))
(let ((v (l1)))
(if v v
(let ((v (l2)))
(if v v
(let ((v (l3)))
(if v v
false)))))))
At this point, you're really just doing loop unrolling of a more general form like:
(define (functional-or . functions)
(if (null? functions)
false
(let ((v ((first functions))))
(if v v
(functional-or (rest functions))))))
and the expansion of (or a b c) is simply
(functional-or (lambda () a) (lambda () b) (lambda () c))
This approach is also used in an answer to Why (apply and '(1 2 3)) doesn't work while (and 1 2 3) works in R5RS?. And none of this required any GENSYMing!
In SICP you are given two ways of implementing or. One that handles them as special forms which is trivial and one as derived expressions. I'm unsure if they actually thought you would see this as a problem, but you can do it by implementing gensym or altering variable? and how you make derived variables like this:
;; a unique tag to identify special variables
(define id (vector 'id))
;; a way to make such variable
(define (make-var x)
(list id x))
;; redefine variable? to handle macro-variables
(define (variable? exp)
(or (symbol? exp)
(tagged-list? exp id)))
;; makes combinations so that you don't evaluate
;; every part twice in case of side effects (set!)
(define (or->combination terms)
(if (null? terms)
'false
(let ((tmp (make-var 'tmp)))
(list (make-lambda (list tmp)
(list (make-if tmp
tmp
(or->combination (cdr terms)))))
(car terms)))))
;; My original version
;; This might not be good since it uses backquotes not introduced
;; until chapter 5 and uses features from exercise 4.6
;; Though, might be easier to read for some so I'll leave it.
(define (or->combination terms)
(if (null? terms)
'false
(let ((tmp (make-var 'tmp)))
`(let ((,tmp ,(car terms)))
(if ,tmp
,tmp
,(or->combination (cdr terms)))))))
How it works is that make-var creates a new list every time it is called, even with the same argument. Since it has id as it's first element variable? will identify it as a variable. Since it's a list it will only match in variable lookup with eq? if it is the same list, so several nested or->combination tmp-vars will all be seen as different by lookup-variable-value since (eq? (list) (list)) => #f and special variables being lists they will never shadow any symbol in code.
This is influenced by eiod, by Al Petrofsky, which implements syntax-rules in a similar manner. Unless you look at others implementations as spoilers you should give it a read.

scheme continuations for dummies

For the life of me, I can't understand continuations. I think the problem stems from the fact that I don't understand is what they are for. All the examples that I've found in books or online are very trivial. They make me wonder, why anyone would even want continuations?
Here's a typical impractical example, from TSPL, which I believe is quite recognized book on the subject. In english, they describe the continuation as "what to do" with the result of a computation. OK, that's sort of understandable.
Then, the second example given:
(call/cc
(lambda (k)
(* 5 (k 4)))) => 4
How does this make any sense?? k isn't even defined! How can this code be evaluated, when (k 4) can't even be computed? Not to mention, how does call/cc know to rip out the argument 4 to the inner most expression and return it? What happens to (* 5 .. ?? If this outermost expression is discarded, why even write it?
Then, a "less" trivial example stated is how to use call/cc to provide a nonlocal exit from a recursion. That sounds like flow control directive, ie like break/return in an imperative language, and not a computation.
And what is the purpose of going through these motions? If somebody needs the result of computation, why not just store it and recall later, as needed.
Forget about call/cc for a moment. Every expression/statement, in any programming language, has a continuation - which is, what you do with the result. In C, for example,
x = (1 + (2 * 3));
printf ("Done");
has the continuation of the math assignment being printf(...); the continuation of (2 * 3) is 'add 1; assign to x; printf(...)'. Conceptually the continuation is there whether or not you have access to it. Think for a moment what information you need for the continuation - the information is 1) the heap memory state (in general), 2) the stack, 3) any registers and 4) the program counter.
So continuations exist but usually they are only implicit and can't be accessed.
In Scheme, and a few other languages, you have access to the continuation. Essentially, behind your back, the compiler+runtime bundles up all the information needed for a continuation, stores it (generally in the heap) and gives you a handle to it. The handle you get is the function 'k' - if you call that function you will continue exactly after the call/cc point. Importantly, you can call that function multiple times and you will always continue after the call/cc point.
Let's look at some examples:
> (+ 2 (call/cc (lambda (cont) 3)))
5
In the above, the result of call/cc is the result of the lambda which is 3. The continuation wasn't invoked.
Now let's invoke the continuation:
> (+ 2 (call/cc (lambda (cont) (cont 10) 3)))
12
By invoking the continuation we skip anything after the invocation and continue right at the call/cc point. With (cont 10) the continuation returns 10 which is added to 2 for 12.
Now let's save the continuation.
> (define add-2 #f)
> (+ 2 (call/cc (lambda (cont) (set! add-2 cont) 3)))
5
> (add-2 10)
12
> (add-2 100)
102
By saving the continuation we can use it as we please to 'jump back to' whatever computation followed the call/cc point.
Often continuations are used for a non-local exit. Think of a function that is going to return a list unless there is some problem at which point '() will be returned.
(define (hairy-list-function list)
(call/cc
(lambda (cont)
;; process the list ...
(when (a-problem-arises? ...)
(cont '()))
;; continue processing the list ...
value-to-return)))
Here is text from my class notes: http://tmp.barzilay.org/cont.txt. It is based on a number of sources, and is much extended. It has motivations, basic explanations, more advanced explanations for how it's done, and a good number of examples that go from simple to advanced, and even some quick discussion of delimited continuations.
(I tried to play with putting the whole text here, but as I expected, 120k of text is not something that makes SO happy.
TL;DR: continuations are just captured GOTOs, with values, more or less.
The exampe you ask about,
(call/cc
(lambda (k)
;;;;;;;;;;;;;;;;
(* 5 (k 4)) ;; body of code
;;;;;;;;;;;;;;;;
)) => 4
can be approximately translated into e.g. Common Lisp, as
(prog (k retval)
(setq k (lambda (x) ;; capture the current continuation:
(setq retval x) ;; set! the return value
(go EXIT))) ;; and jump to exit point
(setq retval ;; get the value of the last expression,
(progn ;; as usual, in the
;;;;;;;;;;;;;;;;
(* 5 (funcall k 4)) ;; body of code
;;;;;;;;;;;;;;;;
))
EXIT ;; the goto label
(return retval))
This is just an illustration; in Common Lisp we can't jump back into the PROG tagbody after we've exited it the first time. But in Scheme, with real continuations, we can. If we set some global variable inside the body of function called by call/cc, say (setq qq k), in Scheme we can call it at any later time, from anywhere, re-entering into the same context (e.g. (qq 42)).
The point is, the body of call/cc form may contain an if or a condexpression. It can call the continuation only in some cases, and in others return normally, evaluating all expressions in the body of code and returning the last one's value, as usual. There can be deep recursion going on there. By calling the captured continuation an immediate exit is achieved.
So we see here that k is defined. It is defined by the call/cc call. When (call/cc g) is called, it calls its argument with the current continuation: (g the-current-continuation). the current-continuation is an "escape procedure" pointing at the return point of the call/cc form. To call it means to supply a value as if it were returned by the call/cc form itself.
So the above results in
((lambda(k) (* 5 (k 4))) the-current-continuation) ==>
(* 5 (the-current-continuation 4)) ==>
; to call the-current-continuation means to return the value from
; the call/cc form, so, jump to the return point, and return the value:
4
I won't try to explain all the places where continuations can be useful, but I hope that I can give brief examples of main place where I have found continuations useful in my own experience. Rather than speaking about Scheme's call/cc, I'd focus attention on continuation passing style. In some programming languages, variables can be dynamically scoped, and in languages without dynamically scoped, boilerplate with global variables (assuming that there are no issues of multi-threaded code, etc.) can be used. For instance, suppose there is a list of currently active logging streams, *logging-streams*, and that we want to call function in a dynamic environment where *logging-streams* is augmented with logging-stream-x. In Common Lisp we can do
(let ((*logging-streams* (cons logging-stream-x *logging-streams*)))
(function))
If we don't have dynamically scoped variables, as in Scheme, we can still do
(let ((old-streams *logging-streams*))
(set! *logging-streams* (cons logging-stream-x *logging-streams*)
(let ((result (function)))
(set! *logging-streams* old-streams)
result))
Now lets assume that we're actually given a cons-tree whose non-nil leaves are logging-streams, all of which should be in *logging-streams* when function is called. We've got two options:
We can flatten the tree, collect all the logging streams, extend *logging-streams*, and then call function.
We can, using continuation passing style, traverse the tree, gradually extending *logging-streams*, finally calling function when there is no more tree to traverse.
Option 2 looks something like
(defparameter *logging-streams* '())
(defun extend-streams (stream-tree continuation)
(cond
;; a null leaf
((null stream-tree)
(funcall continuation))
;; a non-null leaf
((atom stream-tree)
(let ((*logging-streams* (cons stream-tree *logging-streams*)))
(funcall continuation)))
;; a cons cell
(t
(extend-streams (car stream-tree)
#'(lambda ()
(extend-streams (cdr stream-tree)
continuation))))))
With this definition, we have
CL-USER> (extend-streams
'((a b) (c (d e)))
#'(lambda ()
(print *logging-streams*)))
=> (E D C B A)
Now, was there anything useful about this? In this case, probably not. Some minor benefits might be that extend-streams is tail-recursive, so we don't have a lot of stack usage, though the intermediate closures make up for it in heap space. We do have the fact that the eventual continuation is executed in the dynamic scope of any intermediate stuff that extend-streams set up. In this case, that's not all that important, but in other cases it can be.
Being able to abstract away some of the control flow, and to have non-local exits, or to be able to pick up a computation somewhere from a while back, can be very handy. This can be useful in backtracking search, for instance. Here's a continuation passing style propositional calculus solver for formulas where a formula is a symbol (a propositional literal), or a list of the form (not formula), (and left right), or (or left right).
(defun fail ()
'(() () fail))
(defun satisfy (formula
&optional
(positives '())
(negatives '())
(succeed #'(lambda (ps ns retry) `(,ps ,ns ,retry)))
(retry 'fail))
;; succeed is a function of three arguments: a list of positive literals,
;; a list of negative literals. retry is a function of zero
;; arguments, and is used to `try again` from the last place that a
;; choice was made.
(if (symbolp formula)
(if (member formula negatives)
(funcall retry)
(funcall succeed (adjoin formula positives) negatives retry))
(destructuring-bind (op left &optional right) formula
(case op
((not)
(satisfy left negatives positives
#'(lambda (negatives positives retry)
(funcall succeed positives negatives retry))
retry))
((and)
(satisfy left positives negatives
#'(lambda (positives negatives retry)
(satisfy right positives negatives succeed retry))
retry))
((or)
(satisfy left positives negatives
succeed
#'(lambda ()
(satisfy right positives negatives
succeed retry))))))))
If a satisfying assignment is found, then succeed is called with three arguments: the list of positive literals, the list of negative literals, and function that can retry the search (i.e., attempt to find another solution). For instance:
CL-USER> (satisfy '(and p (not p)))
(NIL NIL FAIL)
CL-USER> (satisfy '(or p q))
((P) NIL #<CLOSURE (LAMBDA #) {1002B99469}>)
CL-USER> (satisfy '(and (or p q) (and (not p) r)))
((R Q) (P) FAIL)
The second case is interesting, in that the third result is not FAIL, but some callable function that will try to find another solution. In this case, we can see that (or p q) is satisfiable by making either p or q true:
CL-USER> (destructuring-bind (ps ns retry) (satisfy '(or p q))
(declare (ignore ps ns))
(funcall retry))
((Q) NIL FAIL)
That would have been very difficult to do if we weren't using a continuation passing style where we can save the alternative flow and come back to it later. Using this, we can do some clever things, like collect all the satisfying assignments:
(defun satisfy-all (formula &aux (assignments '()) retry)
(setf retry #'(lambda ()
(satisfy formula '() '()
#'(lambda (ps ns new-retry)
(push (list ps ns) assignments)
(setf retry new-retry))
'fail)))
(loop while (not (eq retry 'fail))
do (funcall retry)
finally (return assignments)))
CL-USER> (satisfy-all '(or p (or (and q (not r)) (or r s))))
(((S) NIL) ; make S true
((R) NIL) ; make R true
((Q) (R)) ; make Q true and R false
((P) NIL)) ; make P true
We could change the loop a bit and get just n assignments, up to some n, or variations on that theme. Often times continuation passing style is not needed, or can make code hard to maintain and understand, but in the cases where it is useful, it can make some otherwise very difficult things fairly easy.

Scheme macro triggered by keyword which is not the head of a list

Suppose I want to trigger a Scheme macro on something other than the first item in an s-expression. For example, suppose that I wanted to replace define with an infix-style :=, so that:
(a := 5) -> (define a 5)
((square x) := (* x x)) -> (define (square x) (* x x))
The actual transformation seems to be quite straightforward. The trick will be getting Scheme to find the := expressions and macro-expand them. I've thought about surrounding large sections of code that use the infix syntax with a standard macro, maybe: (with-infix-define expr1 expr2 ...), and having the standard macro walk through the expressions in its body and perform any necessary transformations. I know that if I take this approach, I'll have to be careful to avoid transforming lists that are actually supposed to be data, such as quoted lists, and certain sections of quasiquoted lists. An example of what I envision:
(with-infix-define
((make-adder n) := (lambda (m) (+ n m)))
((foo) :=
(add-3 := (make-adder 3))
(add-6 := (make-adder 6))
(let ((a 5) (b 6))
(+ (add-3 a) (add-6 b))))
(display (foo))
(display '(This := should not be transformed))
So, my question is two-fold:
If I take the with-infix-define route, do I have to watch out for any stumbling blocks other than quote and quasiquote?
I feel a bit like I'm reinventing the wheel. This type of code walk seems like exactly what standard macro expanding systems would have to do - the only difference is that they only look at the first item in a list when deciding whether or not to do any code transformation. Is there any way I can just piggyback on existing systems?
Before you continue with this, it's best to think things over thoroughly -- IME you'd often find that what you really want a reader-level handling of := as an infix syntax. That will of course mean that it's also infix in quotations etc, so it would seem bad for now, but again, my experience is that you end up realizing that it's better to do things consistently.
For completeness, I'll mention that in Racket there's a read-syntax hack for infix-like expressions: (x . define . 1) is read as (define x 1). (And as above, it works everywhere.)
Otherwise, your idea of a wrapping macro is pretty much the only thing you can do. This doesn't make it completely hopeless though, you might have a hook into your implementation's expander that can allow you to do such things -- for example, Racket has a special macro called #%module-begin that wraps a complete module body and #%top-interaction that wraps toplevel expressions on the REPL. (Both of these are added implicitly in the two contexts.) Here's an example (I'm using Racket's define-syntax-rule for simplicity):
#lang racket/base
(provide (except-out (all-from-out racket/base)
#%module-begin #%top-interaction)
(rename-out [my-module-begin #%module-begin]
[my-top-interaction #%top-interaction]))
(define-syntax infix-def
(syntax-rules (:= begin)
[(_ (begin E ...)) (begin (infix-def E) ...)]
[(_ (x := E ...)) (define x (infix-def E) ...)]
[(_ E) E]))
(define-syntax-rule (my-module-begin E ...)
(#%module-begin (infix-def E) ...))
(define-syntax-rule (my-top-interaction . E)
(#%top-interaction . (infix-def E)))
If I put this in a file called my-lang.rkt, I can now use it as follows:
#lang s-exp "my-lang.rkt"
(x := 10)
((fib n) :=
(done? := (<= n 1))
(if done? n (+ (fib (- n 1)) (fib (- n 2)))))
(fib x)
Yes, you need to deal with a bunch of things. Two examples in the above are handling begin expressions and handling function bodies. This is obviously a very partial list -- you'll also want bodies of lambda, let, etc. But this is still better than some blind massaging, since that's just not practical as you can't really tell in advance how some random piece of code will end up. As an easy example, consider this simple macro:
(define-syntax-rule (track E)
(begin (eprintf "Evaluating: ~s\n" 'E)
E))
(x := 1)
The upshot of this is that for a proper solution, you need some way to pre-expand the code, so that you can then scan it and deal with the few known core forms in your implmenetation.
Yes, all of this is repeating work that macro expanders do, but since you're changing how expansion works, there's no way around this. (To see why it's a fundamental change, consider something like (if := 1) -- is this a conditional expression or a definition? How do you decide which one takes precedence?) For this reason, for languages with such "cute syntax", a more popular approach is to read and parse the code into plain S-expressions, and then let the actual language implementation use plain functions and macros.
Redefining define is a little complicated. See #Eli's excellent explanation.
If on the other hand, you are content with := to use set! things are a little simpler.
Here is a small example:
#lang racket
(module assignment racket
(provide (rename-out [app #%app]))
(define-syntax (app stx)
(syntax-case stx (:=)
[(_ id := expr)
(identifier? #'id)
(syntax/loc stx (set! id expr))]
[(_ . more)
(syntax/loc stx (#%app . more))])))
(require 'assignment)
(define x 41)
(x := (+ x 1))
(displayln x)
To keep the example to a single file, I used submodules (available in the prerelease version of Racket).