Is it possible for a variable transformer to work with non-literal tokens? - macros

make-variable-transformer (or make-set!-transformer, as it is called in Racket) can work with identifiers hardcoded in the literal list of the macro definition. The following example with set! invariably comes up when variable transformers are mentioned:
(make-variable-transformer
(λ (stx)
(syntax-case stx (set!)
((set! id _) ...)
(id ...))))
This is nice and all, useful for transparently integrating foreign structures with primitive operations known ahead of time, and it's a bonus that it can work through identifier syntax and rename transformers.
But what I'm wondering is if it's possible to work with syntax dynamically like:
(let-syntax ((# (make-variable-transformer
(λ (stx)
(syntax-case stx ()
((v # i) (vector? #'v) #'(vector-ref v i)))))))
(#(0 1 2) # 1))
=> 1
This doesn't work because the macro call doesn't match the template as syntax-case expects # to be in the initial position since there's no v in the literal list (and it probably assigns # to v pattern variable).
In short: Is it possible to write a syntax transformer that can accomplish this without reader extensions or overriding application, perhaps through a metamacro that rewrites the literal token list of an inner syntax-case (à la Petrofsky extraction)?
NB: The vector example itself is unimportant and I'm not interested in alternative solutions to this exact use-case.

since there's no v in the literal list (and it probably assigns # to v pattern variable).
Not really. set! is a special case that the macro expander handles specifically to make it cooperate with make-variable-transformer. But for other literals, they will fail. E.g.,
(let-syntax ((# (make-variable-transformer
(λ (stx)
(syntax-case stx (v)
((v # i) #'1))))))
(v # 1))
fails with v: unbound identifier.
The second issue with your above code is the side condition (vector? #'v). #'v is a syntax object, so (vector? #'v) will always result in #f. It's unclear what is the right behavior. For example, do you intend for:
(define v (vector 1 2 3))
(v # 1)
to work? If so, a compile-time side condition would be inappropriate, because it's not known if v is a vector at compile-time.
For your main question, the answer is no. It's not possible under the constraints that you imposed. The expansion steps are detailed here, and none of the steps looks beyond the head of the list.
But if we don't care about the constraints. I.e., overriding #%app is OK. It could work.
An issue that you need to think about is, suppose you have (a b c) where b is your kind of macro and a is a regular macro. Who should get the control first? If a should get the control first, you can override #%app to implement this kind of macro. Here's my quick implementation.
#lang racket
(require syntax/parse/define
(only-in racket [#%app racket:#%app])
(for-syntax syntax/apply-transformer))
(begin-for-syntax
(struct my-transformer (t)))
(define-syntax-parser #%app
[(_ x ...)
(define transformer
(for/first ([operand (attribute x)]
#:when (and (identifier? operand)
(my-transformer?
(syntax-local-value operand (λ () #f)))))
(syntax-local-value operand)))
(cond
[transformer (local-apply-transformer
(my-transformer-t transformer)
#'(x ...)
'expression)]
[else #'(racket:#%app x ...)])])
(define-syntax #
(my-transformer
(syntax-parser
[(v _ i) #'(vector-ref v i)])))
(define v (vector 42 1337 1729))
(v # 1) ;=> 1337
Finally, you can always override #%module-begin and simulate the macro expander. It's an overkill solution, but could be appropriate if you want more advanced features, like allowing users to customize precedence so that b is expanded before a.

Related

What is the difference between define vs define-syntax and let vs let-syntax when value is syntax-rules?

I'm in a process of implementing Hygienic macros in my Scheme implementation, I've just implemented syntax-rules, but I have this code:
(define odd?
(syntax-rules ()
((_ x) (not (even? x)))))
what should be the difference between that and this:
(define-syntax odd?
(syntax-rules ()
((_ x) (not (even? x)))))
from what I understand syntax-rules just return syntax transformer, why you can't just use define to assign that to symbol? Why I need to use define-syntax? What extra stuff that expression do?
Should first also work in scheme? Or only the second one?
Also what is the difference between let vs let-syntax and letrec vs letrec-syntax. Should (define|let|letrec)-syntax just typecheck if the value is syntax transformer?
EDIT:
I have this implementation, still using lisp macros:
;; -----------------------------------------------------------------------------
(define-macro (let-syntax vars . body)
`(let ,vars
,#(map (lambda (rule)
`(typecheck "let-syntax" ,(car rule) "syntax"))
vars)
,#body))
;; -----------------------------------------------------------------------------
(define-macro (letrec-syntax vars . body)
`(letrec ,vars
,#(map (lambda (rule)
`(typecheck "letrec-syntax" ,(car rule) "syntax"))
vars)
,#body))
;; -----------------------------------------------------------------------------
(define-macro (define-syntax name expr)
(let ((expr-name (gensym)))
`(define ,name
(let ((,expr-name ,expr))
(typecheck "define-syntax" ,expr-name "syntax")
,expr-name))))
This this code correct?
Should this code works?
(let ((let (lambda (x) x)))
(let-syntax ((odd? (syntax-rules ()
((_ x) (not (even? x))))))
(odd? 11)))
This question seems to imply some deep confusion about macros.
Let's imagine a language where syntax-rules returns some syntax transformer function (I am not sure this has to be true in RnRS Scheme, it is true in Racket I think), and where let and let-syntax were the same.
So let's write this function:
(define (f v)
(let ([g v])
(g e (i 10)
(if (= i 0)
i
(e (- i 1))))))
Which we can turn into this, of course:
(define (f v n)
(v e (i n)
(if (<= i 0)
i
(e (- i 1)))))
And I will tell you in addition that there is no binding for e or i in the environment.
What is the interpreter meant to do with this definition? Could it compile it? Could it safely infer that i can't possibly make any sense since it is used as a function and then as a number? Can it safely do anything at all?
The answer is that no, it can't. Until it knows what the argument to the function is it can't do anything. And this means that each time f is called it has to make that decision again. In particular, v might be:
(syntax-rules ()
[(_ name (var init) form ...)
(letrec ([name (λ (var)
form ...)])
(name init))]))
Under which the definition of f does make some kind of sense.
And things get worse: much worse. How about this?
(define (f v1 v2 n)
(let ([v v1])
(v e (i n)
...
(set! v (if (eq? v v1) v2 v1))
...)))
What this means is that a system like this wouldn't know what the code it was meant to interpret meant until, the moment it was interpreting it, or even after that point, as you can see from the second function above.
So instead of this horror, Lisps do something sane: they divide the process of evaluating bits of code into phases where each phase happens, conceptually, before the next one.
Here's a sequence for some imagined Lisp (this is kind of close to what CL does, since most of my knowledge is of that, but it is not intended to represent any particular system):
there's a phase where the code is turned from some sequence of characters to some object, possibly with the assistance of user-defined code;
there's a phase where that object is rewritten into some other object by user- and system-defined code (macros) – the result of this phase is something which is expressed in terms of functions and some small number of primitive special things, traditionally called 'special forms' which are known to the processes of stage 3 and 4;
there may be a phase where the object from phase 2 is compiled, and that phase may involve another set of user-defined macros (compiler macros);
there is a phase where the resulting code is evaluated.
And for each unit of code these phases happen in order, each phase completes before the next one begins.
This means that each phase in which the user can intervene needs its own set of defining and binding forms: it needs to be possible to say that 'this thing controls what happens at phase 2' for instance.
That's what define-syntax, let-syntax &c do: they say that 'these bindings and definitions control what happens at phase 2'. You can't, for instance, use define or let to do that, because at phase 2, these operations don't yet have meaning: they gain meaning (possibly by themselves being macros which expand to some primitive thing) only at phase 3. At phase 2 they are just bits of syntax which the macro is ingesting and spitting out.

Does any Lisp allow mutually recursive macros?

In Common Lisp, a macro definition must have been seen before the first use. This allows a macro to refer to itself, but does not allow two macros to refer to each other. The restriction is slightly awkward, but understandable; it makes the macro system quite a bit easier to implement, and to understand how the implementation works.
Is there any Lisp family language in which two macros can refer to each other?
What is a macro?
A macro is just a function which is called on code rather than data.
E.g., when you write
(defmacro report (x)
(let ((var (gensym "REPORT-")))
`(let ((,var ,x))
(format t "~&~S=<~S>~%" ',x ,var)
,var)))
you are actually defining a function which looks something like
(defun macro-report (system::<macro-form> system::<env-arg>)
(declare (cons system::<macro-form>))
(declare (ignore system::<env-arg>))
(if (not (system::list-length-in-bounds-p system::<macro-form> 2 2 nil))
(system::macro-call-error system::<macro-form>)
(let* ((x (cadr system::<macro-form>)))
(block report
(let ((var (gensym "REPORT-")))
`(let ((,var ,x)) (format t "~&~s=<~s>~%" ',x ,var) ,var))))))
I.e., when you write, say,
(report (! 12))
lisp actually passes the form (! 12) as the 1st argument to macro-report which transforms it into:
(LET ((#:REPORT-2836 (! 12)))
(FORMAT T "~&~S=<~S>~%" '(! 12) #:REPORT-2836)
#:REPORT-2836)
and only then evaluates it to print (! 12)=<479001600> and return 479001600.
Recursion in macros
There is a difference whether a macro calls itself in implementation or in expansion.
E.g., a possible implementation of the macro and is:
(defmacro my-and (&rest args)
(cond ((null args) T)
((null (cdr args)) (car args))
(t
`(if ,(car args)
(my-and ,#(cdr args))
nil))))
Note that it may expand into itself:
(macroexpand '(my-and x y z))
==> (IF X (MY-AND Y Z) NIL) ; T
As you can see, the macroexpansion contains the macro being defined.
This is not a problem, e.g., (my-and 1 2 3) correctly evaluates to 3.
However, if we try to implement a macro using itself, e.g.,
(defmacro bad-macro (code)
(1+ (bad-macro code)))
you will get an error (a stack overflow or undefined function or ...) when you try to use it, depending on the implementation.
Here's why mutually recursive macros can't work in any useful way.
Consider what a system which wants to evaluate (or compile) Lisp code for a slightly simpler Lisp than CL (so I'm avoiding some of the subtleties that happen in CL), such as the definition of a function, needs to do. It has a very small number of things it knows how to do:
it knows how to call functions;
it knows how to evaluate a few sorts of literal objects;
it has some special rules for a few sorts of forms – what CL calls 'special forms', which (again in CL-speak) are forms whose car is a special operator;
finally it knows how to look to see whether forms correspond to functions which it can call to transform the code it is trying to evaluate or compile – some of these functions are predefined but additional ones can be defined.
So the way the evaluator works is by walking over the thing it needs to evaluate looking for these source-code-transforming things, aka macros (the last case), calling their functions and then recursing on the results until it ends up with code which has none left. What's left should consist only of instances of the first three cases, which it then knows how to deal with.
So now think about what the evaluator has to do if it is evaluating the definition of the function corresponding to a macro, called a. In Cl-speak it is evaluating or compiling a's macro function (which you can get at via (macro-function 'a) in CL). Let's assume that at some point there is a form (b ...) in this code, and that b is known also to correspond to a macro.
So at some point it comes to (b ...), and it knows that in order to do this it needs to call b's macro function. It binds suitable arguments and now it needs to evaluate the definition of the body of that function ...
... and when it does this it comes across an expression like (a ...). What should it do? It needs to call a's macro function, but it can't, because it doesn't yet know what it is, because it's in the middle of working that out: it could start trying to work it out again, but this is just a loop: it's not going to get anywhere where it hasn't already been.
Well, there's a horrible trick you could do to avoid this. The infinite regress above happens because the evaluator is trying to expand all of the macros ahead of time, and so there's no base to the recursion. But let's assume that the definition of a's macro function has code which looks like this:
(if <something>
(b ...)
<something not involving b>)
Rather than doing the expand-all-the-macros-first trick, what you could do is to expand only the macros you need, just before you need their results. And if <something> turned out always to be false, then you never need to expand (b ...), so you never get into this vicious loop: the recursion bottoms out.
But this means you must always expand macros on demand: you can never do it ahead of time, and because macros expand to source code you can never compile. In other words a strategy like this is not compatible with compilation. It also means that if <something> ever turns out to be true then you'll end up in the infinite regress again.
Note that this is completely different to macros which expand to code which involves the same macro, or another macro which expands into code which uses it. Here's a definition of a macro called et which does that (it doesn't need to do this of course, this is just to see it happen):
(defmacro et (&rest forms)
(if (null forms)
't
`(et1 ,(first forms) ,(rest forms))))
(defmacro et1 (form more)
(let ((rn (make-symbol "R")))
`(let ((,rn ,form))
(if ,rn
,rn
(et ,#more)))))
Now (et a b c) expands to (et1 a (b c)) which expands to (let ((#:r a)) (if #:r #:r (et b c))) (where all the uninterned things are the same thing) and so on until you get
(let ((#:r a))
(if #:r
#:r
(let ((#:r b))
(if #:r
#:r
(let ((#:r c))
(if #:r
#:r
t))))))
Where now not all the uninterned symbols are the same
And with a plausible macro for let (let is in fact a special operator in CL) this can get turned even further into
((lambda (#:r)
(if #:r
#:r
((lambda (#:r)
(if #:r
#:r
((lambda (#:r)
(if #:r
#:r
t))
c)))
b)))
a)
And this is an example of 'things the system knows how to deal with': all that's left here is variables, lambda, a primitive conditional and function calls.
One of the nice things about CL is that, although there is a lot of useful sugar, you can still poke around in the guts of things if you like. And in particular, you still see that macros are just functions that transform source code. The following does exactly what the defmacro versions do (not quite: defmacro does the necessary cleverness to make sure the macros are available early enough: I'd need to use eval-when to do that with the below):
(setf (macro-function 'et)
(lambda (expression environment)
(declare (ignore environment))
(let ((forms (rest expression)))
(if (null forms)
't
`(et1 ,(first forms) ,(rest forms))))))
(setf (macro-function 'et1)
(lambda (expression environment)
(declare (ignore environment))
(destructuring-bind (_ form more) expression
(declare (ignore _))
(let ((rn (make-symbol "R")))
`(let ((,rn ,form))
(if ,rn
,rn
(et ,#more)))))))
There have been historic Lisp systems that allow this, at least in interpreted code.
We can allow a macro to use itself for its own definition, or two or more macros to mutually use each other, if we follow an extremely late expansion strategy.
That is to say, our macro system expands a macro call just before it is evaluated (and does that each time that same expression is evaluated).
(Such a macro expansion strategy is good for interactive development with macros. If you fix a buggy macro, then all code depending on it automatically benefits from the change, without having to be re-processed in any way.)
Under such a macro system, suppose we have a conditional like this:
(if (condition)
(macro1 ...)
(macro2 ...))
When (condition) is evaluated, then if it yields true, (macro1 ...) is evaluated, otherwise (macro2 ...). But evaluation also means expansion. Thus only one of these two macros is expanded.
This is the key to why mutual references among macros can work: we are able rely on the conditional logic to give us not only conditional evaluation, but conditional expansion also, which then allows the recursion to have ways of terminating.
For example, suppose macro A's body of code is defined with the help of macro B, and vice versa. And when a particular invocation of A is executed, it happens to hit the particular case that requires B, and so that B call is expanded by invocation of macro B. B also hits the code case that depends on A, and so it recurses into A to obtain the needed expansion. But, this time, A is called in a way that avoids requiring, again, an expansion of B; it avoids evaluating any sub-expression containing the B macro. Thus, it calculates the expansion, and returns it to B, which then calculates its expansion returns to the outermost A. A finally expands and the recursion terminates; all is well.
What blocks macros from using each other is the unconditional expansion strategy: the strategy of fully expanding entire top-level forms after they are read, so that the definitions of functions and macros contain only expanded code. In that situation there is no possibility of conditional expansion that would allow for the recursion to terminate.
Note, by the way, that a macro system which expands late doesn't recursively expand macros in a macro expansion. Suppose (mac1 x y) expands into (if x (mac2 y) (mac3 y)). Well, that's all the expansion that is done for now: the if that pops out is not a macro, so expansion stops, and evaluation proceeds. If x yields true, then mac2 is expanded, and mac3 is not.

Generating logging information using Racket

Background:
I intend to generate debug messages for the code I'm developing. I wrote a macro to avoid writing the logging calls in each function. I know this limits the ability to generate more customized debug messages, but in return it isolates logging from code. And that is what I'm aiming for. This macro approach has other drawbacks too, for example it limits creation of function bindings to this macro only, but I think I can live with that.
Following is the definition of the macro and an example demonstrating its usage.
(define-syntax (define-func stx)
(syntax-case stx ()
[(define-func (func-name args ...) body1 body2 ...)
(if (and (identifier? #'func-name)
(andmap symbol? (syntax->datum #'(args ...))))
(syntax (define (func-name args ...)
(log-debug (format "Function-name ~a:" (syntax-e #'func-name)) (list args ...))
body1
body2 ...))
(raise-syntax-error 'define-func "not an identifier" stx))]
[else (raise-syntax-error 'define-func "bad syntax" stx)]))
(define-func (last l)
(cond [(null? l) null]
[(null? (rest l)) (first l)]
[else (last (rest l))]))
(define-func (main)
(last (list 1 2 3 4 5 6 7 8 9))
(logger))
log-debug and logger are defined in separate module
The output produced is somewhat like following:
Function-name last:
args:
:-> (7 8 9)
Function-name last:
args:
:-> (8 9)
Function-name last:
args:
:-> (9)
Now I want to make it more readable. By readability I mean to provide some kind of indentation so that the person reading the log could make sense of call flow. For example something like following:
Function-name last:
args:
:-> (7 8 9)
Function-name last:
args:
:-> (8 9)
Function-name last:
args:
:-> (9)
It is easier to figure out who called whom and so forth. I have an idea that can do this. It involves a variable that keeps track of indentation then after logging the function name I will increase the indent and after evaluation of body and before returning the value decrements the value. Something like following:
(define indent 0)
(define-syntax (define-func stx)
(syntax-case stx ()
[ (... ...)
(...
(log-debug ...)
(increment indent)
(let [(retval (body1 body2 ...)]
(decrease indent)
retval))]))
increment and decrease increases and decreases indentation respectively.
Problem:
It works even for function that returns void. I'm not sure whether its the correct behavior. In racket void is a special value, but I'm not sure that creating a binding to void is right way to go.
Is there any better way to achieve the same? If not are there any problems in this design? I'm open to any idea/change as long as they keep the logging and code separate.
thanks for the help!
I have several suggestions for you:
It's probably better to use a parameter instead of a variable, for "global" stuff like your indentation level, since the original value is restored for you at the end of the parameterize expression.
All those raise-syntax-error checks you have in your macro are totally superfluous: syntax-case already provides guards (also known as fenders) that allow you to do any validation of macro "arguments" necessary:
(define-syntax (define-func stx)
(syntax-case stx ()
[(_ (func-name args ...) body1 body2 ...)
(andmap identifier? (syntax->list #'(func-name args ...)))
#'(define (func-name args ...)
(log-debug (format "Function-name ~a:" 'func-name)
(list args ...))
body1
body2 ...)]))
I've also fixed up your code in several places, as you can see above:
I used (_ ...) instead of (define-func ...), since in syntax-case (unlike syntax-rules), the latter will actually bind a pattern variable called define-func, which will affect any recursive macro calls you may want to do (I'll grant that you don't have one here, but it's a good habit to have anyway).
Rather than completely flatten the #'(args ...) in the guard, I just turned it into a list of syntax objects so you can test using identifier?. This is more intention-revealing than testing using symbol?, and allows us to also test func-name in the same expression.
You don't need to use (syntax-e #'func-name) inside the expanded code! Just quote it.

SICP: Can or be defined in lisp as a syntactic transformation without gensym?

I am trying to solve the last part of question 4.4 of the Structure and Interpretation of computer programming; the task is to implement or as a syntactic transformation. Only elementary syntactic forms are defined; quote, if, begin, cond, define, apply and lambda.
(or a b ... c) is equal to the first true value or false if no value is true.
The way I want to approach it is to transform for example (or a b c) into
(if a a (if b b (if c c false)))
the problem with this is that a, b, and c would be evaluated twice, which could give incorrect results if any of them had side-effects. So I want something like a let
(let ((syma a))
(if syma syma (let ((symb b))
(if symb symb (let ((symc c))
(if (symc symc false)) )) )) )
and this in turn could be implemented via lambda as in Exercise 4.6. The problem now is determining symbols syma, symb and symc; if for example the expression b contains a reference to the variable syma, then the let will destroy the binding. Thus we must have that syma is a symbol not in b or c.
Now we hit a snag; the only way I can see out of this hole is to have symbols that cannot have been in any expression passed to eval. (This includes symbols that might have been passed in by other syntactic transformations).
However because I don't have direct access to the environment at the expression I'm not sure if there is any reasonable way of producing such symbols; I think Common Lisp has the function gensym for this purpose (which would mean sticking state in the metacircular interpreter, endangering any concurrent use).
Am I missing something? Is there a way to implement or without using gensym? I know that Scheme has it's own hygenic macro system, but I haven't grokked how it works and I'm not sure whether it's got a gensym underneath.
I think what you might want to do here is to transform to a syntactic expansion where the evaluation of the various forms aren't nested. You could do this, e.g., by wrapping each form as a lambda function and then the approach that you're using is fine. E.g., you can do turn something like
(or a b c)
into
(let ((l1 (lambda () a))
(l2 (lambda () b))
(l3 (lambda () c)))
(let ((v1 (l1)))
(if v1 v1
(let ((v2 (l2)))
(if v2 v2
(let ((v3 (l3)))
(if v3 v3
false)))))))
(Actually, the evaluation of the lambda function calls are still nested in the ifs and lets, but the definition of the lambda functions are in a location such that calling them in the nested ifs and lets doesn't cause any difficulty with captured bindings.) This doesn't address the issue of how you get the variables l1–l3 and v1–v3, but that doesn't matter so much, none of them are in scope for the bodies of the lambda functions, so you don't need to worry about whether they appear in the body or not. In fact, you can use the same variable for all the results:
(let ((l1 (lambda () a))
(l2 (lambda () b))
(l3 (lambda () c)))
(let ((v (l1)))
(if v v
(let ((v (l2)))
(if v v
(let ((v (l3)))
(if v v
false)))))))
At this point, you're really just doing loop unrolling of a more general form like:
(define (functional-or . functions)
(if (null? functions)
false
(let ((v ((first functions))))
(if v v
(functional-or (rest functions))))))
and the expansion of (or a b c) is simply
(functional-or (lambda () a) (lambda () b) (lambda () c))
This approach is also used in an answer to Why (apply and '(1 2 3)) doesn't work while (and 1 2 3) works in R5RS?. And none of this required any GENSYMing!
In SICP you are given two ways of implementing or. One that handles them as special forms which is trivial and one as derived expressions. I'm unsure if they actually thought you would see this as a problem, but you can do it by implementing gensym or altering variable? and how you make derived variables like this:
;; a unique tag to identify special variables
(define id (vector 'id))
;; a way to make such variable
(define (make-var x)
(list id x))
;; redefine variable? to handle macro-variables
(define (variable? exp)
(or (symbol? exp)
(tagged-list? exp id)))
;; makes combinations so that you don't evaluate
;; every part twice in case of side effects (set!)
(define (or->combination terms)
(if (null? terms)
'false
(let ((tmp (make-var 'tmp)))
(list (make-lambda (list tmp)
(list (make-if tmp
tmp
(or->combination (cdr terms)))))
(car terms)))))
;; My original version
;; This might not be good since it uses backquotes not introduced
;; until chapter 5 and uses features from exercise 4.6
;; Though, might be easier to read for some so I'll leave it.
(define (or->combination terms)
(if (null? terms)
'false
(let ((tmp (make-var 'tmp)))
`(let ((,tmp ,(car terms)))
(if ,tmp
,tmp
,(or->combination (cdr terms)))))))
How it works is that make-var creates a new list every time it is called, even with the same argument. Since it has id as it's first element variable? will identify it as a variable. Since it's a list it will only match in variable lookup with eq? if it is the same list, so several nested or->combination tmp-vars will all be seen as different by lookup-variable-value since (eq? (list) (list)) => #f and special variables being lists they will never shadow any symbol in code.
This is influenced by eiod, by Al Petrofsky, which implements syntax-rules in a similar manner. Unless you look at others implementations as spoilers you should give it a read.

Scheme macro triggered by keyword which is not the head of a list

Suppose I want to trigger a Scheme macro on something other than the first item in an s-expression. For example, suppose that I wanted to replace define with an infix-style :=, so that:
(a := 5) -> (define a 5)
((square x) := (* x x)) -> (define (square x) (* x x))
The actual transformation seems to be quite straightforward. The trick will be getting Scheme to find the := expressions and macro-expand them. I've thought about surrounding large sections of code that use the infix syntax with a standard macro, maybe: (with-infix-define expr1 expr2 ...), and having the standard macro walk through the expressions in its body and perform any necessary transformations. I know that if I take this approach, I'll have to be careful to avoid transforming lists that are actually supposed to be data, such as quoted lists, and certain sections of quasiquoted lists. An example of what I envision:
(with-infix-define
((make-adder n) := (lambda (m) (+ n m)))
((foo) :=
(add-3 := (make-adder 3))
(add-6 := (make-adder 6))
(let ((a 5) (b 6))
(+ (add-3 a) (add-6 b))))
(display (foo))
(display '(This := should not be transformed))
So, my question is two-fold:
If I take the with-infix-define route, do I have to watch out for any stumbling blocks other than quote and quasiquote?
I feel a bit like I'm reinventing the wheel. This type of code walk seems like exactly what standard macro expanding systems would have to do - the only difference is that they only look at the first item in a list when deciding whether or not to do any code transformation. Is there any way I can just piggyback on existing systems?
Before you continue with this, it's best to think things over thoroughly -- IME you'd often find that what you really want a reader-level handling of := as an infix syntax. That will of course mean that it's also infix in quotations etc, so it would seem bad for now, but again, my experience is that you end up realizing that it's better to do things consistently.
For completeness, I'll mention that in Racket there's a read-syntax hack for infix-like expressions: (x . define . 1) is read as (define x 1). (And as above, it works everywhere.)
Otherwise, your idea of a wrapping macro is pretty much the only thing you can do. This doesn't make it completely hopeless though, you might have a hook into your implementation's expander that can allow you to do such things -- for example, Racket has a special macro called #%module-begin that wraps a complete module body and #%top-interaction that wraps toplevel expressions on the REPL. (Both of these are added implicitly in the two contexts.) Here's an example (I'm using Racket's define-syntax-rule for simplicity):
#lang racket/base
(provide (except-out (all-from-out racket/base)
#%module-begin #%top-interaction)
(rename-out [my-module-begin #%module-begin]
[my-top-interaction #%top-interaction]))
(define-syntax infix-def
(syntax-rules (:= begin)
[(_ (begin E ...)) (begin (infix-def E) ...)]
[(_ (x := E ...)) (define x (infix-def E) ...)]
[(_ E) E]))
(define-syntax-rule (my-module-begin E ...)
(#%module-begin (infix-def E) ...))
(define-syntax-rule (my-top-interaction . E)
(#%top-interaction . (infix-def E)))
If I put this in a file called my-lang.rkt, I can now use it as follows:
#lang s-exp "my-lang.rkt"
(x := 10)
((fib n) :=
(done? := (<= n 1))
(if done? n (+ (fib (- n 1)) (fib (- n 2)))))
(fib x)
Yes, you need to deal with a bunch of things. Two examples in the above are handling begin expressions and handling function bodies. This is obviously a very partial list -- you'll also want bodies of lambda, let, etc. But this is still better than some blind massaging, since that's just not practical as you can't really tell in advance how some random piece of code will end up. As an easy example, consider this simple macro:
(define-syntax-rule (track E)
(begin (eprintf "Evaluating: ~s\n" 'E)
E))
(x := 1)
The upshot of this is that for a proper solution, you need some way to pre-expand the code, so that you can then scan it and deal with the few known core forms in your implmenetation.
Yes, all of this is repeating work that macro expanders do, but since you're changing how expansion works, there's no way around this. (To see why it's a fundamental change, consider something like (if := 1) -- is this a conditional expression or a definition? How do you decide which one takes precedence?) For this reason, for languages with such "cute syntax", a more popular approach is to read and parse the code into plain S-expressions, and then let the actual language implementation use plain functions and macros.
Redefining define is a little complicated. See #Eli's excellent explanation.
If on the other hand, you are content with := to use set! things are a little simpler.
Here is a small example:
#lang racket
(module assignment racket
(provide (rename-out [app #%app]))
(define-syntax (app stx)
(syntax-case stx (:=)
[(_ id := expr)
(identifier? #'id)
(syntax/loc stx (set! id expr))]
[(_ . more)
(syntax/loc stx (#%app . more))])))
(require 'assignment)
(define x 41)
(x := (+ x 1))
(displayln x)
To keep the example to a single file, I used submodules (available in the prerelease version of Racket).