trying to understand require in language extension - racket

I'm trying to define a new language in racket, let's call it wibble. Wibble will allow modules to be loaded so it has to translate it's forms to Racket require forms. But I'm having trouble getting require to work when used in a language extension. I eventually tracked down my problems to the following strange behaviour.
Here's my reader which redefines read and read-syntax
=== wibble/lang/reader.rkt ===
#lang racket/base
(provide (rename-out (wibble-read read) (wibble-read-syntax read-syntax)))
(define (wibble-read in)
(wibble-read-syntax #f in))
(define (wibble-read-syntax src in)
#`(module #,(module-name src) wibble/lang
#,#(read-all src in)))
(define (module-name src)
(if (path? src)
(let-values (((base name dir?) (split-path src)))
(string->symbol (path->string (path-replace-suffix name #""))))
'anonymous-module))
(define (read-all src in)
(let loop ((all '()))
(let ((obj (read-syntax src in)))
(if (eof-object? obj)
(reverse all)
(loop (cons obj all))))))
and here's my much simplified language module, this introduces (require racket/base) into each wibble module
=== wibble/lang.rkt ===
#lang racket/base
(require (for-syntax racket/base))
(provide (rename-out (wibble-module-begin #%module-begin)) #%app #%datum #%top)
(define-syntax wibble-module-begin
(lambda (stx)
(syntax-case stx ()
((_ x ...) #`(#%module-begin (require #,(datum->syntax stx 'racket/base)) x ...)))))
With the above code then this wibble code 'works', i.e. there are no errors
#lang wibble
(cons 1 2)
(cons 3 4)
but the following
#lang wibble
(cons 1 2)
gives error message cons: unbound identifier in module in: cons
Really I'm just looking for an explanation as to what going on. I'm sure the difference is related to this from the racket docs (Racket Reference 3.1)
If a single form is provided, then it is partially expanded in a
module-begin context. If the expansion leads to #%plain-module-begin,
then the body of the #%plain-module-begin is the body of the module.
If partial expansion leads to any other primitive form, then the form
is wrapped with #%module-begin using the lexical context of the module
body; this identifier must be bound by the initial module-path import,
and its expansion must produce a #%plain-module-begin to supply the
module body. Finally, if multiple forms are provided, they are wrapped
with #%module-begin, as in the case where a single form does not
expand to #%plain-module-begin.
but even with that I don't understand why having a single form makes any difference, it's seems to be somthing to do with the timing of partial expansion but I'm not really sure. Nor do I understand why Racket treats a single form as a special case.
Incidentally I can fix the problem with a slight modification to my reader
(define (wibble-read-syntax src in)
#`(module #,(module-name src) wibble/lang
#,#(read-all src in) (void)))
Hard-coding a (void) form means I always have more than one form and eveything works.
Sorry for the long post, I'm just looking for some understanding of how this stuff works.

Alright, I think that I've figured it out.
Your intuition is correct in that the problem lies within the timing of the partial expansion of the single-form module body. Inside of your reader.rkt file, you produce a (module ...) form. As the quoted excerpt from your question states, the forms ... portion of this is then treated specially, since there is only one. Let's take a look at an excerpt from the documentation on partial expansion:
As a special case, when expansion would otherwise add an #%app, #%datum, or #%top identifier to an expression, and when the binding turns out to be the primitive #%app, #%datum, or #%top form, then expansion stops without adding the identifier.
I am almost certain that the partial expansion which occurs at this point does something to the cons identifier. This is the one part that I remain unsure of... my gut tells me that what's happening is that the partial expansion is attempting to find the binding for the cons identifier (since it is the first part of the parentheses, the identifier could be bound to a macro which should be expanded, so that needs to be checked) but is unable to, so it throws a tantrum. Note that even if cons has no phase 1 (syntax-expansion time) binding, the macro expander still expects there to be a phase 0 (runtime) binding for the identifier (among other things, this helps the expander remain hygienic). Because all of this partial expansion happens to the body of your (module ...) form (which is done before your (#%module-begin ...) form where you inject the (#%require ...) form), cons has no binding during the expansion, so the expansion, I believe, fails.
Nevertheless, a naive fix for your problem is to rewrite wibble-read-syntax as follows:
(define (wibble-read-syntax src in)
(let* ((read-in (read-all src in))
(in-stx (and (pair? read-in) (car read-in))))
#`(module #,(module-name src) wibble/lang
(require #,(datum->syntax in-stx 'racket/base))
#,#read-in))
You can then remove the (#%require ...) form from your (#%module-begin ...) macro.
That's not, in my opinion, the best way to fix the issue, however. As a matter of cleanliness, hard-coding in a require form like you've done in wibble/lang.rkt would make Eli Barzilay and co. cry. A much simpler way to do what you are trying to do is by updating your lang.rkt file to something like so:
=== wibble/lang.rkt ===
#lang racket/base
(require (for-syntax racket/base))
(provide (rename-out (wibble-module-begin #%module-begin))
(except-out (all-from-out racket/base) #%module-begin #%app #%datum #%top)
#%app #%datum #%top)
(define-syntax wibble-module-begin
(lambda (stx)
(syntax-case stx ()
((_ x ...) #`(#%module-begin x ...)))))
Writing in this convention removes the need for any hard-coded (require ...) forms and prevents subtle bugs like the one you've unearthed from occuring. If you are confused why this works, remember that you've already provided the #%module-begin identifier using this file, which is subsequently bound in all #lang wibble files. In principle, there is no limit on what identifiers you can bind in this fashion. If you would like some further reading, here's a shameless self-advertisement for a blog post I wrote a little while back on the subject.
I hope I've helped.

The problem is with the require (though I'm not sure I 100% understand all the behavior).
(require X) imports bindings from X with the lexical context of #'X. #'X here has the context of stx, which is the entire #'(module-begin x ...), which is not the context you want. You want the context of one of the cons expressions, i.e., one of the #'xs.
Something like this should work:
(define-syntax wibble-module-begin
(lambda (stx)
(syntax-case stx ()
[(_) #'(#%module-begin)]
[(m x y ...)
#`(#%module-begin
(require #,(datum->syntax #'x 'racket/base))
x y ...)])))
Though, as #belph warned, there's probably a more idiomatic way to accomplish what you want.
The behavior of your original program, and as you intuited, likely has to do with module's different treatment of single and multi sub-forms, but I think the "working" case might be an accident and could be a bug in the racket compiler.

Related

How do defines and syntax macros interact in Racket?

I'm new to Racket and there's something about syntax macros I don't understand. I have these two programs:
This one, which executes correctly:
#lang racket
(define-syntax-rule (create name) (define name 2))
(create x)
(displayln (+ x 3))
And this one, which complains that the identifier x is unbound:
#lang racket
(define-syntax-rule (create) (define x 2))
(create)
(displayln (+ x 3))
With a naive substitution approach (such as C/C++ macros) these two programs would behave identically, but evidently they do not. It seems that identifiers that appear in the invocation of a syntax macro are somehow "special" and defines that use them behave differently to defines that do not. Additionally, there is the struct syntax macro in the Racket standard library which defines several variables that are not explicitly named in its invocation, for example:
(struct employee (first-name last-name))
Will define employee? and employee-first-name, neither of which were directly named in the invocation.
What is going on here, and it can be worked around so that I could create a custom version of struct?
The problem with the naive substitution is unintentional capturing. Racket macros by default are hygienic, which means it avoid this problem. See https://en.wikipedia.org/wiki/Hygienic_macro for more details.
That being said, the macro system supports unhygienic macros too. struct is an example of an unhygienic macro. But you need to put a bit more effort to get unhygienic macros working.
For example, your second version of create could be written as follows:
#lang racket
(require syntax/parse/define)
(define-syntax-parse-rule (create)
#:with x (datum->syntax this-syntax 'x)
(define x 2))
(create)
(displayln (+ x 3))

Is it possible to write a function that would take any macro and turn it into a function so that it can be passed as an argument to another function?

AND and OR are macros and since macros aren't first class in scheme/racket they cannot be passed as arguments to other functions. A partial solution is to use and-map or or-map. Is it possible to write a function that would take arbitrary macro and turn it into a function so that it can be passed as an argument to another function? Are there any languages that have first class macros?
In general, no. Consider that let is (or could be) implemented as a macro on top of lambda:
(let ((x 1))
(foo x))
could be a macro that expands to
((lambda (x) (foo x)) 1)
Now, what would it look like to convert let to a function? Clearly it is nonsense. What would its inputs be? Its return value?
Many macros will be like this. In fact, any macro that could be routinely turned into a function without losing any functionality is a bad macro! Such a macro should have been a function to begin with.
I agree with #amalloy. If something is written as a macro, it probably does something that functions can't do (e.g., introduce bindings, change evaluation order). So automatically converting arbitrary macro into a function is a really bad idea even if it is possible.
Is it possible to write a function that would take arbitrary macro and turn it into a function so that it can be passed as an argument to another function?
No, but it is somewhat doable to write a macro that would take some macro and turn it into a function.
#lang racket
(require (for-syntax racket/list))
(define-syntax (->proc stx)
(syntax-case stx ()
[(_ mac #:arity arity)
(with-syntax ([(args ...) (generate-temporaries (range (syntax-e #'arity)))])
#'(λ (args ...) (mac args ...)))]))
((->proc and #:arity 2) 42 12)
(apply (->proc and #:arity 2) '(#f 12))
((->proc and #:arity 2) #f (error 'not-short-circuit))
You might also be interested in identifier macro, which allows us to use an identifier as a macro in some context and function in another context. This could be used to create a first class and/or which short-circuits when it's used as a macro, but could be passed as a function value in non-transformer position.
On the topic of first class macro, take a look at https://en.wikipedia.org/wiki/Fexpr. It's known to be a bad idea.
Not in the way you probably expect
To see why, here is a way of thinking about macros: A macro is a function which takes a bit of source code and turns it into another bit of source code: the expansion of the macro. In other words a macro is a function whose domain and range are source code.
Once the source code is fully expanded, then it's fed to either an evaluator or a compiler. Let's assume it's fed to a compiler because it makes the question easier to answer: a compiler itself is simply a function whose domain is source code and whose range is some sequence of instructions for a machine (which may or may not be a real machine) to execute. Those instructions might include things like 'call this function on these arguments'.
So, what you are asking is: can the 'this function' in 'call this function on these arguments' be some kind of macro? Well, yes, it could be, but whatever source code it is going to transform certainly can not be the source code of the program you are executing, because that is gone: all that's left is the sequence of instructions that was the return value of the compiler.
So you might say: OK, let's say we disallow compilers: can we do it now? Well, leaving aside that 'disallowing compilers' is kind of a serious limitation, this was, in fact, something that very old dialects of Lisp sort-of did, using a construct called a FEXPR, as mentioned in another answer. It's important to realise that FEXPRs existed because people had not yet invented macros. Pretty soon, people did invent macros, and although FEXPRs and macros coexisted for a while – mostly because people had written code which used FEXPRs which they wanted to keep running, and because writing macros was a serious pain before things like backquote existed – FEXPRs died out. And they died out because they were semantically horrible: even by the standards of 1960s Lisps they were semantically horrible.
Here's one small example of why FEXPRs are so horrible: Let's say I write this function in a language with FEXPRs:
(define (foo f g x)
(apply f (g x)))
Now: what happens when I call foo? In particular, what happens if f might be a FEXPR?. Well, the answer is that I can't compile foo at all: I have to wait until run-time and make some on-the-fly decision about what to do.
Of course this isn't what these old Lisps with FEXPRs probably did: they would just silently have assumed that f was a normal function (which they would have called an EXPR) and compiled accordingly (and yes, even very old Lisps had compilers). If you passed something which was a FEXPR you just lost: either the thing detected that, or more likely it fall over horribly or gave you some junk answer.
And this kind of horribleness is why macros were invented: macros provide a semantically sane approach to processing Lisp code which allows (eventually, this took a long time to actually happen) minor details like compilation being possible at all, code having reasonable semantics and compiled code having the same semantics as interpreted code. These are features people like in their languages, it turns out.
Incidentally, in both Racket and Common Lisp, macros are explicitly functions. In Racket they are functions which operate on special 'syntax' objects because that's how you get hygiene, but in Common Lisp, which is much less hygienic, they're just functions which operate on CL source code, where the source code is simply made up of lists, symbols &c.
Here's an example of this in Racket:
> (define foo (syntax-rules ()
[(_ x) x]))
> foo
#<procedure:foo>
OK, foo is now just an ordinary function. But it's a function whose domain & range are Racket source code: it expects a syntax object as an argument and returns another one:
> (foo 1)
; ?: bad syntax
; in: 1
; [,bt for context]
This is because 1 is not a syntax object.
> (foo #'(x 1))
#<syntax:readline-input:5:10 1>
> (syntax-e (foo #'(x 1)))
1
And in CL this is even easier to see: Here's a macro definition:
(defmacro foo (form) form)
And now I can get hold of the macro's function and call it on some CL source code:
> (macro-function 'foo)
#<Function foo 4060000B6C>
> (funcall (macro-function 'foo) '(x 1) nil)
1
In both Racket and CL, macros are, in fact, first-class (or, in the case of Racket: almost first-class, I think): they are functions which operate on source code, which itself is first-class: you can write Racket and CL programs which construct and manipulate source code in arbitrary ways: that's what macros are in these languages.
In the case of Racket I have said 'almost first-class', because I can't see a way, in Racket, to retrieve the function which sits behind a macro defined with define-syntax &c.
I've created something like this in Scheme, it's macro that return lambda that use eval to execute the macro:
(define-macro (macron m)
(let ((x (gensym)))
`(lambda (,x)
(eval `(,',m ,#,x)))))
Example usage:
;; normal eval
(define x (map (lambda (x)
(eval `(lambda ,#x)))
'(((x) (display x)) ((y) (+ y y)))))
;; using macron macro
(define x (map (macron lambda)
'(((x) (display x)) ((y) (+ y y)))))
and x in both cases is list of two functions.
another example:
(define-macro (+++ . args)
`(+ ,#args))
((macron +++) '(1 2 3))

Scheme can't find function inside macro while compile

I have sample code like this:
#!/usr/bin/guile -s
!#
(define (process body)
(list 'list (map (lambda (lst)
(list 'quote (car lst)))
body)))
(defmacro macro (body)
(list 'quote (process body)))
(display (macro ((foo bar) (bar baz))))
(newline)
it run but I've got error from compiler
ERROR: Unbound variable: process
functions inside macros should be allowed, why I got this error?
Functions inside macros are allowed in Guile and in most other Scheme dialects.
However, the crucial question is: Which functions are available for a macro to call out during the expansion process?
Think of it this way: When the compiler is processing your code, it is first focused on turning your source code into something that can be run at some point in the future. But the compiler might not necessarily be able to execute those same functions right now while it is compiling them, at the same time that your macro is running and expanding the source code in general.
Why wouldn't such a function be available? Well, one example would be: What if the function's body used the macro you are in the midst of defining? Then you would have a little chicken/egg problem. The function would need to run the macro to be compiled (since the macro use in the body needs to be expanded, at compile-time) ... But the macro would need the compiled function available in order to even run!
(Furthermore, there might be some functions that you only want to be available at compile-time, as a helper for your macros, but that you do not want to be available at run-time, so that it will not be included in your program executable when you deploy it, as that would waste space in the deployed binary.)
One of my favorite papers describing this problem, and the particular solution adopted by MzScheme (now known as Racket), is the "You Want It When" paper by Matthew Flatt.
So, this is a problem that any Scheme dialect with a procedural macro system has to deal with in some way, and Guile is no exception.
In Guile's case, one fix that is directly documented in the Guile manual is to use the eval-when special form, which allows you to specify at what phases a particular definition is meant to be available.
(The "You Want It When" paper referenced above describes some problems with eval-when, but since it is what the Guile manual documents, I'm going to stick with it for now. I do recommend that after you understand eval-when, that you then look into Racket's solution, and see if Guile offers anything similar.)
So in your case, since you want the process function to be available at compile-time (for use in the macro definition), you could write:
#!/usr/bin/guile -s
!#
(eval-when (expand)
(define (process body)
(list 'list (map (lambda (lst)
(list 'quote (car lst)))
body))))
(defmacro macro (body)
(list 'quote (process body)))
(display (macro ((foo bar) (bar baz))))
(newline)

How to call other macros from a Chicken Scheme macro?

I'm trying to move from Common Lisp to Chicken Scheme, and having plenty of problems.
My current problem is this: How can I write a macro (presumably using define-syntax?) that calls other macros?
For example, in Common Lisp I could do something like this:
(defmacro append-to (var value)
`(setf ,var (append ,var ,value)))
(defmacro something-else ()
(let ((values (list))
(append-to values '(1)))))
Whereas in Scheme, the equivalent code doesn't work:
(define-syntax append-to
(syntax-rules ()
((_ var value)
(set! var (append var value)))))
(define-syntax something-else
(syntax-rules ()
((_)
(let ((values (list)))
(append-to values '(1))))))
The append-to macro cannot be called from the something-else macro. I get an error saying the append-to "variable" is undefined.
According to all the information I've managed to glean from Google and other sources, macros are evaluated in a closed environment without access to other code. Essentially, nothing else exists - except built-in Scheme functions and macros - when the macro is evaluated. I have tried using er-macro-transformer, syntax-case (which is now deprecated in Chicken anyway) and even the procedural-macros module.
Surely the entire purpose of macros is that they are built upon other macros, to avoid repeating code. If macros must be written in isolation, they're pretty much useless, to my mind.
I have investigated other Scheme implementations, and had no more luck. Seems it simply cannot be done.
Can someone help me with this, please?
It looks like you're confusing expansion-time with run-time. The syntax-rules example you give will expand to the let+set, which means the append will happen at runtime.
syntax-rules simply rewrites input to given output, expanding macros until there's nothing more to expand. If you want to actually perform some computation at expansion time, the only way to do that is with a procedural macro (this is also what happens in your defmacro CL example).
In Scheme, evaluation levels are strictly separated (this makes separate compilation possible), so a procedure can use macros, but the macros themselves can't use the procedures (or macros) defined in the same piece of code. You can load procedures and macros from a module for use in procedural macros by using use-for-syntax. There's limited support for defining things to run at syntax expansion time by wrapping them in begin-for-syntax.
See for example this SO question or this discussion on the ikarus-users mailing list. Matthew Flatt's paper composable and compilable macros explains the theory behind this in more detail.
The "phase separation" thinking is relatively new in the Scheme world (note that the Flatt paper is from 2002), so you'll find quite a few people in the Scheme community who are still a bit confused about it. The reason it's "new" (even though Scheme has had macros for a long long time) is that procedural macros have only become part of the standard since R6RS (and reverted in R7RS because syntax-case is rather controversial), so the need to rigidly specify them hasn't been an issue until now. For more "traditional" Lispy implementations of Scheme, where compile-time and run-time are all mashed together, this was never an issue; you can just run code whenever.
To get back to your example, it works fine if you separate the phases correctly:
(begin-for-syntax
(define-syntax append-to
(ir-macro-transformer
(lambda (e i c)
(let ((var (cadr e))
(val (caddr e)))
`(set! ,var (append ,var ,val)))))) )
(define-syntax something-else
(ir-macro-transformer
(lambda (e i c)
(let ((vals (list 'print)))
(append-to vals '(1))
vals))))
(something-else) ; Expands to (print 1)
If you put the definition of append-to in a module of its own, and you use-for-syntax it, that should work as well. This will also allow you to use the same module both in the macros you define in a body of code as well as in the procedures, by simply requiring it both in a use and a use-for-syntax expression.

Evaluate the arguments of a macro form

What is the best practice for selectively passing evaluated arguments to a macro form?
To elaborate: The usefulness of macros lies in its ability to receives unevaluated parameter, unlike the default evaluation rule for function forms. However, there is a legitimate use cases for evaluating macro arguments.
Consider a contrived example:
(defparameter *func-body* '((print i) (+ i 1)))
Suppose it would be nice that *func-body* could serve as the body of a macro our-defun that is defined as:
(defmacro our-defun (fun args &body body)
`(defun ,fun ,args ,#body))
So after (our-defun foo (i) (+ 1 i)), we could say (foo 1) to get 2. However, if we use (our-defun foo (i) *func-body*), the result of (foo 1) will be ((PRINT I) (+ I 1)) (i.e., the value of *func-body*). It would be nice if we can force the evaluation of *func-body* as an argument to the macro our-defun.
Currently, I can think of a technique of using compile and funcall to do this, as in
(funcall (compile nil `(lambda () (our-defun foo (i) ,#*func-body*))))
after which (our-defun 1) will print out 1 and return 2, as intended. I can think of case of making this work with eval, but I would rather stay away from eval due to its peculiarity in scoping.
This leads to my question at the begining, is there a more straightforward or native way to do this?
P.S.,
A not-so-contrived example is in the function (UPDATE-HOOK), which uses two library macros (ADD-HOOK) and (REMOVE-HOOK) and needs to evaluate its parameters. The (funcall (compile nil `(lambda () ...))) technique above is used here.
(defun update-hook (hook hook-name &optional code)
(funcall (compile nil `(lambda () (remove-hook ,hook ',hook-name))))
(unless (null code)
(compile hook-name `(lambda () ,#code))
(funcall (compile nil `(lambda () (add-hook ,hook ',hook-name))))))
That's slightly confused. A macro does not receive unevaluated parameters.
A macro gets source code and creates source code from that. Remember also that source code in Lisp is actually provided as data. The macro creates code, which evaluates some forms and some not.
Macros need to work in a compiling system. Before runtime. During compile time. All the macro sees is source code and then it creates source code from that. Think of macros as code transformations, not about evaluating arguments or not.
It would be nice if we can force the evaluation of *func-body* as an argument to the macro our-defun
That is not very clean. In a compiled system, you would need to make sure that *func-body* actually has a useful binding and that it can be resolved at COMPILE TIME.
If you have a macro like DEFUN, it makes sense to have the source code static. If you want to insert some source code into a form, then it could make sense to do that at read time:
(defun foo (i) #.`(,#*foo*))
But that's code I usually would want to avoid.
two library macros (ADD-HOOK) and (REMOVE-HOOK) and needs to evaluate its parameters.
Why should ADD-HOOK and REMOVE-HOOK be macros? If you don't have a real reason, they simply should be functions. Already since they make reuse difficult.
If you want to make ADD-HOOK and REMOVE-HOOK macros for some reason, then UPDATE-HOOK usually should be a macro, too.
The list you are giving to your macro has the form
(Quote (...))
So the list you actually want is the CADR of the list you get.