Module meta-language in Racket

I'm trying to write a module meta-language mylang in Racket, which accepts a second language to which it passes the modified body, such that:
(module foo mylang typed/racket body)
is equivalent to:
(module foo typed/racket transformed-body)
where the typed/racket part can be replaced with any other module language, of course.
I attempted a simple version which leaves the body unchanged. It works fine on the command-line, but gives the following error when run in DrRacket:
/usr/share/racket/pkgs/typed-racket-lib/typed-racket/typecheck/tc-toplevel.rkt:479:30: require: namespace mismatch;
reference to a module that is not available
reference phase: 1
referenced module: "/usr/share/racket/pkgs/typed-racket-lib/typed-racket/env/env-req.rkt"
referenced phase level: 0 in: add-mod!
Here's the whole code:
#lang racket
(module mylang racket
(provide (rename-out [-#%module-begin #%module-begin]))
(require (for-syntax syntax/strip-context))
(define-syntax (-#%module-begin stx)
(syntax-case stx ()
[(_ lng . rest)
(let ([lng-sym (syntax-e #'lng)])
(namespace-require `(for-meta -1 ,lng-sym))
(with-syntax ([mb (namespace-symbol->identifier '#%module-begin)])
#`(mb . #,(replace-context #'mb #'rest))))])))
(module foo (submod ".." mylang) typed/racket/base
(ann (+ 1) Number))
(require 'foo)
Requirements (i.e. solutions I'd rather avoid):
Adding a (require (only-in typed/racket)) inside the mylang module makes this work, but I'm interested in a general solution, where mylang does not need to know about typed/racket at all (i.e. if somebody adds a new language foo, then mylang should work with it out of the box).
Also, I'm not interested in tricks which declare a submodule and immediately require and re-provide it, as is done here, because this changes the path to the actual module (so main and test lose their special behaviour, for example).
It is also slower at compile-time, as submodules get visited and/or instantiated more times (this can be seen by writing (begin-for-syntax (displayln 'here))), and this has a noticeable impact for large typed/racket programs.
Bonus points if the arrows in DrRacket work for built-ins provided by the delegated-to language, e.g. have arrows from ann, + and Number to typed/racket/base, in the example above.

One thing you can do, which I don't think violates your requirements, is put it in a module, fully expand that module, and then match on the #%plain-module-begin to insert a require.
#lang racket
(module mylang racket
(provide (rename-out [-#%module-begin #%module-begin]))
(define-syntax (-#%module-begin stx)
(syntax-case stx ()
[(_ lng . rest)
(with-syntax ([#%module-begin (datum->syntax #f '#%module-begin)])
;; put the code in a module form, and fully expand that module
(define mod-stx
(local-expand
#'(module ignored lng (#%module-begin . rest))
'top-level
(list)))
;; pattern-match on the #%plain-module-begin form to insert a require
(syntax-case mod-stx (module #%plain-module-begin)
[(module _ lng (#%plain-module-begin . mod-body))
#'(#%plain-module-begin
(#%require lng)
.
mod-body)]))])))
;; Yay the check syntax arrows work!
(module foo (submod ".." mylang) typed/racket/base
(ann (+ 1) Number))
(require 'foo)
And if you wanted to transform the body in some way, you could do that either before or after expansion.
The pattern-matching to insert the extra (#%require lng) is necessary because expanding the module body in a context where lng is available isn't enough. Taking the mod-body code back out of the module form means that the bindings will refer to lng, but lng won't be available at run-time. That's why I get the require: namespace mismatch; reference to a module that is not available error without it, and that's why it needs to be added after expansion.
Update from comments
However, as @GeorgesDupéron pointed out in a comment, this introduces another problem. If lng provides an identifier x and the module where it is used imports a different x, there will be an import conflict where there shouldn't be. Require lines should be in a "nested scope" with respect to the module language so that they can shadow identifiers like x here.
@GeorgesDupéron found a solution to this problem in this email on the racket users list, using (make-syntax-introducer) on the mod-body to produce the nested scope.
(module mylang racket
(provide (rename-out [-#%module-begin #%module-begin]))
(define-syntax (-#%module-begin stx)
(syntax-case stx ()
[(_ lng . rest)
(with-syntax ([#%module-begin (datum->syntax #f '#%module-begin)])
;; put the code in a module form, and fully expand that module
(define mod-stx
(local-expand
#'(module ignored lng (#%module-begin . rest))
'top-level
(list)))
;; pattern-match on the #%plain-module-begin form to insert a require
(syntax-case mod-stx (module #%plain-module-begin)
[(module _ lng (#%plain-module-begin . mod-body))
#`(#%plain-module-begin
(#%require lng)
.
#,((make-syntax-introducer) #'mod-body))]))])))
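To see in isolation what the introducer does to the scopes of mod-body, here is a minimal sketch. The macro name scope-demo is hypothetical and only serves as an illustration. It applies a fresh introducer to an identifier once and then twice, and compares each result with the original using bound-identifier=?.

```racket
#lang racket
;; scope-demo is a hypothetical helper, used only for illustration.
(define-syntax (scope-demo stx)
  (syntax-case stx ()
    [(_ id)
     (let* ([intro (make-syntax-introducer)] ; a procedure carrying one fresh scope
            [once (intro #'id)]              ; the fresh scope is added
            [twice (intro once)])            ; the default mode flips, so it is removed again
       ;; Build (quote (b1 b2)) so the two booleans are visible at run time.
       (datum->syntax stx
                      `(quote (,(bound-identifier=? #'id once)
                               ,(bound-identifier=? #'id twice)))))]))

(scope-demo x) ; => '(#f #t)
```

After one application the identifier no longer has the same set of scopes as the original, which is why the expanded body ends up in a scope nested inside the one where (#%require lng) binds its names.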

Related

Racket - How to define a function that can be used both in syntax transformers and ordinary code?

I am using syntax transformers to define macros in Racket. I want to create some helper functions to help me manipulate the syntax. However, the functions I defined outside the syntax transformer are not available inside the syntax transformer. For example, in the following code
(define (my-function x) (+ x 1))
(define-syntax my-macro
(lambda (stx)
(datum->syntax stx (my-function (cadr (syntax->datum stx))))))
I got the error "my-function: reference to an unbound identifier at phase: 1; the transformer environment".
After some searching, I am able to write the following code so that my-function is available inside the syntax transformer.
(begin-for-syntax
(define (my-function x) (+ x 1)))
(provide (for-syntax my-function))
(define-syntax my-macro
(lambda (stx)
(datum->syntax stx (my-function (cadr (syntax->datum stx))))))
But the problem is, my-function is not available outside the syntax transformer this time. Sometimes I want to check those helper functions in ordinary code, so I need to be able to call it from both inside and outside the syntax transformer, just like the function cadr. How can I achieve that?
I know my question has something to do with Racket's syntax model, in particular the concept of "phase level", but I have never really understood it. If you could point me to some easy-to-follow tutorials explaining it, I would be even more grateful.
A common way is to define your function that you want to share across phases in another (sub)module. Then, require it twice.
#lang racket
(module common racket
(provide my-function)
(define (my-function x) (+ x 1)))
(require 'common
(for-syntax 'common))
(define-syntax my-macro
(lambda (stx)
(datum->syntax stx (my-function (cadr (syntax->datum stx))))))
(my-function 1)
(my-macro 123)
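One consequence of requiring the module twice is that each phase gets its own instance of it, so state is not shared between compile time and run time. The following is a minimal sketch of that behaviour. The counter module and the names bump! and bump-at-compile-time are hypothetical and only serve as an illustration.

```racket
#lang racket
;; A hypothetical stateful module, required at phase 0 and at phase 1.
(module counter racket
  (provide bump!)
  (define n 0)
  (define (bump!)
    (set! n (add1 n))
    n))

(require 'counter
         (for-syntax 'counter))

;; Calls bump! during expansion and splices the result in as a literal.
(define-syntax (bump-at-compile-time stx)
  (datum->syntax stx (bump!)))

(define a (bump-at-compile-time)) ; phase-1 instance: 1
(define b (bump-at-compile-time)) ; phase-1 instance: 2
(define c (bump!))                ; phase-0 instance starts from 0, so 1

(list a b c) ; => '(1 2 1)
```

For pure helpers like my-function this makes no difference, but it matters as soon as the shared module keeps mutable state.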

How to 'require' a Racket module that doesn't have a #lang header line?

As just one of many possible examples, break-example.rkt would be a perfectly valid Java program, except for the #lang mini-java header that Racket requires.
So e.g. if I've written a Java interpreter/compiler in Racket as a Racket module language, how can I say, "require this file Main.java which is written in module language mini-java but doesn't have any Racket-specific header"?
(Note that I have almost zero practical experience with Racket. I'm evaluating it for a specific use case I have for Racket + DrRacket, which has nothing to do with Java, by the way. I searched the documentation but couldn't find any way to achieve this.)
I can’t run or test this right now, but maybe you can start from here and experiment with it. The main thing it uses is include/reader:
#lang racket
(require racket/include
syntax/parse/define
(for-syntax racket/syntax
racket/port
syntax/modread))
(define-simple-macro (require/mini-java path)
#:with modname (generate-temporary #'path)
(begin
(include/reader path (mini-java-reader 'modname))
(require 'modname)))
(begin-for-syntax
;; Symbol -> [Any InputPort -> Syntax]
(define ((mini-java-reader modname) src input)
(cond
[(port-closed? input) eof]
[else
(define stx
(with-module-reading-parameterization
(lambda ()
(read-syntax src
(input-port-append #t
(open-input-string "#lang mini-java\n")
input)))))
(close-input-port input)
(syntax-parse stx
[(module _ l . b)
#`(module #,modname l . b)])])))

Racket macro that generates a nested module error

While experimenting with racket's macros, I stumbled into a definition that wasn't at first obvious to me why it was rejected. The code is short and otherwise is probably useless, but is as follows:
#lang racket
(define-syntax (go stx)
(syntax-case stx ()
[(_ id)
#'(module mod racket
(define it id))]
))
(go 'dummy)
The complaint is quote: unbound identifier; also, no #%app syntax transformer...
If I manually inline (define it id) to (define it 'dummy) then it works.
I had a hunch that the ' (i.e. quote) in (go 'dummy), which is bound by #lang racket, is not recognized as the same binding within the submodule mod, even though syntactically it is the same sequence of letters. If I strip 'dummy of all lexical context by round-tripping as follows:
(with-syntax ([ok (datum->syntax #f (syntax->datum #'id))])
below the pattern match (_ id) and replace definition of it with (define it ok) then all is good again.
#lang racket
(define-syntax (go stx)
(syntax-case stx ()
[(_ id)
(with-syntax ([ok (datum->syntax #f (syntax->datum #'id))])
#'(module mod racket
(define it ok)))]
))
(go 'dummy)
I presume that my dilemma was caused by the hygiene system. However, is there a more direct solution to convince the Racket compiler that these identifiers (i.e. quote) are really the same without this boilerplate?
The expression that you insert for id in:
(module mod racket
(define it id))
is going to be evaluated in the context of the module.
Therefore the syntactic context of id needs to be the same as
the context of the submodule.
You describe one way of removing existing context. Here is another:
#lang racket
(require (for-syntax racket/base))
(define-syntax (go stx)
(syntax-case stx ()
[(_ id)
(with-syntax ([id (syntax->datum #'id)])
#'(module mod racket
(provide it)
(define it id)))]))
(go 42)
(require (submod "." mod))
it
In most macros it is a good thing that context is preserved, so
having to write some boilerplate to remove it seems okay to me.
Of course, if you end up with too much boilerplate, then write
a macro that inserts the boilerplate for you :-)
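A third variant, sketched here as one possible way to shorten that boilerplate, uses strip-context from syntax/strip-context. Unlike the syntax->datum round trip, it keeps source locations and syntax properties. This is only a sketch under the same assumptions as the examples above, not the only way to do it.

```racket
#lang racket
(require (for-syntax syntax/strip-context))

(define-syntax (go stx)
  (syntax-case stx ()
    [(_ expr)
     ;; Remove the lexical context of the user-supplied expression so that
     ;; it is treated like the rest of the generated submodule body.
     (with-syntax ([clean (strip-context #'expr)])
       #'(module mod racket
           (provide it)
           (define it clean)))]))

(go 'dummy)
(require 'mod)
it ; => 'dummy
```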

See the results of phase 1 computation in phase 0

Suppose I have some module with a non-trivial define "override" in Racket. That "override" collects information about the procedure body and stores it in a map (during the compilation phase). Now I need to use the collected information during the runtime phase. The straightforward approach doesn't seem to work:
#lang racket
(require (for-syntax racket))
(define-for-syntax map-that-should-be-used-in-phase-0 (make-hash))
(define-for-syntax (fill-in-useful-information n) (hash-set! map-that-should-be-used-in-phase-0 n n))
; Suppose that some useful information is collected here and stored into a map
(define-syntax (fill-in-map stx)
(begin
(fill-in-useful-information 1)
(fill-in-useful-information 2)
(syntax/loc stx (displayln "OK, the map is filled, but I cannot see it here"))))
(define-syntax (print-that-map stx)
(syntax/loc stx (displayln map-that-should-be-used-in-phase-0))) ; <-- This can not be compiled
(fill-in-map)
(print-that-map)
Can I do it in Racket? If yes, then how? Any hints will be greatly appreciated!
An identifier referencing a variable cannot be compiled, but the value it refers to can, as long as it's one of the built-in data structures provided by Racket, and as long as it's immutable.
You can stick a hash table value into a syntax object using quasisyntax and unsyntax.
> (quasisyntax (foo #,(hash 'a 4 'b 16)))
#<syntax:5:15 (foo #hash((a . 4) (b . 16)))>
You can do the same thing to communicate one-way from compile-time to run-time.
(define-for-syntax (hash->immutable-hash hsh)
(make-immutable-hash (hash->list hsh)))
(define-syntax (print-that-map stx)
(quasisyntax/loc stx (displayln #,(hash->immutable-hash map-that-should-be-used-in-phase-0))))
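Putting the pieces together, a minimal self-contained sketch could look like the following. The names collected, record! and collected-table are hypothetical stand-ins for the question's map and macros. The sketch assumes that all recording happens in the same module that reads the table.

```racket
#lang racket
(require (for-syntax racket/base))

;; Phase-1 mutable table, filled while the module is being expanded.
(define-for-syntax collected (make-hash))

;; Hypothetical macro: records a key/value pair at compile time.
(define-syntax (record! stx)
  (syntax-case stx ()
    [(_ k v)
     (begin
       (hash-set! collected (syntax->datum #'k) (syntax->datum #'v))
       #'(void))]))

;; Hypothetical macro: expands to a quoted immutable snapshot of the table.
(define-syntax (collected-table stx)
  (quasisyntax/loc stx
    (quote #,(make-immutable-hash (hash->list collected)))))

(record! a 1)
(record! b 2)

;; The right-hand side is expanded after both record! uses above.
(define table (collected-table))
(hash-ref table 'a) ; => 1
(hash-ref table 'b) ; => 2
```

The snapshot is taken at the point where collected-table is expanded, so anything recorded by macros that expand later is not included.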

trying to understand require in language extension

I'm trying to define a new language in Racket, let's call it wibble. Wibble will allow modules to be loaded, so it has to translate its forms to Racket require forms. But I'm having trouble getting require to work when used in a language extension. I eventually tracked down my problems to the following strange behaviour.
Here's my reader which redefines read and read-syntax
=== wibble/lang/reader.rkt ===
#lang racket/base
(provide (rename-out (wibble-read read) (wibble-read-syntax read-syntax)))
(define (wibble-read in)
(wibble-read-syntax #f in))
(define (wibble-read-syntax src in)
#`(module #,(module-name src) wibble/lang
#,@(read-all src in)))
(define (module-name src)
(if (path? src)
(let-values (((base name dir?) (split-path src)))
(string->symbol (path->string (path-replace-suffix name #""))))
'anonymous-module))
(define (read-all src in)
(let loop ((all '()))
(let ((obj (read-syntax src in)))
(if (eof-object? obj)
(reverse all)
(loop (cons obj all))))))
and here's my much simplified language module, this introduces (require racket/base) into each wibble module
=== wibble/lang.rkt ===
#lang racket/base
(require (for-syntax racket/base))
(provide (rename-out (wibble-module-begin #%module-begin)) #%app #%datum #%top)
(define-syntax wibble-module-begin
(lambda (stx)
(syntax-case stx ()
((_ x ...) #`(#%module-begin (require #,(datum->syntax stx 'racket/base)) x ...)))))
With the above code then this wibble code 'works', i.e. there are no errors
#lang wibble
(cons 1 2)
(cons 3 4)
but the following
#lang wibble
(cons 1 2)
gives error message cons: unbound identifier in module in: cons
Really I'm just looking for an explanation as to what's going on. I'm sure the difference is related to this from the Racket docs (Racket Reference 3.1)
If a single form is provided, then it is partially expanded in a
module-begin context. If the expansion leads to #%plain-module-begin,
then the body of the #%plain-module-begin is the body of the module.
If partial expansion leads to any other primitive form, then the form
is wrapped with #%module-begin using the lexical context of the module
body; this identifier must be bound by the initial module-path import,
and its expansion must produce a #%plain-module-begin to supply the
module body. Finally, if multiple forms are provided, they are wrapped
with #%module-begin, as in the case where a single form does not
expand to #%plain-module-begin.
but even with that I don't understand why having a single form makes any difference. It seems to be something to do with the timing of partial expansion, but I'm not really sure. Nor do I understand why Racket treats a single form as a special case.
Incidentally I can fix the problem with a slight modification to my reader
(define (wibble-read-syntax src in)
#`(module #,(module-name src) wibble/lang
#,@(read-all src in) (void)))
Hard-coding a (void) form means I always have more than one form and everything works.
Sorry for the long post, I'm just looking for some understanding of how this stuff works.
Alright, I think that I've figured it out.
Your intuition is correct in that the problem lies within the timing of the partial expansion of the single-form module body. Inside of your reader.rkt file, you produce a (module ...) form. As the quoted excerpt from your question states, the forms ... portion of this is then treated specially, since there is only one. Let's take a look at an excerpt from the documentation on partial expansion:
As a special case, when expansion would otherwise add an #%app, #%datum, or #%top identifier to an expression, and when the binding turns out to be the primitive #%app, #%datum, or #%top form, then expansion stops without adding the identifier.
I am almost certain that the partial expansion which occurs at this point does something to the cons identifier. This is the one part that I remain unsure of... my gut tells me that what's happening is that the partial expansion is attempting to find the binding for the cons identifier (since it is the first part of the parentheses, the identifier could be bound to a macro which should be expanded, so that needs to be checked) but is unable to, so it throws a tantrum. Note that even if cons has no phase 1 (syntax-expansion time) binding, the macro expander still expects there to be a phase 0 (runtime) binding for the identifier (among other things, this helps the expander remain hygienic). Because all of this partial expansion happens to the body of your (module ...) form (which is done before your (#%module-begin ...) form where you inject the (#%require ...) form), cons has no binding during the expansion, so the expansion, I believe, fails.
Nevertheless, a naive fix for your problem is to rewrite wibble-read-syntax as follows:
(define (wibble-read-syntax src in)
(let* ((read-in (read-all src in))
(in-stx (and (pair? read-in) (car read-in))))
#`(module #,(module-name src) wibble/lang
(require #,(datum->syntax in-stx 'racket/base))
#,@read-in)))
You can then remove the (#%require ...) form from your (#%module-begin ...) macro.
That's not, in my opinion, the best way to fix the issue, however. As a matter of cleanliness, hard-coding in a require form like you've done in wibble/lang.rkt would make Eli Barzilay and co. cry. A much simpler way to do what you are trying to do is by updating your lang.rkt file to something like so:
=== wibble/lang.rkt ===
#lang racket/base
(require (for-syntax racket/base))
(provide (rename-out (wibble-module-begin #%module-begin))
(except-out (all-from-out racket/base) #%module-begin #%app #%datum #%top)
#%app #%datum #%top)
(define-syntax wibble-module-begin
(lambda (stx)
(syntax-case stx ()
((_ x ...) #`(#%module-begin x ...)))))
Writing in this convention removes the need for any hard-coded (require ...) forms and prevents subtle bugs like the one you've unearthed from occurring. If you are confused about why this works, remember that you've already provided the #%module-begin identifier using this file, which is subsequently bound in all #lang wibble files. In principle, there is no limit on what identifiers you can bind in this fashion. If you would like some further reading, here's a shameless self-advertisement for a blog post I wrote a little while back on the subject.
I hope I've helped.
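The re-export approach can be tried out in a single file, without installing a wibble collection. The sketch below is only an approximation of the setup in the question: it uses a submodule (here called wibble-lang, a hypothetical name) as the module language instead of wibble/lang.rkt plus a reader, and it simply excludes #%module-begin from the re-export.

```racket
#lang racket/base

;; Stand-in for wibble/lang.rkt: re-export racket/base, replacing #%module-begin.
(module wibble-lang racket/base
  (require (for-syntax racket/base))
  (provide (rename-out [wibble-module-begin #%module-begin])
           (except-out (all-from-out racket/base) #%module-begin))
  (define-syntax (wibble-module-begin stx)
    (syntax-case stx ()
      [(_ x ...) #'(#%module-begin x ...)])))

;; A module body with a single form, the case that failed in the question.
(module prog (submod ".." wibble-lang)
  (cons 1 2))

(require 'prog) ; prints '(1 . 2)
```

Because cons is bound by the module language itself, it is already available when the single body form is partially expanded, so no injected require is needed.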
The problem is with the require (though I'm not sure I 100% understand all the behavior).
(require X) imports bindings from X with the lexical context of #'X. #'X here has the context of stx, which is the entire #'(module-begin x ...), which is not the context you want. You want the context of one of the cons expressions, i.e., one of the #'xs.
Something like this should work:
(define-syntax wibble-module-begin
(lambda (stx)
(syntax-case stx ()
[(_) #'(#%module-begin)]
[(m x y ...)
#`(#%module-begin
(require #,(datum->syntax #'x 'racket/base))
x y ...)])))
Though, as @belph warned, there's probably a more idiomatic way to accomplish what you want.
As you intuited, the behavior of your original program likely has to do with module's different treatment of single and multiple sub-forms, but I think the "working" case might be an accident and could be a bug in the Racket compiler.