How do defines and syntax macros interact in Racket? - macros

I'm new to Racket and there's something about syntax macros I don't understand. I have these two programs:
This one, which executes correctly:
#lang racket
(define-syntax-rule (create name) (define name 2))
(create x)
(displayln (+ x 3))
And this one, which complains that the identifier x is unbound:
#lang racket
(define-syntax-rule (create) (define x 2))
(create)
(displayln (+ x 3))
With a naive substitution approach (such as C/C++ macros) these two programs would behave identically, but evidently they do not. It seems that identifiers that appear in the invocation of a syntax macro are somehow "special" and defines that use them behave differently to defines that do not. Additionally, there is the struct syntax macro in the Racket standard library which defines several variables that are not explicitly named in its invocation, for example:
(struct employee (first-name last-name))
Will define employee? and employee-first-name, neither of which were directly named in the invocation.
What is going on here, and it can be worked around so that I could create a custom version of struct?

The problem with the naive substitution is unintentional capturing. Racket macros by default are hygienic, which means it avoid this problem. See https://en.wikipedia.org/wiki/Hygienic_macro for more details.
That being said, the macro system supports unhygienic macros too. struct is an example of an unhygienic macro. But you need to put a bit more effort to get unhygienic macros working.
For example, your second version of create could be written as follows:
#lang racket
(require syntax/parse/define)
(define-syntax-parse-rule (create)
#:with x (datum->syntax this-syntax 'x)
(define x 2))
(create)
(displayln (+ x 3))

Related

Is it possible to write a function that would take any macro and turn it into a function so that it can be passed as an argument to another function?

AND and OR are macros and since macros aren't first class in scheme/racket they cannot be passed as arguments to other functions. A partial solution is to use and-map or or-map. Is it possible to write a function that would take arbitrary macro and turn it into a function so that it can be passed as an argument to another function? Are there any languages that have first class macros?
In general, no. Consider that let is (or could be) implemented as a macro on top of lambda:
(let ((x 1))
(foo x))
could be a macro that expands to
((lambda (x) (foo x)) 1)
Now, what would it look like to convert let to a function? Clearly it is nonsense. What would its inputs be? Its return value?
Many macros will be like this. In fact, any macro that could be routinely turned into a function without losing any functionality is a bad macro! Such a macro should have been a function to begin with.
I agree with #amalloy. If something is written as a macro, it probably does something that functions can't do (e.g., introduce bindings, change evaluation order). So automatically converting arbitrary macro into a function is a really bad idea even if it is possible.
Is it possible to write a function that would take arbitrary macro and turn it into a function so that it can be passed as an argument to another function?
No, but it is somewhat doable to write a macro that would take some macro and turn it into a function.
#lang racket
(require (for-syntax racket/list))
(define-syntax (->proc stx)
(syntax-case stx ()
[(_ mac #:arity arity)
(with-syntax ([(args ...) (generate-temporaries (range (syntax-e #'arity)))])
#'(λ (args ...) (mac args ...)))]))
((->proc and #:arity 2) 42 12)
(apply (->proc and #:arity 2) '(#f 12))
((->proc and #:arity 2) #f (error 'not-short-circuit))
You might also be interested in identifier macro, which allows us to use an identifier as a macro in some context and function in another context. This could be used to create a first class and/or which short-circuits when it's used as a macro, but could be passed as a function value in non-transformer position.
On the topic of first class macro, take a look at https://en.wikipedia.org/wiki/Fexpr. It's known to be a bad idea.
Not in the way you probably expect
To see why, here is a way of thinking about macros: A macro is a function which takes a bit of source code and turns it into another bit of source code: the expansion of the macro. In other words a macro is a function whose domain and range are source code.
Once the source code is fully expanded, then it's fed to either an evaluator or a compiler. Let's assume it's fed to a compiler because it makes the question easier to answer: a compiler itself is simply a function whose domain is source code and whose range is some sequence of instructions for a machine (which may or may not be a real machine) to execute. Those instructions might include things like 'call this function on these arguments'.
So, what you are asking is: can the 'this function' in 'call this function on these arguments' be some kind of macro? Well, yes, it could be, but whatever source code it is going to transform certainly can not be the source code of the program you are executing, because that is gone: all that's left is the sequence of instructions that was the return value of the compiler.
So you might say: OK, let's say we disallow compilers: can we do it now? Well, leaving aside that 'disallowing compilers' is kind of a serious limitation, this was, in fact, something that very old dialects of Lisp sort-of did, using a construct called a FEXPR, as mentioned in another answer. It's important to realise that FEXPRs existed because people had not yet invented macros. Pretty soon, people did invent macros, and although FEXPRs and macros coexisted for a while – mostly because people had written code which used FEXPRs which they wanted to keep running, and because writing macros was a serious pain before things like backquote existed – FEXPRs died out. And they died out because they were semantically horrible: even by the standards of 1960s Lisps they were semantically horrible.
Here's one small example of why FEXPRs are so horrible: Let's say I write this function in a language with FEXPRs:
(define (foo f g x)
(apply f (g x)))
Now: what happens when I call foo? In particular, what happens if f might be a FEXPR?. Well, the answer is that I can't compile foo at all: I have to wait until run-time and make some on-the-fly decision about what to do.
Of course this isn't what these old Lisps with FEXPRs probably did: they would just silently have assumed that f was a normal function (which they would have called an EXPR) and compiled accordingly (and yes, even very old Lisps had compilers). If you passed something which was a FEXPR you just lost: either the thing detected that, or more likely it fall over horribly or gave you some junk answer.
And this kind of horribleness is why macros were invented: macros provide a semantically sane approach to processing Lisp code which allows (eventually, this took a long time to actually happen) minor details like compilation being possible at all, code having reasonable semantics and compiled code having the same semantics as interpreted code. These are features people like in their languages, it turns out.
Incidentally, in both Racket and Common Lisp, macros are explicitly functions. In Racket they are functions which operate on special 'syntax' objects because that's how you get hygiene, but in Common Lisp, which is much less hygienic, they're just functions which operate on CL source code, where the source code is simply made up of lists, symbols &c.
Here's an example of this in Racket:
> (define foo (syntax-rules ()
[(_ x) x]))
> foo
#<procedure:foo>
OK, foo is now just an ordinary function. But it's a function whose domain & range are Racket source code: it expects a syntax object as an argument and returns another one:
> (foo 1)
; ?: bad syntax
; in: 1
; [,bt for context]
This is because 1 is not a syntax object.
> (foo #'(x 1))
#<syntax:readline-input:5:10 1>
> (syntax-e (foo #'(x 1)))
1
And in CL this is even easier to see: Here's a macro definition:
(defmacro foo (form) form)
And now I can get hold of the macro's function and call it on some CL source code:
> (macro-function 'foo)
#<Function foo 4060000B6C>
> (funcall (macro-function 'foo) '(x 1) nil)
1
In both Racket and CL, macros are, in fact, first-class (or, in the case of Racket: almost first-class, I think): they are functions which operate on source code, which itself is first-class: you can write Racket and CL programs which construct and manipulate source code in arbitrary ways: that's what macros are in these languages.
In the case of Racket I have said 'almost first-class', because I can't see a way, in Racket, to retrieve the function which sits behind a macro defined with define-syntax &c.
I've created something like this in Scheme, it's macro that return lambda that use eval to execute the macro:
(define-macro (macron m)
(let ((x (gensym)))
`(lambda (,x)
(eval `(,',m ,#,x)))))
Example usage:
;; normal eval
(define x (map (lambda (x)
(eval `(lambda ,#x)))
'(((x) (display x)) ((y) (+ y y)))))
;; using macron macro
(define x (map (macron lambda)
'(((x) (display x)) ((y) (+ y y)))))
and x in both cases is list of two functions.
another example:
(define-macro (+++ . args)
`(+ ,#args))
((macron +++) '(1 2 3))

Mutability of core functions in Racket

I came across a rather unexpected behavior today, and it utterly contradicted what (I thought) I knew about mutability in Racket.
#lang racket
(define num 8)
;(define num 9)
Uncommenting the second line gives back the error "module: duplicate definition for identifier in: num", which is fine and expected. After all, define is supposed to treat already defined values as immutable.
However, this makes no sense to me:
#lang racket
(define num 8)
num
(define define 1)
(+ define define)
It returns 8 and 2, but...
define is not set!, and should not allow the redefinition of something already defined, such as define itself.
define is a core language feature, and is clearly already defined, or I should not be able to use num at all.
What gives? Why is define, which is used to create immutable values, not immutable to itself? What is happening here?
(define define 1)
This example shows shadowing, which is different from mutation.
Shadowing allocates new locations in memory. It does not mutate existing ones.
Concretely, the new define shadows the define from Racket.
All languages with a notation of local scope allow shadowing, eg:
> (define x 10)
> (define (f x) ; x shadowed in function f
(displayln x)
(set! x 2) ; (local) x mutated
(displayln x))
> (f 1)
1
2
; local x is out of scope now
> (displayln x) ; original x unmutated
10
For the other example,
(define num 8)
;(define num 9)
this demonstrates that you cant shadow something within the same scope, which is also standard in other languages, eg:
> (define (g x x) x) ; cant have two parameters named x
When a top-level definition binds an identifier that originates from a macro expansion, the definition captures only uses of the identifier that are generated by the same expansion due to the fresh scope that is generated for the expansion.
In other words, the transformers (macros) that are required from other modules can be re-defined because they're expanded out by the macro expander (so the issue is not quite about mutability) Moreover, since racket is all about extensibility, there are no reserved keywords and additional functionality can be added to the current define through a macro (or a function) - that's why it can be redefined.
define is defined as macro of this form - see here.
#lang racket
(module foo1 racket
(provide foo1)
(define-syntaxes (foo1)
(let ([trans (lambda (syntax-object)
(syntax-case syntax-object ()
[(_) #'1]))])
(values trans))))
; ---
(require 'foo1)
(foo1)
; => 1
(define foo1 9)
(+ foo1 foo1)
; => 18

detecting dialect in Racket

I would like to write code that is correct for both untyped Racket and typed/racket...
For now, I've come up with the following not-so-satisfying principle: I insert a dummy syntax definition for ':' at the beginning of the file, so that the annotations are skipped when using untyped racket. And I comment this definition when using typed/racket.
E.g:
#lang racket
; comment the following line for typed/racket
(define-syntax : (syntax-rules () ((_ id type) (void))))
(: fact (-> Integer Integer))
; the rest of the file is common to both racket and typed/racket
(define (fact n) (if (zero? n) 1 (* n (fact (sub1 n)))))
and
#lang typed/racket
; comment the following line for typed/racket
;(define-syntax : (syntax-rules () ((_ id type) (void))))
(: fact (-> Integer Integer))
; the rest of the file is common to both racket and typed/racket
(define (fact n) (if (zero? n) 1 (* n (fact (sub1 n)))))
This way I just need to remove/insert a ';' to switch between racket and typed/racket...
But is there a way to have a code that doesn't need any change to run in racket and typed/racket? I didn't find how to programmatically detect if I'm in racket or typed/racket... Then I guess I would also have to find a way to conditionally define ':' at the top-level... All this doesn't seem the way to go, so, is there a better way ?
Generally, if you want code to work with both #lang racket and #lang typed/racket, just write your code in #lang typed/racket. When you require a module written in #lang typed/racket from a module written in #lang racket, contracts will automatically be inserted between the two modules to enforce the types. You don’t have to do anything special to require a typed module from an untyped one, so from the user’s perspective, they don’t have to care.
The one area where you might need to worry about being in a typed or untyped context is when writing macros. Often, you can expand to the same code in both situations, but sometimes, you have to do things differently depending on whether or not the target code is typed. For that, you can use define-typed/untyped-identifier, which allows specifying two different forms to be used in different contexts.
If you really need to detect whether or not the current expansion context is typed, you can use the large hammer syntax-local-typed-context?. However, the documentation itself recommends avoiding its use:
This is the nuclear option, provided because it is sometimes, but rarely, useful. Avoid.

For-loop macro in Racket

This macro to implement a C-like for-loop in Lisp is mentioned on this page: https://softwareengineering.stackexchange.com/questions/124930/how-useful-are-lisp-macros
(defmacro for-loop [[sym init check change :as params] & steps]
`(loop [~sym ~init value# nil]
(if ~check
(let [new-value# (do ~#steps)]
(recur ~change new-value#))
value#)))
So than one can use following in code:
(for-loop [i 0 , (< i 10) , (inc i)]
(println i))
How can I convert this macro to be used in Racket language?
I am trying following code:
(define-syntax (for-loop) (syntax-rules (parameterize ((sym) (init) (check) (change)) & steps)
`(loop [~sym ~init value# nil]
(if ~check
(let [new-value# (do ~#steps)]
(recur ~change new-value#))
value#))))
But it give "bad syntax" error.
The snippet of code you have included in your question is written in Clojure, which is one of the many dialects of Lisp. Racket, on the other hand, is descended from Scheme, which is quite a different language from Clojure! Both have macros, yes, but the syntax is going to be a bit different between the two languages.
The Racket macro system is quite powerful, but syntax-rules is actually a slightly simpler way to define macros. Fortunately, for this macro, syntax-rules will suffice. A more or less direct translation of the Clojure macro to Racket would look like this:
(define-syntax-rule (for-loop [sym init check change] steps ...)
(let loop ([sym init]
[value #f])
(if check
(let ([new-value (let () steps ...)])
(loop change new-value))
value)))
It could subsequently be invoked like this:
(for-loop [i 0 (< i 10) (add1 i)]
(println i))
There are a number of changes from the Clojure code:
The Clojure example uses ` and ~ (pronounced “quasiquote” and “unquote” respectively) to “interpolate” values into the template. The syntax-rules form performs this substitution automatically, so there is no need to explicitly perform quotation.
The Clojure example uses names that end in a hash (value# and new-value#) to prevent name conflicts, but Racket’s macro system is hygienic, so that sort of escaping is entirely unnecessary—identifiers bound within macros automatically live in their own scope by default.
The Clojure code uses loop and recur, but Racket supports tail recursion, so the translation just uses “named let”, which is really just some extremely simple sugar for an immediately invoked lambda that calls itself.
There are a few other minor syntactic differences, such as using let instead of do, using ellipses instead of & steps to mark multiple occurrences, the syntax of let, and the use of #f instead of nil to represent the absence of a value.
Finally, commas are not used in the actual use of the for-loop macro because , means something different in Racket. In Clojure, it is treated as whitespace, so it’s totally optional there, too, but in Racket, it would be a syntax error.
A full macro tutorial is well outside the scope of a single Stack Overflow post, though, so if you’re interested in learning more, take a look at the Macros section of the Racket guide.
It’s also worth noting that an ordinary programmer would not need to implement this sort of macro themselves, given that Racket already provides a set of very robust for loops and comprehensions built into the language. In truth, though, they are just defined as macros themselves—there is no special magic just because they are builtins.
Racket’s for loops do not look like traditional C-style for loops, however, because C-style for loops are extremely imperative. On the other hand, Scheme, and therefore Racket, tends to favor a functional style, which avoids mutation and often looks more declarative. Therefore, Racket’s loops attempt to describe higher-level iteration patterns, such as looping through a range of numbers or iterating through a list, rather than low-level semantics like describing how a value should be updated. Of course, if you really want something like that, Racket provides the do loop, which is almost identical to the for-loop macro defined above, albeit with some minor differences.
I want to expand on Alexis's excellent answer a bit. Here's an example usage that demonstrates what she means by do being almost identical to your for-loop:
(do ([i 0 (add1 i)])
((>= i 10) i)
(println i))
This do expression actually expands to the following code:
(let loop ([i 0])
(if (>= i 10)
i
(let ()
(println i)
(loop (add1 i)))))
The above version uses a named let, which is considered the conventional way to write loops in Scheme.
Racket also provides for comprehensions, also mentioned in Alexis's answer, which are also considered conventional, and here's how it'd look like:
(for ([i (in-range 10)])
(println i))
(except that this doesn't actually return the final value of i).
I want to rewrite on Alexis's excellent answer and Chris Jester-Young's excellent answer for people not familiar with let.
#lang racket
(define-syntax-rule (for-loop [var init check change] expr ...)
(local [(define (loop var value)
(if check
(loop change (begin expr ...))
value))]
(loop init #f)))
(for-loop [i 0 (< i 10) (add1 i)]
(println i))

trying to understand require in language extension

I'm trying to define a new language in racket, let's call it wibble. Wibble will allow modules to be loaded so it has to translate it's forms to Racket require forms. But I'm having trouble getting require to work when used in a language extension. I eventually tracked down my problems to the following strange behaviour.
Here's my reader which redefines read and read-syntax
=== wibble/lang/reader.rkt ===
#lang racket/base
(provide (rename-out (wibble-read read) (wibble-read-syntax read-syntax)))
(define (wibble-read in)
(wibble-read-syntax #f in))
(define (wibble-read-syntax src in)
#`(module #,(module-name src) wibble/lang
#,#(read-all src in)))
(define (module-name src)
(if (path? src)
(let-values (((base name dir?) (split-path src)))
(string->symbol (path->string (path-replace-suffix name #""))))
'anonymous-module))
(define (read-all src in)
(let loop ((all '()))
(let ((obj (read-syntax src in)))
(if (eof-object? obj)
(reverse all)
(loop (cons obj all))))))
and here's my much simplified language module, this introduces (require racket/base) into each wibble module
=== wibble/lang.rkt ===
#lang racket/base
(require (for-syntax racket/base))
(provide (rename-out (wibble-module-begin #%module-begin)) #%app #%datum #%top)
(define-syntax wibble-module-begin
(lambda (stx)
(syntax-case stx ()
((_ x ...) #`(#%module-begin (require #,(datum->syntax stx 'racket/base)) x ...)))))
With the above code then this wibble code 'works', i.e. there are no errors
#lang wibble
(cons 1 2)
(cons 3 4)
but the following
#lang wibble
(cons 1 2)
gives error message cons: unbound identifier in module in: cons
Really I'm just looking for an explanation as to what going on. I'm sure the difference is related to this from the racket docs (Racket Reference 3.1)
If a single form is provided, then it is partially expanded in a
module-begin context. If the expansion leads to #%plain-module-begin,
then the body of the #%plain-module-begin is the body of the module.
If partial expansion leads to any other primitive form, then the form
is wrapped with #%module-begin using the lexical context of the module
body; this identifier must be bound by the initial module-path import,
and its expansion must produce a #%plain-module-begin to supply the
module body. Finally, if multiple forms are provided, they are wrapped
with #%module-begin, as in the case where a single form does not
expand to #%plain-module-begin.
but even with that I don't understand why having a single form makes any difference, it's seems to be somthing to do with the timing of partial expansion but I'm not really sure. Nor do I understand why Racket treats a single form as a special case.
Incidentally I can fix the problem with a slight modification to my reader
(define (wibble-read-syntax src in)
#`(module #,(module-name src) wibble/lang
#,#(read-all src in) (void)))
Hard-coding a (void) form means I always have more than one form and eveything works.
Sorry for the long post, I'm just looking for some understanding of how this stuff works.
Alright, I think that I've figured it out.
Your intuition is correct in that the problem lies within the timing of the partial expansion of the single-form module body. Inside of your reader.rkt file, you produce a (module ...) form. As the quoted excerpt from your question states, the forms ... portion of this is then treated specially, since there is only one. Let's take a look at an excerpt from the documentation on partial expansion:
As a special case, when expansion would otherwise add an #%app, #%datum, or #%top identifier to an expression, and when the binding turns out to be the primitive #%app, #%datum, or #%top form, then expansion stops without adding the identifier.
I am almost certain that the partial expansion which occurs at this point does something to the cons identifier. This is the one part that I remain unsure of... my gut tells me that what's happening is that the partial expansion is attempting to find the binding for the cons identifier (since it is the first part of the parentheses, the identifier could be bound to a macro which should be expanded, so that needs to be checked) but is unable to, so it throws a tantrum. Note that even if cons has no phase 1 (syntax-expansion time) binding, the macro expander still expects there to be a phase 0 (runtime) binding for the identifier (among other things, this helps the expander remain hygienic). Because all of this partial expansion happens to the body of your (module ...) form (which is done before your (#%module-begin ...) form where you inject the (#%require ...) form), cons has no binding during the expansion, so the expansion, I believe, fails.
Nevertheless, a naive fix for your problem is to rewrite wibble-read-syntax as follows:
(define (wibble-read-syntax src in)
(let* ((read-in (read-all src in))
(in-stx (and (pair? read-in) (car read-in))))
#`(module #,(module-name src) wibble/lang
(require #,(datum->syntax in-stx 'racket/base))
#,#read-in))
You can then remove the (#%require ...) form from your (#%module-begin ...) macro.
That's not, in my opinion, the best way to fix the issue, however. As a matter of cleanliness, hard-coding in a require form like you've done in wibble/lang.rkt would make Eli Barzilay and co. cry. A much simpler way to do what you are trying to do is by updating your lang.rkt file to something like so:
=== wibble/lang.rkt ===
#lang racket/base
(require (for-syntax racket/base))
(provide (rename-out (wibble-module-begin #%module-begin))
(except-out (all-from-out racket/base) #%module-begin #%app #%datum #%top)
#%app #%datum #%top)
(define-syntax wibble-module-begin
(lambda (stx)
(syntax-case stx ()
((_ x ...) #`(#%module-begin x ...)))))
Writing in this convention removes the need for any hard-coded (require ...) forms and prevents subtle bugs like the one you've unearthed from occuring. If you are confused why this works, remember that you've already provided the #%module-begin identifier using this file, which is subsequently bound in all #lang wibble files. In principle, there is no limit on what identifiers you can bind in this fashion. If you would like some further reading, here's a shameless self-advertisement for a blog post I wrote a little while back on the subject.
I hope I've helped.
The problem is with the require (though I'm not sure I 100% understand all the behavior).
(require X) imports bindings from X with the lexical context of #'X. #'X here has the context of stx, which is the entire #'(module-begin x ...), which is not the context you want. You want the context of one of the cons expressions, i.e., one of the #'xs.
Something like this should work:
(define-syntax wibble-module-begin
(lambda (stx)
(syntax-case stx ()
[(_) #'(#%module-begin)]
[(m x y ...)
#`(#%module-begin
(require #,(datum->syntax #'x 'racket/base))
x y ...)])))
Though, as #belph warned, there's probably a more idiomatic way to accomplish what you want.
The behavior of your original program, and as you intuited, likely has to do with module's different treatment of single and multi sub-forms, but I think the "working" case might be an accident and could be a bug in the racket compiler.